How to get the max value of two or more columns in a pandas dataframe ?


An example of how to get the max value of two or more columns in a pandas dataframe ?

Create a simple dataframe with pandas

Let's create a simple dataframe

import pandas as pd
import numpy as np
import random

a = [random.choice(['a','b','c']) for i in range(20)]
c1 = np.random.uniform(0,1, size=20)
c2 = np.random.uniform(0,1, size=20)
c3 = np.random.uniform(0,1, size=20)

data = {'a':a,
        'c1':c1, 
        'c2':c2, 
        'c3':c3}

df = pd.DataFrame(data)

print(df)

returns for example

    a        c1        c2        c3
0   b  0.480835  0.108703  0.062893
1   b  0.469707  0.441162  0.584776
2   a  0.942992  0.254551  0.750689
3   c  0.634768  0.048194  0.810934
4   b  0.185650  0.551316  0.126457
5   c  0.661850  0.178760  0.321801
6   c  0.739558  0.878052  0.196450
7   a  0.344361  0.267702  0.401820
8   a  0.430181  0.300013  0.674959
9   b  0.412139  0.295223  0.799033
10  a  0.101919  0.810593  0.665577
11  c  0.036506  0.705999  0.183860
12  a  0.832732  0.288746  0.994192
13  b  0.474459  0.574548  0.539348
14  b  0.322733  0.015939  0.142288
15  c  0.108360  0.447548  0.474686
16  c  0.111654  0.048010  0.651970
17  c  0.360656  0.155344  0.854249
18  a  0.404405  0.034052  0.618728
19  b  0.234892  0.052290  0.889191

Find the max value for each row

To get the max value between the columns ['c1','c2','c3'] a solution is to use pandas.DataFrame.max:

df[['c1','c2','c3']].max(axis=1)

returns

0     0.480835
1     0.584776
2     0.942992
3     0.810934
4     0.551316
5     0.661850
6     0.878052
7     0.401820
8     0.674959
9     0.799033
10    0.810593
11    0.705999
12    0.994192
13    0.574548
14    0.322733
15    0.474686
16    0.651970
17    0.854249
18    0.618728
19    0.889191
dtype: float64

To append a new column to the dataframe with the max value:

df['max value'] = df[['c1','c2','c3']].max(axis=1)

returns

    a        c1        c2        c3  max value
0   b  0.480835  0.108703  0.062893   0.480835
1   b  0.469707  0.441162  0.584776   0.584776
2   a  0.942992  0.254551  0.750689   0.942992
3   c  0.634768  0.048194  0.810934   0.810934
4   b  0.185650  0.551316  0.126457   0.551316
5   c  0.661850  0.178760  0.321801   0.661850
6   c  0.739558  0.878052  0.196450   0.878052
7   a  0.344361  0.267702  0.401820   0.401820
8   a  0.430181  0.300013  0.674959   0.674959
9   b  0.412139  0.295223  0.799033   0.799033
10  a  0.101919  0.810593  0.665577   0.810593
11  c  0.036506  0.705999  0.183860   0.705999
12  a  0.832732  0.288746  0.994192   0.994192
13  b  0.474459  0.574548  0.539348   0.574548
14  b  0.322733  0.015939  0.142288   0.322733
15  c  0.108360  0.447548  0.474686   0.474686
16  c  0.111654  0.048010  0.651970   0.651970
17  c  0.360656  0.155344  0.854249   0.854249
18  a  0.404405  0.034052  0.618728   0.618728
19  b  0.234892  0.052290  0.889191   0.889191

Find the name of the column that contains the max value

To get the name of the column that contains the max value, a solution is to use pandas.DataFrame.idxmax

df[['c1','c2','c3']].idxmax(axis=1)

returns

0     c1
1     c3
2     c1
3     c3
4     c2
5     c1
6     c2
7     c3
8     c3
9     c3
10    c2
11    c2
12    c3
13    c2
14    c1
15    c3
16    c3
17    c3
18    c3
19    c3
dtype: object

Create a new column to store the results:

df['col name'] = df[['c1','c2','c3']].idxmax(axis=1)

returns

    a        c1        c2        c3  max value col name
0   b  0.480835  0.108703  0.062893   0.480835       c1
1   b  0.469707  0.441162  0.584776   0.584776       c3
2   a  0.942992  0.254551  0.750689   0.942992       c1
3   c  0.634768  0.048194  0.810934   0.810934       c3
4   b  0.185650  0.551316  0.126457   0.551316       c2
5   c  0.661850  0.178760  0.321801   0.661850       c1
6   c  0.739558  0.878052  0.196450   0.878052       c2
7   a  0.344361  0.267702  0.401820   0.401820       c3
8   a  0.430181  0.300013  0.674959   0.674959       c3
9   b  0.412139  0.295223  0.799033   0.799033       c3
10  a  0.101919  0.810593  0.665577   0.810593       c2
11  c  0.036506  0.705999  0.183860   0.705999       c2
12  a  0.832732  0.288746  0.994192   0.994192       c3
13  b  0.474459  0.574548  0.539348   0.574548       c2
14  b  0.322733  0.015939  0.142288   0.322733       c1
15  c  0.108360  0.447548  0.474686   0.474686       c3
16  c  0.111654  0.048010  0.651970   0.651970       c3
17  c  0.360656  0.155344  0.854249   0.854249       c3
18  a  0.404405  0.034052  0.618728   0.618728       c3
19  b  0.234892  0.052290  0.889191   0.889191       c3

References