How to sum multiple columns of a dataframe with pandas in Python?

Published: August 30, 2021

Tags: Python; Pandas; DataFrame;


Examples of how to sum multiple columns of a dataframe with pandas in Python.

Create a dataframe with pandas

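import pandas as pd
import numpy as np

# 10 rows x 3 columns of random integers between 0 and 99
data = np.random.randint(100, size=(10,3))

df = pd.DataFrame(data=data,columns=['A','B','C'])

print(df)

gives, for example (the values are random)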

    A   B   C
0  37  64  38
1  22  57  91
2  44  79  46
3   0  10   1
4  27   0  45
5  82  99  90
6  23  35  90
7  84  48  16
8  64  70  28
9  83  50   2

Sum all columns

To sum all columns of a dataframe row by row, a solution is to use sum() with axis=1:

df.sum(axis=1)

returns here

0    139
1    170
2    169
3     11
4     72
5    271
6    148
7    148
8    162
9    135
dtype: int64

To create a new column in the dataframe with the sum of all columns:

df['(A+B+C)'] = df.sum(axis=1)

the dataframe is now:

    A   B   C  (A+B+C)
0  37  64  38      139
1  22  57  91      170
2  44  79  46      169
3   0  10   1       11
4  27   0  45       72
5  82  99  90      271
6  23  35  90      148
7  84  48  16      148
8  64  70  28      162
9  83  50   2      135
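
Since axis=1 is easy to confuse with axis=0, here is a minimal sketch of the difference. It uses a small hypothetical dataframe (df_small, with fixed values that are not part of the example above) so that the results are easy to check by hand:

```python
import pandas as pd

# Hypothetical dataframe with fixed values, so the sums are easy to verify
df_small = pd.DataFrame({'A': [1, 2], 'B': [10, 20], 'C': [100, 200]})

# axis=1: add the columns together, one result per row (A + B + C)
row_sums = df_small.sum(axis=1)

# axis=0 (the default): add the rows together, one result per column
column_sums = df_small.sum(axis=0)

print(row_sums.tolist())     # [111, 222]
print(column_sums.tolist())  # [3, 30, 300]
```

Note that once the new column '(A+B+C)' has been added to the dataframe, calling df.sum(axis=1) again would also include that column in the sum.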

Sum only given columns

To add only some columns, a solution is to create a list of columns that we want to sum together:

columns_list = ['B', 'C']

and do:

df['(B+C)'] = df[columns_list].sum(axis=1)

the dataframe then becomes:

    A   B   C  (A+B+C)  (B+C)
0  37  64  38      139    102
1  22  57  91      170    148
2  44  79  46      169    125
3   0  10   1       11     11
4  27   0  45       72     45
5  82  99  90      271    189
6  23  35  90      148    125
7  84  48  16      148     64
8  64  70  28      162     98
9  83  50   2      135     52
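
Another option for a small number of columns is the + operator. The two approaches agree on complete data but, as far as I know, they treat missing values differently: sum() skips NaN by default, while + propagates it. The sketch below illustrates this with a hypothetical dataframe (df_nan) containing one missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe with one missing value in column B
df_nan = pd.DataFrame({'B': [1.0, np.nan], 'C': [2.0, 3.0]})

columns_list = ['B', 'C']

# sum() skips NaN by default (skipna=True), so the missing value counts as 0
with_sum = df_nan[columns_list].sum(axis=1)

# The + operator propagates NaN: any missing operand gives a missing result
with_plus = df_nan['B'] + df_nan['C']

print(with_sum.tolist())   # [3.0, 3.0]
print(with_plus.tolist())  # [3.0, nan]
```

Which behavior is preferable depends on whether a missing value should count as zero or make the whole row sum missing.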

Dataframe with columns of strings

Another example, with a dataframe containing columns of integers and columns of strings:

import pandas as pd
import numpy as np

data = np.random.randint(100, size=(10,3))

df = pd.DataFrame(data=data,columns=['A','B','C'])

# Add two columns of strings
df['D'] = ['a'] * 10
df['E'] = ['b'] * 10

print(df)

gives, for example

    A   B   C  D  E
0  57  53  90  a  b
1  18  26  22  a  b
2  53  86  18  a  b
3  81  85  47  a  b
4  45  18  39  a  b
5  37  49  17  a  b
6  16  90  10  a  b
7  27  93  54  a  b
8  46   2  67  a  b
9   8  46  54  a  b

If we apply sum() to the whole dataframe:

df.sum(axis=1)

then the function sums only the columns of integers:

0    200
1     66
2    157
3    213
4    102
5    103
6    116
7    174
8    115
9    108
dtype: int64
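
This behavior depends on the pandas version. To my knowledge, the default value of numeric_only changed in pandas 2.0, so the same call on a dataframe mixing integers and strings may raise a TypeError there instead of skipping the string columns. A minimal sketch that makes the intent explicit, using a hypothetical dataframe (df_mixed) with fixed values:

```python
import pandas as pd

# Hypothetical dataframe mixing integer columns and a string column
df_mixed = pd.DataFrame({'A': [1, 2], 'B': [10, 20], 'D': ['a', 'a']})

# Explicitly ask sum() to ignore the non-numeric columns
numeric_sum = df_mixed.sum(axis=1, numeric_only=True)

# Alternative: keep only the numeric columns first, then sum them
numeric_sum_alt = df_mixed.select_dtypes(include='number').sum(axis=1)

print(numeric_sum.tolist())      # [11, 22]
print(numeric_sum_alt.tolist())  # [11, 22]
```

Both forms state which columns take part in the sum, so the result does not depend on the default of numeric_only.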

If we instead apply sum() only to the columns that contain strings:

df[['D','E']].sum(axis=1)

it concatenates the strings of each row:

0    ab
1    ab
2    ab
3    ab
4    ab
5    ab
6    ab
7    ab
8    ab
9    ab
dtype: object
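
When concatenation is the actual goal, it can also be written explicitly, which may be easier to read than sum(). The sketch below uses a hypothetical dataframe (df_text) and shows the + operator and the str.cat() method, the latter with a separator chosen only for illustration:

```python
import pandas as pd

# Hypothetical dataframe with two columns of strings
df_text = pd.DataFrame({'D': ['a', 'a'], 'E': ['b', 'b']})

# Element-wise concatenation with the + operator
concatenated = df_text['D'] + df_text['E']

# Same idea with str.cat(), here with a '-' separator between the values
with_separator = df_text['D'].str.cat(df_text['E'], sep='-')

print(concatenated.tolist())    # ['ab', 'ab']
print(with_separator.tolist())  # ['a-b', 'a-b']
```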

References