Examples of how to add multiple columns of a dataframe together with pandas in Python
Create a dataframe with pandas
import pandas as pd
import numpy as np

data = np.random.randint(100, size=(10,3))
df = pd.DataFrame(data=data, columns=['A','B','C'])
returns
    A   B   C
0  37  64  38
1  22  57  91
2  44  79  46
3   0  10   1
4  27   0  45
5  82  99  90
6  23  35  90
7  84  48  16
8  64  70  28
9  83  50   2
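Because the data are random, the values above change on every run. A minimal sketch of one way to make the example reproducible, assuming NumPy's seeded Generator API is acceptable here (the seed value 0 is an arbitrary choice and does not reproduce the table above):

```python
import numpy as np
import pandas as pd

# A seeded generator draws the same integers on every run.
# The seed value (0) is an arbitrary choice for this sketch.
rng = np.random.default_rng(seed=0)

# Integers in the half-open interval [0, 100), shape 10 rows x 3 columns.
data = rng.integers(0, 100, size=(10, 3))
df = pd.DataFrame(data=data, columns=['A', 'B', 'C'])

print(df)
```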
Sum all columns
To sum all columns of a dataframe row by row, a solution is to use sum() with axis=1
df.sum(axis=1)
returns here
0    139
1    170
2    169
3     11
4     72
5    271
6    148
7    148
8    162
9    135
dtype: int64
To create a new column in the dataframe with the sum of all columns:
df['(A+B+C)'] = df.sum(axis=1)
returns
    A   B   C  (A+B+C)
0  37  64  38      139
1  22  57  91      170
2  44  79  46      169
3   0  10   1       11
4  27   0  45       72
5  82  99  90      271
6  23  35  90      148
7  84  48  16      148
8  64  70  28      162
9  83  50   2      135
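The axis argument decides the direction of the sum. As a small illustration on a hypothetical two-row dataframe (the name small and its values are made up for this sketch), axis=1 gives one total per row, while axis=0 gives one total per column:

```python
import pandas as pd

# Hypothetical two-row dataframe, small enough to check by hand.
small = pd.DataFrame({'A': [1, 2], 'B': [10, 20], 'C': [100, 200]})

row_totals = small.sum(axis=1)     # one value per row: A + B + C
column_totals = small.sum(axis=0)  # one value per column

print(row_totals.tolist())     # [111, 222]
print(column_totals.tolist())  # [3, 30, 300]
```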
Sum only given columns
To add only some columns, a solution is to create a list of columns that we want to sum together:
columns_list = ['B', 'C']
and do:
df['(B+C)'] = df[columns_list].sum(axis=1)
then returns
    A   B   C  (A+B+C)  (B+C)
0  37  64  38      139    102
1  22  57  91      170    148
2  44  79  46      169    125
3   0  10   1       11     11
4  27   0  45       72     45
5  82  99  90      271    189
6  23  35  90      148    125
7  84  48  16      148     64
8  64  70  28      162     98
9  83  50   2      135     52
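Adding the columns with the + operator is a common alternative, but the two approaches differ when values are missing. The sketch below uses a hypothetical dataframe with one NaN to show it: sum() skips missing values by default, whereas + propagates them.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column B.
small = pd.DataFrame({'B': [1.0, np.nan], 'C': [2.0, 3.0]})

columns_list = ['B', 'C']
with_sum = small[columns_list].sum(axis=1)  # NaN is skipped (skipna=True by default)
with_plus = small['B'] + small['C']         # NaN propagates to the result

print(with_sum.tolist())   # [3.0, 3.0]
print(with_plus.tolist())  # [3.0, nan]
```

Which behaviour is preferable depends on whether a missing value should count as zero or should mark the whole row total as unknown.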
Dataframe with columns of strings
Another example with a dataframe containing columns of integers and strings
import pandas as pd
import numpy as np

data = np.random.randint(100, size=(10,3))
df = pd.DataFrame(data=data, columns=['A','B','C'])
df['D'] = ['a','a','a','a','a','a','a','a','a','a']
df['E'] = ['b','b','b','b','b','b','b','b','b','b']
returns
    A   B   C  D  E
0  57  53  90  a  b
1  18  26  22  a  b
2  53  86  18  a  b
3  81  85  47  a  b
4  45  18  39  a  b
5  37  49  17  a  b
6  16  90  10  a  b
7  27  93  54  a  b
8  46   2  67  a  b
9   8  46  54  a  b
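If we apply sum() with numeric_only=True
df.sum(axis=1, numeric_only=True)
then the function sums ONLY the integer columns (older pandas versions did this by default for a plain df.sum(axis=1), while pandas 2.0 and later raise a TypeError on mixed integer and string columns unless numeric_only=True is passed):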
0    200
1     66
2    157
3    213
4    102
5    103
6    116
7    174
8    115
9    108
dtype: int64
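To make the restriction to numeric columns explicit, one option is to select them first with select_dtypes(). A short sketch on a hypothetical mixed dataframe (the name mixed and its values are made up for illustration), compared with the numeric_only=True form:

```python
import pandas as pd

# Hypothetical dataframe mixing integer and string columns.
mixed = pd.DataFrame({'A': [1, 2], 'B': [10, 20], 'D': ['a', 'a']})

# Keep only the numeric columns, then sum across each row.
numeric_total = mixed.select_dtypes(include='number').sum(axis=1)

# Equivalent here: let sum() ignore the non-numeric columns itself.
same_total = mixed.sum(axis=1, numeric_only=True)

print(numeric_total.tolist())  # [11, 22]
print(same_total.tolist())     # [11, 22]
```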
While if we apply sum() only to the columns that contain strings:
df[['D','E']].sum(axis=1)
it will concatenate columns:
0    ab
1    ab
2    ab
3    ab
4    ab
5    ab
6    ab
7    ab
8    ab
9    ab
dtype: object
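When concatenation is the actual goal, it can be written more explicitly than with sum(). A minimal sketch on a hypothetical two-row dataframe, using the + operator and Series.str.cat(), the latter accepting a separator:

```python
import pandas as pd

# Hypothetical dataframe with two string columns.
words = pd.DataFrame({'D': ['a', 'a'], 'E': ['b', 'b']})

joined = words['D'] + words['E']                  # element-wise concatenation
dashed = words['D'].str.cat(words['E'], sep='-')  # same, with a separator

print(joined.tolist())  # ['ab', 'ab']
print(dashed.tolist())  # ['a-b', 'a-b']
```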
