Examples of how to merge (concatenate) two columns into one with pandas:
Merge two or more dataframe columns of strings with pandas
Let's first create a dataframe with pandas
import pandas as pd
import numpy as np

data = {'First_Name': ['April', 'Emory', 'David', 'Alice', 'Virginia'],
        'Last_Name': ['Reiter', 'Miller', 'Ballin', 'Trotter', 'Rios'],
        'Middle_Name': ['G.', '', 'H.G', '', ''],
        'Age': [42, 24, 12, 32, 56]}

df = pd.DataFrame(data=data)
print(df)
gives
  First_Name Last_Name Middle_Name  Age
0      April    Reiter          G.   42
1      Emory    Miller               24
2      David    Ballin         H.G   12
3      Alice   Trotter               32
4   Virginia      Rios               56
Merge two columns of strings
To merge two columns of strings, a straightforward solution is to do:
df['First_Name'] + df['Last_Name']
gives
0     AprilReiter
1     EmoryMiller
2     DavidBallin
3    AliceTrotter
4    VirginiaRios
dtype: object
To add a space:
df['First_Name'] + ' ' + df['Last_Name']
gives
0     April Reiter
1     Emory Miller
2     David Ballin
3    Alice Trotter
4    Virginia Rios
dtype: object
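The + operator works well when every value is a string, but a missing value in either column makes the whole result missing for that row. A minimal sketch of one alternative, Series.str.cat, is shown below. The small `people` dataframe and its missing first name are made up for illustration and are not part of the dataframe used in this article.

```python
import pandas as pd

# Hypothetical data with one missing first name (not the article's dataframe).
people = pd.DataFrame({
    'First_Name': ['April', 'Emory', None],
    'Last_Name': ['Reiter', 'Miller', 'Rios'],
})

# With +, the missing value propagates: the third row of the result is NaN.
with_plus = people['First_Name'] + ' ' + people['Last_Name']

# str.cat joins element-wise; na_rep replaces missing values before joining.
# strip() removes the leading space left behind by the empty first name.
with_cat = (
    people['First_Name']
    .str.cat(people['Last_Name'], sep=' ', na_rep='')
    .str.strip()
)

print(with_plus.isna().tolist())
print(with_cat.tolist())
```

Whether an empty placeholder is the right choice depends on the data; keeping the missing value may be preferable in some cases.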
Another solution is to use pandas.DataFrame.agg:
df[['First_Name','Last_Name']].agg(' '.join, axis=1)
gives
0     April Reiter
1     Emory Miller
2     David Ballin
3    Alice Trotter
4    Virginia Rios
dtype: object
Another example, aggregating three columns:
df[['First_Name','Middle_Name','Last_Name']].agg(' '.join, axis=1)
gives
0     April G. Reiter
1       Emory  Miller
2    David H.G Ballin
3      Alice  Trotter
4      Virginia  Rios
dtype: object
Create a new Full_Name column:
df['Full_Name'] = df[['First_Name','Middle_Name','Last_Name']].agg(' '.join, axis=1)
print(df)
gives
  First_Name Last_Name Middle_Name  Age         Full_Name
0      April    Reiter          G.   42   April G. Reiter
1      Emory    Miller               24     Emory  Miller
2      David    Ballin         H.G   12  David H.G Ballin
3      Alice   Trotter               32    Alice  Trotter
4   Virginia      Rios               56    Virginia  Rios
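Note that some rows of the Full_Name column have two consecutive spaces, because Middle_Name is an empty string for those rows. To fix that, replace each double space with a single one:
df['Full_Name'].str.replace("  ", " ")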
gives
0     April G. Reiter
1        Emory Miller
2    David H.G Ballin
3       Alice Trotter
4       Virginia Rios
Name: Full_Name, dtype: object
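The double spaces can also be avoided or cleaned up in other ways. The sketch below shows two possible variants on a small made-up `names` dataframe (an assumption for illustration): joining only the non-empty parts of each row, and collapsing any run of whitespace with a regular expression.

```python
import pandas as pd

# Hypothetical two-row sample with one empty middle name.
names = pd.DataFrame({
    'First_Name': ['April', 'Emory'],
    'Middle_Name': ['G.', ''],
    'Last_Name': ['Reiter', 'Miller'],
})

# Variant 1: join only the non-empty parts, so no double space is ever created.
full_name = names.apply(
    lambda row: ' '.join(part for part in row if part),
    axis=1,
)

# Variant 2: join everything, then collapse each run of whitespace to one space.
collapsed = names.agg(' '.join, axis=1).str.replace(r'\s+', ' ', regex=True)

print(full_name.tolist())
print(collapsed.tolist())
```

Note that str.replace returns a new Series, so the result has to be assigned back (for example to df['Full_Name']) for the fix to be kept.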
Merge a column of strings with a column of integers
To merge a column of strings with a column of integers, the integers must first be converted to strings. One way to do that is astype(str):
df['Last_Name'] + ' ' + df['Age'].astype(str)
gives
0     Reiter 42
1     Miller 24
2     Ballin 12
3    Trotter 32
4       Rios 56
dtype: object
Another example using agg():
df[['Last_Name','Age']].apply(lambda x : x.astype(str)).agg(' '.join, axis=1)
gives
0     Reiter 42
1     Miller 24
2     Ballin 12
3    Trotter 32
4       Rios 56
dtype: object
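The agg() example above converts each column inside a lambda. As a possible simplification, DataFrame.astype(str) converts all selected columns in one call. The sketch below uses a small made-up `people` dataframe (an assumption for illustration) rather than the dataframe from this article.

```python
import pandas as pd

# Hypothetical sample with one string column and one integer column.
people = pd.DataFrame({
    'Last_Name': ['Reiter', 'Miller'],
    'Age': [42, 24],
})

# astype(str) on the dataframe converts every selected column to strings,
# so ' '.join can then be applied row by row.
merged = people[['Last_Name', 'Age']].astype(str).agg(' '.join, axis=1)

print(merged.tolist())
```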
Merge columns of numbers
Let's create a new dataframe with an additional Score column:
import pandas as pd
import numpy as np

data = {'First_Name': ['April', 'Emory', 'David', 'Alice', 'Virginia'],
        'Last_Name': ['Reiter', 'Miller', 'Ballin', 'Trotter', 'Rios'],
        'Middle_Name': ['G.', '', 'H.G', '', ''],
        'Age': [42, 24, 12, 32, 56],
        'Score': [2, 10, 5, 3, 10]}

df = pd.DataFrame(data=data)
print(df)
gives
  First_Name Last_Name Middle_Name  Age  Score
0      April    Reiter          G.   42      2
1      Emory    Miller               24     10
2      David    Ballin         H.G   12      5
3      Alice   Trotter               32      3
4   Virginia      Rios               56     10
With two numeric columns, the + operator does not concatenate. If you do
print( df['Age'] + df['Score'] )
the two columns are added element-wise:
0    44
1    34
2    17
3    35
4    66
dtype: int64
To concatenate the two numbers instead, convert both columns to strings first:
print( df['Age'].astype(str) + ' -- '+ df['Score'].astype(str) )
gives
0     42 -- 2
1    24 -- 10
2     12 -- 5
3     32 -- 3
4    56 -- 10
dtype: object
