Examples of how to merge (concatenate) two columns into one with pandas:
Merge two or more dataframe columns of strings with pandas
Let's first create a dataframe with pandas
import pandas as pd
import numpy as np
data = {'First_Name':['April','Emory','David','Alice','Virginia'],
'Last_Name':['Reiter','Miller','Ballin','Trotter','Rios'],
'Middle_Name':['G.','','H.G','',''],
'Age':[42,24,12,32,56]}
df = pd.DataFrame(data=data)
print(df)
gives
First_Name Last_Name Middle_Name Age
0 April Reiter G. 42
1 Emory Miller 24
2 David Ballin H.G 12
3 Alice Trotter 32
4 Virginia Rios 56
Merge two columns of strings
To merge two columns of strings, a straightforward solution is to do:
df['First_Name'] + df['Last_Name']
gives then
0 AprilReiter
1 EmoryMiller
2 DavidBallin
3 AliceTrotter
4 VirginiaRios
dtype: object
To add a space:
df['First_Name'] + ' ' + df['Last_Name']
gives
0 April Reiter
1 Emory Miller
2 David Ballin
3 Alice Trotter
4 Virginia Rios
dtype: object
Another solution is to use pandas.DataFrame.agg:
df[['First_Name','Last_Name']].agg(' '.join, axis=1)
gives
0 April Reiter
1 Emory Miller
2 David Ballin
3 Alice Trotter
4 Virginia Rios
dtype: object
Another example, aggregating three columns
df[['First_Name','Middle_Name','Last_Name']].agg(' '.join, axis=1)
gives
0 April G. Reiter
1 Emory Miller
2 David H.G Ballin
3 Alice Trotter
4 Virginia Rios
dtype: object
Create a new Full_Name column:
df['Full_Name'] = df[['First_Name','Middle_Name','Last_Name']].agg(' '.join, axis=1)
print(df)
gives
First_Name Last_Name Middle_Name Age Full_Name
0 April Reiter G. 42 April G. Reiter
1 Emory Miller 24 Emory Miller
2 David Ballin H.G 12 David H.G Ballin
3 Alice Trotter 32 Alice Trotter
4 Virginia Rios 56 Virginia Rios
Note that some rows of the Full_Name column has two spaces. To fix that:
df['Full_Name'].str.replace(" "," ")
gives
0 April G. Reiter
1 Emory Miller
2 David H.G Ballin
3 Alice Trotter
4 Virginia Rios
Name: Full_Name, dtype: object
Merge a column of strings with a column of integers
To merge a column of strings with a column of integers it is necessary to first convert the numbers into a string. To do that a solution is to use astype():
df['Last_Name'] + ' ' + df['Age'].astype(str)
gives
0 Reiter 42
1 Miller 24
2 Ballin 12
3 Trotter 32
4 Rios 56
dtype: object
Another example using agg():
df[['Last_Name','Age']].apply(lambda x : x.astype(str)).agg(' '.join, axis=1)
gives
0 Reiter 42
1 Miller 24
2 Ballin 12
3 Trotter 32
4 Rios 56
dtype: object
Merge columns of numbers
Let's create a new dataframe
import pandas as pd
import numpy as np
data = {'First_Name':['April','Emory','David','Alice','Virginia'],
'Last_Name':['Reiter','Miller','Ballin','Trotter','Rios'],
'Middle_Name':['G.','','H.G','',''],
'Age':[42,24,12,32,56],
'Score':[2,10,5,3,10]}
df = pd.DataFrame(data=data)
print(df)
gives
First_Name Last_Name Middle_Name Age Score
0 April Reiter G. 42 2
1 Emory Miller 24 10
2 David Ballin H.G 12 5
3 Alice Trotter 32 3
4 Virginia Rios 56 10
Then if you do
print( df['Age'] + df['Score'] )
you will basically add the two columns together:
0 44
1 34
2 17
3 35
4 66
dtype: int64
To concatenate the two numbers:
print( df['Age'].astype(str) + ' -- '+ df['Score'].astype(str) )
gives
0 42 -- 2
1 24 -- 10
2 12 -- 5
3 32 -- 3
4 56 -- 10
dtype: object