How to check if two columns are equal (identical) with pandas?

Published: March 03, 2021

Tags: Python; Pandas; DataFrame;

Examples of how to check if two columns are equal with pandas:

Create a dataframe with pandas

Let's create a dataframe with two random columns ('Score A' and 'Score B') and two identical columns ('Score C' and 'Score D'):

import pandas as pd
import numpy as np

data = np.random.randint(10, size=(5,2))

columns = ['Score A','Score B']

df = pd.DataFrame(data=data,columns=columns)

data = np.random.randint(10, size=(5,1))

# Assign the same 1-D array to both columns so that they are identical
df['Score C'] = data[:, 0]
df['Score D'] = data[:, 0]

print(df)

returns, for example:

   Score A  Score B  Score C  Score D
0        5        4        7        7
1        5        9        7        7
2        1        2        6        6
3        5        2        5        5
4        4        4        4        4
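
Because the data above are random, the printed values change on every run. A minimal sketch of a reproducible variant, assuming NumPy's `default_rng` generator is available (the seed value 42 and the name `df_seeded` are arbitrary choices for illustration, not part of the original example):

```python
import numpy as np
import pandas as pd

# Seeded generator so the same values are drawn on every run (seed is arbitrary)
rng = np.random.default_rng(42)

# Two independent random columns
df_seeded = pd.DataFrame(
    rng.integers(0, 10, size=(5, 2)),
    columns=['Score A', 'Score B'],
)

# One shared array assigned twice, so 'Score C' and 'Score D' are identical
shared = rng.integers(0, 10, size=5)
df_seeded['Score C'] = shared
df_seeded['Score D'] = shared

print(df_seeded)
```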

Check if two columns are equal

To check if two columns are equal, a solution is to use pandas.Series.equals (each column of a dataframe is a Series), for example:

df['Score A'].equals(df['Score B'])

returns

False

Note that the following line, which selects the columns by position instead of by name, is equivalent to the one above:

df.iloc[:,0].equals(df.iloc[:,1])

and also returns:

False

If we check the columns 'Score C' and 'Score D':

df['Score C'].equals(df['Score D'])

we find that the columns are equal:

True

The same is true when a column is compared with itself:

df['Score A'].equals(df['Score A'])

returns:

True
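
Two details of equals are easy to miss: missing values (NaN) in the same position are treated as equal, and the two columns must have the same dtype. The sketch below contrasts this with an elementwise `==` followed by `.all()`. The data and variable names are made up for illustration and are not part of the example dataframe above:

```python
import numpy as np
import pandas as pd

# Two float columns with a NaN in the same position
x = pd.Series([1.0, np.nan, 3.0])
y = pd.Series([1.0, np.nan, 3.0])

# equals() treats NaNs in the same location as equal
same_with_nan = x.equals(y)

# Elementwise ==: NaN == NaN is False, so .all() fails
elementwise_all = bool((x == y).all())

# Same values, different dtypes (int64 vs float64)
int_col = pd.Series([1, 2, 3])
float_col = pd.Series([1.0, 2.0, 3.0])

# equals() also requires matching dtypes
dtype_sensitive = int_col.equals(float_col)

# Elementwise == only compares the values
values_match = bool((int_col == float_col).all())

print(same_with_nan, elementwise_all)   # True False
print(dtype_sensitive, values_match)    # False True
```

Which behaviour is preferable depends on the use case: equals is the stricter check, while `(a == b).all()` ignores dtype differences but never counts NaN as equal to NaN.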

Compare two columns

If you want to compare two columns elementwise, a solution is to do:

df = df.copy()

df['Diff'] = np.where( df['Score A'] == df['Score B'] , '1', '0')

print(df)

returns:

   Score A  Score B  Score C  Score D Diff
0        5        4        7        7    0
1        5        9        7        7    0
2        1        2        6        6    0
3        5        2        5        5    0
4        4        4        4        4    1

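Here we added a column called 'Diff' (for difference), where '1' means that 'Score A' and 'Score B' hold the same value in that row and '0' means that they differ. Note that, as written, np.where stores these flags as strings.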
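
As an alternative to string flags, the elementwise comparison can be kept as a boolean column, which makes it straightforward to count or select the matching rows. This is a minimal sketch that rebuilds 'Score A' and 'Score B' from the example output shown earlier. The names `is_equal`, `n_equal`, and `matching_rows` are illustrative only:

```python
import pandas as pd

# 'Score A' and 'Score B' values copied from the example output above
df = pd.DataFrame({
    'Score A': [5, 5, 1, 5, 4],
    'Score B': [4, 9, 2, 2, 4],
})

# Boolean Series: True where both columns hold the same value
is_equal = df['Score A'].eq(df['Score B'])

# True counts as 1, so sum() gives the number of matching rows
n_equal = int(is_equal.sum())

# Boolean indexing keeps only the rows where the columns match
matching_rows = df[is_equal]

print(is_equal.tolist())  # [False, False, False, False, True]
print(n_equal)            # 1
print(matching_rows)
```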