How to find all unique values in a dataframe column with pandas?


Examples of how to find all unique values in a dataframe column with pandas

Create a dataframe with pandas

Let's consider the following dataframe:

import pandas as pd

data = {'custumer id':['001','002','002','002','003','003','004','005','006'], 
        'custumer name':['Ben','Anna','Anna','Anna','Zoe','Zoe','Tom','John','Steve']}


df = pd.DataFrame(data)

print(df)

gives

  custumer id custumer name
0         001           Ben
1         002          Anna
2         002          Anna
3         002          Anna
4         003           Zoe
5         003           Zoe
6         004           Tom
7         005          John
8         006         Steve

Find all unique values in the column called 'custumer id'

To find all unique values in the column called 'custumer id', one solution is to use the pandas Series method unique():

df['custumer id'].unique()

returns in this example:

array(['001', '002', '003', '004', '005', '006'], dtype=object)
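As a complement, here is a minimal sketch that rebuilds the same dataframe and applies two related pandas methods not used above, nunique() and value_counts(). The variable names (unique_ids, n_unique, counts) are only illustrative.

```python
import pandas as pd

data = {'custumer id': ['001', '002', '002', '002', '003', '003', '004', '005', '006'],
        'custumer name': ['Ben', 'Anna', 'Anna', 'Anna', 'Zoe', 'Zoe', 'Tom', 'John', 'Steve']}

df = pd.DataFrame(data)

# unique() keeps the values in order of first appearance (it does not sort them);
# list() turns the returned array into a plain Python list
unique_ids = list(df['custumer id'].unique())

# nunique() returns only the number of distinct values
n_unique = df['custumer id'].nunique()

# value_counts() returns how many times each distinct value occurs
counts = df['custumer id'].value_counts()

print(unique_ids)
print(n_unique)
print(counts)
```

If only the number of distinct values matters, nunique() avoids building the full array first.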

Find all unique values with groupby()

Consider another dataframe:

import pandas as pd

data = {'custumer_id':['001','001','002','003','004','004','005','005','007'], 
        'household_id':['001','001','001','001','002','002','003','003','003']}


df = pd.DataFrame(data)

print(df)

returns

  custumer_id household_id
0         001          001
1         001          001
2         002          001
3         003          001
4         004          002
5         004          002
6         005          003
7         005          003
8         007          003

To find all unique values of 'custumer_id' for each 'household_id', one solution is:

df.groupby('household_id')['custumer_id'].unique()

gives

household_id
001    [001, 002, 003]
002              [004]
003         [005, 007]
Name: custumer_id, dtype: object
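The grouped result is a Series indexed by 'household_id', so the values for a single group can be looked up by label. The sketch below shows this, and also counts the distinct values per group with nunique(), which is not used above. The variable names are only illustrative.

```python
import pandas as pd

data = {'custumer_id': ['001', '001', '002', '003', '004', '004', '005', '005', '007'],
        'household_id': ['001', '001', '001', '001', '002', '002', '003', '003', '003']}

df = pd.DataFrame(data)

# One array of unique 'custumer_id' values per 'household_id'
unique_per_household = df.groupby('household_id')['custumer_id'].unique()

# Look up the unique values for a single household by its label
ids_household_001 = list(unique_per_household['001'])

# Number of distinct 'custumer_id' values per 'household_id'
n_per_household = df.groupby('household_id')['custumer_id'].nunique()

print(ids_household_001)
print(n_per_household)
```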
