Examples of how to find all unique values in a dataframe column with pandas
Create a dataframe with pandas
Let's consider the following dataframe:
import pandas as pd
data = {'custumer id':['001','002','002','002','003','003','004','005','006'],
'custumer name':['Ben','Anna','Anna','Anna','Zoe','Zoe','Tom','John','Steve']}
df = pd.DataFrame(data)
gives
  custumer id custumer name
0         001           Ben
1         002          Anna
2         002          Anna
3         002          Anna
4         003           Zoe
5         003           Zoe
6         004           Tom
7         005          John
8         006         Steve
Find all unique values in the column called 'custumer id'
To find all unique values in the column called 'custumer id', a solution is to use the pandas Series method unique(), which returns the distinct values in order of first appearance:
df['custumer id'].unique()
returns in this example:
array(['001', '002', '003', '004', '005', '006'], dtype=object)
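Beyond listing the unique values, it is often useful to count them. The following is a minimal sketch, reusing the dataframe above, that assumes you also want the number of distinct ids, the number of rows per id, and the dataframe without duplicated rows. The variable names (unique_ids, n_unique, counts, unique_rows) are illustrative only.

```python
import pandas as pd

data = {'custumer id': ['001', '002', '002', '002', '003', '003', '004', '005', '006'],
        'custumer name': ['Ben', 'Anna', 'Anna', 'Anna', 'Zoe', 'Zoe', 'Tom', 'John', 'Steve']}
df = pd.DataFrame(data)

# Distinct values, in order of first appearance (NumPy array)
unique_ids = df['custumer id'].unique()

# Number of distinct values
n_unique = df['custumer id'].nunique()

# Number of rows for each distinct value
counts = df['custumer id'].value_counts()

# Dataframe with fully duplicated rows removed
unique_rows = df.drop_duplicates()

print(unique_ids)
print(n_unique)
print(counts)
print(unique_rows)
```

Note that unique() does not sort its result. If a sorted list is needed, sorted(unique_ids) is one option.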
Find all unique values with groupby()
Let's consider another dataframe:
import pandas as pd
data = {'custumer_id':['001','001','002','003','004','004','005','005','007'],
'household_id':['001','001','001','001','002','002','003','003','003']}
df = pd.DataFrame(data)
print(df)
prints
  custumer_id household_id
0         001          001
1         001          001
2         002          001
3         003          001
4         004          002
5         004          002
6         005          003
7         005          003
8         007          003
To find all unique values of 'custumer_id' for each 'household_id', a solution is to group the rows by 'household_id' and apply unique() to the 'custumer_id' column:
df.groupby('household_id')['custumer_id'].unique()
gives
household_id
001 [001, 002, 003]
002 [004]
003 [005, 007]
Name: custumer_id, dtype: object
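The grouped result is a Series whose values are arrays, which can be awkward to use directly. As a rough sketch, assuming you want either a count of distinct customers per household or a plain Python dictionary, something like the following could work. The names unique_per_household, counts_per_household and as_dict are illustrative only.

```python
import pandas as pd

data = {'custumer_id': ['001', '001', '002', '003', '004', '004', '005', '005', '007'],
        'household_id': ['001', '001', '001', '001', '002', '002', '003', '003', '003']}
df = pd.DataFrame(data)

# Unique customer ids per household (Series of arrays, indexed by household_id)
unique_per_household = df.groupby('household_id')['custumer_id'].unique()

# Number of distinct customer ids per household
counts_per_household = df.groupby('household_id')['custumer_id'].nunique()

# Convert the grouped result to a dictionary of plain Python lists
as_dict = {household: list(ids) for household, ids in unique_per_household.items()}

print(counts_per_household)
print(as_dict)
```

Using nunique() avoids building the arrays when only the counts are needed.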