Examples of how to find all unique values in a dataframe column with pandas
Create a dataframe with pandas
Let's consider the following dataframe
import pandas as pd

data = {'custumer id':['001','002','002','002','003','003','004','005','006'],
        'custumer name':['Ben','Anna','Anna','Anna','Zoe','Zoe','Tom','John','Steve']}

df = pd.DataFrame(data)
gives
  custumer id custumer name
0         001           Ben
1         002          Anna
2         002          Anna
3         002          Anna
4         003           Zoe
5         003           Zoe
6         004           Tom
7         005          John
8         006         Steve
Find all unique values in the column called 'custumer id'
To find all unique values in the column called 'custumer id', a solution is to use the pandas Series method unique():
df['custumer id'].unique()
returns in this example:
array(['001', '002', '003', '004', '005', '006'], dtype=object)
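As a minimal sketch of how the result might be used afterwards: unique() returns the values in order of first appearance, so the result can be converted to a plain Python list, and nunique() gives the number of distinct values directly. The variable names below (unique_ids, unique_id_list, n_unique) are illustrative and not part of the original example.

```python
import pandas as pd

data = {'custumer id': ['001', '002', '002', '002', '003', '003', '004', '005', '006']}
df = pd.DataFrame(data)

# Distinct values, in order of first appearance (no sorting is applied)
unique_ids = df['custumer id'].unique()

# Convert the array of distinct values to a plain Python list
unique_id_list = list(unique_ids)

# Count the distinct values without building the array first
n_unique = df['custumer id'].nunique()
```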
Find all unique values with groupby()
Let's consider another dataframe:
import pandas as pd

data = {'custumer_id':['001','001','002','003','004','004','005','005','007'],
        'household_id':['001','001','001','001','002','002','003','003','003']}

df = pd.DataFrame(data)

print(df)
returns
  custumer_id household_id
0         001          001
1         001          001
2         002          001
3         003          001
4         004          002
5         004          002
6         005          003
7         005          003
8         007          003
To find all unique values of 'custumer_id' for each 'household_id', a solution is to group by 'household_id' and apply unique() to the 'custumer_id' column:
df.groupby('household_id')['custumer_id'].unique()
gives
household_id
001    [001, 002, 003]
002              [004]
003         [005, 007]
Name: custumer_id, dtype: object
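The grouped result is a Series indexed by 'household_id', where each entry holds the array of distinct 'custumer_id' values for that household. The sketch below, under that assumption, shows one way to read the values for a single household and to count distinct customers per household with nunique(). The names unique_by_household, ids_in_001 and counts are illustrative only.

```python
import pandas as pd

data = {'custumer_id': ['001', '001', '002', '003', '004', '004', '005', '005', '007'],
        'household_id': ['001', '001', '001', '001', '002', '002', '003', '003', '003']}
df = pd.DataFrame(data)

# One array of distinct custumer_id values per household_id
unique_by_household = df.groupby('household_id')['custumer_id'].unique()

# Look up a single household by its index label
ids_in_001 = list(unique_by_household.loc['001'])

# Number of distinct custumer_id values per household_id
counts = df.groupby('household_id')['custumer_id'].nunique()
```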
