Examples of how to count the occurrences of each element in a pandas DataFrame column:
Using value_counts()
Let's take, for example, the file "default of credit card clients Data Set", which can be downloaded here
>>> import pandas as pd
>>> df = pd.read_excel('default of credit card clients.xls', header=1)
To count how many times each value occurs in the default payment column, one solution is to use value_counts():
>>> df['default payment next month'].value_counts()
0    23364
1     6636
Name: default payment next month, dtype: int64
Example with the column sex:
>>> df['SEX'].value_counts()
2    18112
1    11888
Name: SEX, dtype: int64
Example with the column age:
>>> df['AGE'].value_counts()
29    1605
27    1477
28    1409
30    1395
26    1256
31    1217
25    1186
34    1162
32    1158
33    1146
24    1127
35    1113
36    1108
37    1041
39     954
38     944
23     931
40     870
41     824
42     794
44     700
43     670
45     617
46     570
22     560
47     501
48     466
49     452
50     411
51     340
53     325
52     304
54     247
55     209
56     178
58     122
57     122
59      83
60      67
21      67
61      56
62      44
63      31
64      31
66      25
65      24
67      16
69      15
70      10
68       5
73       4
71       3
72       3
75       3
74       1
79       1
Name: AGE, dtype: int64
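The Excel file is not required to experiment with value_counts(). The following is a minimal sketch on a small made-up column (the name df_toy and its values are hypothetical and do not come from the data set). It also shows one possible way to read off the count of a single element, either from the value_counts() result or with a boolean mask.

```python
import pandas as pd

# Hypothetical toy data standing in for the 'SEX' column (not the real data set).
df_toy = pd.DataFrame({'SEX': [2, 1, 2, 2, 1, 2]})

# Frequency of every distinct value, sorted by count (descending).
counts = df_toy['SEX'].value_counts()

# Count of one specific element; .get() falls back to 0 if the value never occurs.
n_two = counts.get(2, 0)
n_three = counts.get(3, 0)

# Same count obtained with a boolean mask instead of value_counts().
n_two_mask = int((df_toy['SEX'] == 2).sum())

print(counts)
print(n_two, n_three, n_two_mask)
```

The lookup through value_counts() is convenient when several values are needed, while the boolean mask avoids computing the whole frequency table when only one value matters.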
It is also possible to group the values into a given number of equal-width bins with the bins argument:
>>> df['AGE'].value_counts(bins=10)
26.800    8261
32.600    6514
20.942    5127
38.400    4812
44.200    3017
50.000    1425
55.800     628
61.600     171
67.400      40
73.200       5
Name: AGE, dtype: int64
and to customize the bin edges:
>>> df['AGE'].value_counts(bins=[0,10,20,25,30,35,40,60,80])
40    8002
25    7142
30    5796
35    4917
20    3871
60     272
10       0
0        0
Name: AGE, dtype: int64
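To make the binning easier to follow, here is a small sketch on made-up ages (the Series ages and the bin edges are hypothetical). It assumes a recent pandas version, where the index of the result is displayed as intervals rather than as the left bin edges shown in the output above; the exact formatting may differ between versions.

```python
import pandas as pd

# Hypothetical ages (not the real data set).
ages = pd.Series([21, 24, 25, 29, 33, 41, 58, 79], name='AGE')

# Explicit bin edges: each value falls into a right-closed interval (left, right],
# and the lowest edge is included in the first bin.
binned = ages.value_counts(bins=[20, 30, 40, 60, 80])

# sort=False keeps the bins in edge order instead of ordering them by count.
binned_in_order = ages.value_counts(bins=[20, 30, 40, 60, 80], sort=False)

print(binned)
print(binned_in_order)
```

Keeping the bins in edge order with sort=False is often easier to read when the result is meant to be viewed as a histogram.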
Column with missing data
If we add some missing data:
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.read_excel('default of credit card clients.xls', header=1)
>>> df.iloc[[2,7,4,99,10,130],:] = np.nan
and then apply value_counts(), the missing values are automatically discarded:
>>> df['SEX'].value_counts()
2.0    18108
1.0    11886
Name: SEX, dtype: int64
To include the count of missing values, just add dropna=False:
>>> df['SEX'].value_counts(dropna=False)
2.0    18108
1.0    11886
NaN        6
Name: SEX, dtype: int64
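The same behaviour can be reproduced without the Excel file. The sketch below uses a made-up column (the Series sex and its values are hypothetical) and cross-checks the NaN count with isna(). Note that the values appear as 2.0 and 1.0 because a NumPy-backed integer column that contains NaN is stored as float.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column with two missing entries (not the real data set).
sex = pd.Series([2, 1, 2, np.nan, 1, 2, np.nan], name='SEX')

# Default behaviour: NaN is excluded from the result.
without_nan = sex.value_counts()

# dropna=False adds a NaN row holding the number of missing values.
with_nan = sex.value_counts(dropna=False)

# The number of missing values can be cross-checked directly with isna().
n_missing = int(sex.isna().sum())

print(without_nan)
print(with_nan)
print(n_missing)
```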
