Examples of how to convert quantitative data to categorical data with pandas using cut:
Create synthetic data
Let's first create some fake numeric data (random integers between 0 and 100):
import random
l = [random.randint(0, 100) for i in range(10)]
returns for example
[66, 44, 62, 99, 82, 13, 7, 58, 60, 38]
Save data in a pandas dataframe
import pandas as pd
import numpy as np
data = np.array(l)
df = pd.DataFrame(data, columns=['x'])
print(df)
returns
    x
0  66
1  44
2  62
3  99
4  82
5  13
6   7
7  58
8  60
9  38
Aggregate
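To convert numeric data to categorical data, one solution with pandas is to use pandas.cut (see pandas.pydata.org). Here the bin edges [0,25,50,75,100] define four intervals, (0, 25], (25, 50], (50, 75] and (75, 100], labelled 'A' to 'D':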
pd.cut(df['x'], [0,25,50,75,100], labels=['A', 'B', 'C', 'D'])
returns
0    C
1    B
2    C
3    D
4    D
5    A
6    A
7    C
8    C
9    B
Name: x, dtype: category
Categories (4, object): ['A' < 'B' < 'C' < 'D']
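One detail worth checking: by default, pandas.cut builds intervals that are open on the left and closed on the right. Since random.randint(0,100) can return 0, such a value would fall outside the first bin and become NaN. The sketch below uses a small hand-picked series (not the random data above) to show the default behaviour and the effect of include_lowest=True:

```python
import pandas as pd

# Hand-picked values that sit on or near the bin edges (illustrative only)
s = pd.Series([0, 25, 26, 100], name="x")
bins = [0, 25, 50, 75, 100]
labels = ["A", "B", "C", "D"]

# Default: first interval is (0, 25], so 0 is not assigned to any bin
default_cut = pd.cut(s, bins, labels=labels)

# include_lowest=True makes the first interval include its left edge
inclusive_cut = pd.cut(s, bins, labels=labels, include_lowest=True)

print(default_cut.tolist())    # [nan, 'A', 'B', 'D']
print(inclusive_cut.tolist())  # ['A', 'A', 'B', 'D']
```

If the data can contain the lowest bin edge, passing include_lowest=True is probably the simplest way to avoid unexpected missing values.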
Create a new column:
df['Cx'] = pd.cut(df['x'], [0,25,50,75,100], labels=['A', 'B', 'C', 'D'])
print(df)
returns
    x Cx
0  66  C
1  44  B
2  62  C
3  99  D
4  82  D
5  13  A
6   7  A
7  58  C
8  60  C
9  38  B
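Once the categorical column exists, it can be used to aggregate the data. As a minimal sketch, assuming the same ten sample values shown earlier (hard-coded here so the result is reproducible), one could count the rows in each category and compute the mean of x per category:

```python
import pandas as pd

# Same values as the example output above, hard-coded for reproducibility
df = pd.DataFrame({"x": [66, 44, 62, 99, 82, 13, 7, 58, 60, 38]})
df["Cx"] = pd.cut(df["x"], [0, 25, 50, 75, 100], labels=["A", "B", "C", "D"])

# Number of rows per category, ordered A < B < C < D
counts = df["Cx"].value_counts().sort_index()

# Mean of x within each category; observed=False keeps empty categories too
means = df.groupby("Cx", observed=False)["x"].mean()

print(counts)
print(means)
```

With these values, categories A, B and D each contain two rows and C contains four.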
References
| Links | Site |
|---|---|
| Group by of a float column using pandas | stackoverflow |
| pandas.cut | pandas.pydata.org |
