How to convert a numeric column (quantitative data) into categorical data with pandas using cut?

Published: September 09, 2022

Tags: Python; Pandas; DataFrame;


Examples of how to convert quantitative data to categorical data with pandas using cut:

Create synthetic data

Let's first create some fake numeric data (10 random integers between 0 and 100):

import random

l = [random.randint(0, 100) for _ in range(10)]

returns, for example:

[66, 44, 62, 99, 82, 13, 7, 58, 60, 38]
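Because the values are random, each run gives a different list. If reproducible numbers are wanted, one option is to seed the generator first. The sketch below is only an illustration: the seed value 42 and the helper name make_data are arbitrary choices, not part of the original example.

```python
import random

def make_data(seed, n=10):
    # Hypothetical helper: seed the generator, then draw n integers in [0, 100]
    random.seed(seed)
    return [random.randint(0, 100) for _ in range(n)]

# The same seed always yields the same list
first = make_data(42)
second = make_data(42)

print(first == second)
```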

Save data in a pandas dataframe

import pandas as pd
import numpy as np

data = np.array(l)

df = pd.DataFrame(data,columns=['x'])

print(df)

returns

    x
0  66
1  44
2  62
3  99
4  82
5  13
6   7
7  58
8  60
9  38

Convert to categories with cut

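To convert numeric data to categorical data, one solution with pandas is to use cut (see pandas.cut on pandas.pydata.org). It assigns each value to a bin defined by a list of edges and, optionally, gives each bin a label. Here the edges [0,25,50,75,100] define four bins labelled A to D: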

pd.cut(df['x'], [0,25,50,75,100], labels=['A', 'B', 'C', 'D'])

returns

0    C
1    B
2    C
3    D
4    D
5    A
6    A
7    C
8    C
9    B
Name: x, dtype: category
Categories (4, object): ['A' < 'B' < 'C' < 'D']
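One detail worth checking: by default, cut builds intervals that are closed on the right, so the bins above are (0, 25], (25, 50], (50, 75] and (75, 100]. A value of exactly 0, which random.randint(0,100) can return, falls outside every bin and becomes NaN. A minimal sketch of the difference, using a hand-picked Series s (hypothetical values chosen to sit on the bin edges) and the include_lowest parameter of pandas.cut:

```python
import pandas as pd

# Hand-picked values (not from the example above) that sit on the bin edges
s = pd.Series([0, 25, 26, 100], name='x')

bins = [0, 25, 50, 75, 100]
labels = ['A', 'B', 'C', 'D']

# Default: intervals are (0, 25], (25, 50], ... so 0 belongs to no bin
default_cut = pd.cut(s, bins, labels=labels)

# include_lowest=True closes the first interval on the left, so 0 lands in 'A'
inclusive_cut = pd.cut(s, bins, labels=labels, include_lowest=True)

print(default_cut.tolist())    # [nan, 'A', 'B', 'D']
print(inclusive_cut.tolist())  # ['A', 'A', 'B', 'D']
```

If the data can contain the lowest edge, passing include_lowest=True is probably the safer choice.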

Create a new column:

df['Cx'] = pd.cut(df['x'], [0,25,50,75,100], labels=['A', 'B', 'C', 'D'])

print(df)

returns

    x Cx
0  66  C
1  44  B
2  62  C
3  99  D
4  82  D
5  13  A
6   7  A
7  58  C
8  60  C
9  38  B
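Once the categorical column exists, a quick sanity check is to count how many rows fall in each category. The sketch below rebuilds the dataframe from the example values shown above (so it runs on its own) and uses value_counts followed by sort_index to list the counts in category order; the variable name counts is just an illustrative choice.

```python
import pandas as pd

# Same values as the example output above
df = pd.DataFrame({'x': [66, 44, 62, 99, 82, 13, 7, 58, 60, 38]})
df['Cx'] = pd.cut(df['x'], [0, 25, 50, 75, 100], labels=['A', 'B', 'C', 'D'])

# Count rows per category, then order the result by category (A, B, C, D)
counts = df['Cx'].value_counts().sort_index()

print(counts.to_dict())  # {'A': 2, 'B': 2, 'C': 4, 'D': 2}
```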

References

Group by of a float column using pandas (stackoverflow)
pandas.cut (pandas.pydata.org)