How to extract the value names and counts from value_counts() in pandas?


Example of how to extract the value names and counts from value_counts() in pandas:

Create a dataframe with pandas

Let's create a simple dataframe with pandas:

import pandas as pd
import numpy as np
import random

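# 20 random surface labels; '' stands for a missing label
Surface = [random.choice(['Ocean', 'Snow', 'Desert', 'Forest', '']) for _ in range(20)]
# Three columns of uniform random numbers in [0, 1)
c1 = np.random.uniform(0, 1, size=20)
c2 = np.random.uniform(0, 1, size=20)
c3 = np.random.uniform(0, 1, size=20)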

data = {'Surface':Surface,
        'c1':c1, 
        'c2':c2, 
        'c3':c3}

df = pd.DataFrame(data)

print(df)

returns, for example (the data are random, so your output will differ):

   Surface        c1        c2        c3
0   Forest  0.273228  0.547091  0.995114
1   Desert  0.585434  0.229137  0.858714
2           0.215637  0.554016  0.347851
3   Desert  0.684748  0.345694  0.292525
4    Ocean  0.683991  0.916057  0.943469
5   Desert  0.342695  0.024518  0.226359
6     Snow  0.702004  0.520701  0.194929
7   Desert  0.812882  0.978291  0.667365
8    Ocean  0.072272  0.038988  0.391696
9     Snow  0.948763  0.622616  0.298193
10          0.438773  0.024421  0.133489
11  Forest  0.277632  0.775055  0.162494
12    Snow  0.576751  0.840382  0.784162
13   Ocean  0.675573  0.500358  0.968885
14   Ocean  0.717382  0.485487  0.209018
15  Desert  0.473825  0.992767  0.383084
16  Desert  0.918840  0.545654  0.034444
17    Snow  0.820191  0.414088  0.315009
18          0.086068  0.724317  0.009183
19   Ocean  0.099506  0.849096  0.431591

Use value_counts() on the column named 'Surface'

Example of results using value_counts() on the column named 'Surface':

df['Surface'].value_counts()

returns

Desert    6
Ocean     5
Snow      4
          3
Forest    2
Name: Surface, dtype: int64
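Note that the blank name in the output above is the empty string '', which value_counts() treats like any other value. If empty labels should not be counted, one possible approach (a minimal sketch on a small made-up Series, not part of the original example) is to turn them into NaN first, since value_counts() skips NaN by default:

```python
import pandas as pd

# Small fixed sample (hypothetical data) with two empty labels
surface = pd.Series(['Desert', '', 'Ocean', 'Desert', '', 'Desert'])

# mask() replaces the entries where the condition is True with NaN
cleaned = surface.mask(surface == '')

counts_with_empty = surface.value_counts()     # '' is counted as a name
counts_without_empty = cleaned.value_counts()  # NaN is dropped by default

print(counts_with_empty)
print(counts_without_empty)
```

Passing dropna=False to value_counts() would keep the NaN entries as their own row instead.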

Get the list of names from value_counts()

The names are stored in the index of the Series returned by value_counts(). To get them as a Python list, call tolist() on that index:

df['Surface'].value_counts().index.tolist()

returns:

['Desert', 'Ocean', 'Snow', '', 'Forest']
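The counts can be extracted in the same way as the names. The following is a minimal sketch on a small fixed Series (made-up data, so the result is reproducible) that pulls out the names, the counts, and a name-to-count dictionary:

```python
import pandas as pd

# Small fixed sample (hypothetical data) so the result is reproducible
surface = pd.Series(['Desert', 'Ocean', 'Desert', 'Snow', 'Ocean', 'Desert'])

counts = surface.value_counts()

names = counts.index.tolist()   # value names, most frequent first
values = counts.tolist()        # matching counts, in the same order
as_dict = counts.to_dict()      # mapping name -> count

print(names)
print(values)
print(as_dict)
```

The two lists are aligned, so names[i] always goes with values[i].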

Get the name of the first item in value_counts()

Example of how to get the name of the first (most frequent) item:

df['Surface'].value_counts().index[0]

returns

'Desert'

Get the count value of the first item in value_counts()

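Example of how to get the count of the first item. Use .iloc[0] for positional access: the index holds strings, and a plain [0] on such a Series is deprecated in recent pandas versions:

df['Surface'].value_counts().iloc[0]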

returns

6
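A count can also be looked up by name rather than by position. The sketch below uses a small made-up Series; the name 'Jungle' is only there to illustrate a value that does not occur:

```python
import pandas as pd

# Small fixed sample (hypothetical data)
surface = pd.Series(['Desert', 'Ocean', 'Desert', 'Snow', 'Ocean', 'Desert'])
counts = surface.value_counts()

# Label-based lookup: the index of counts holds the value names
desert_count = int(counts['Desert'])

# get() returns a default instead of raising KeyError for an absent name
jungle_count = int(counts.get('Jungle', 0))

print(desert_count)
print(jungle_count)
```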

Create a loop over value_counts() items

Let's now create a simple for loop that goes through all items in value_counts(). Calling items() yields (name, count) pairs, so value_counts() is computed only once:

for name, count in df['Surface'].value_counts().items():
    print('Name :', name)
    print('Counts :', count)

returns

Name : Desert
Counts : 6
Name : Ocean
Counts : 5
Name : Snow
Counts : 4
Name : 
Counts : 3
Name : Forest
Counts : 2
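If the names and counts are needed as a table rather than printed one by one, the result can be converted to a dataframe. This is a minimal sketch on made-up data; the column names 'Surface' and 'count' are set explicitly here because the default names differ between pandas versions:

```python
import pandas as pd

# Small fixed sample (hypothetical data)
surface = pd.Series(['Desert', 'Ocean', 'Desert', 'Snow', 'Ocean', 'Desert'],
                    name='Surface')

table = (
    surface.value_counts()
    .rename_axis('Surface')       # name the index, which holds the value names
    .reset_index(name='count')    # move the index into a column, name the counts
)

print(table)
```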

Create a bar chart with matplotlib

Note: it is also possible to plot the counts as a simple bar chart directly from value_counts():

import matplotlib.pyplot as plt

df['Surface'].value_counts().plot(kind='bar')

plt.show()
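The extracted names and counts can also be passed to matplotlib directly, which gives more control over labels. A minimal sketch, assuming a small made-up Series and a hypothetical output file name surface_counts.png:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so this also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Small fixed sample (hypothetical data)
surface = pd.Series(['Desert', 'Ocean', 'Desert', 'Snow', 'Ocean', 'Desert'])
counts = surface.value_counts()

names = counts.index.tolist()
values = counts.tolist()

fig, ax = plt.subplots()
bars = ax.bar(names, values)       # one bar per value name
ax.set_xlabel('Surface')
ax.set_ylabel('Count')
fig.savefig('surface_counts.png')  # hypothetical output file name
```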
