How to reset the index of a pandas DataFrame?


A common task when working with pandas DataFrames is resetting the index. This can be done easily with the reset_index() method.

Create a dataframe

To start, let's generate a DataFrame using synthetic data:

import pandas as pd
import numpy as np

data = np.random.randint(100, size=(20,2))

df = pd.DataFrame(data=data,columns=['A','B'])

Example of output:

     A   B
0   56  43
1   52  38
2   78  33
3   57  79
4   80  13
5   14  20
6   79  27
7   11  49
8   68  44
9    7  67
10  61  39
11  46   4
12  94  78
13  38   2
14  29  29
15  34  14
16  18  66
17  11  63
18  30  85
19   3  21

Note that the DataFrame index (the row labels) appears in the leftmost column of the output above.
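The index can also be inspected directly. The following is a minimal sketch on a small hand-written DataFrame (the name df_demo and its values are illustrative, not taken from the example above); it shows that pandas assigns a default RangeIndex starting at 0 when no index is given.

```python
import pandas as pd

# Small illustrative DataFrame; the values are arbitrary.
df_demo = pd.DataFrame({'A': [56, 52, 78], 'B': [43, 38, 33]})

# With no explicit index, pandas assigns a RangeIndex: 0, 1, 2, ...
print(df_demo.index)
print(df_demo.index.tolist())  # [0, 1, 2]
```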

Now let's draw, for example, a random sample of 5 rows:

df = df.sample(5)

gives

      A   B
15  34  14
17  11  63
8   68  44
11  46   4
0   56  43

As expected, the sampled rows keep their original index labels, so the index is no longer sequential.
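Because sample() is random, the rows drawn change from run to run. As a sketch, assuming a repeatable draw is wanted, the random_state parameter can be fixed (42 is an arbitrary choice, and df_demo is illustrative data in which column A simply mirrors the original index):

```python
import numpy as np
import pandas as pd

# Illustrative data: 20 rows with a default 0..19 index.
# Column A mirrors the index so the original labels are easy to track.
df_demo = pd.DataFrame({'A': np.arange(20), 'B': np.arange(20) * 2})

# random_state makes the draw repeatable (42 is an arbitrary seed).
sample = df_demo.sample(5, random_state=42)

# The sampled rows keep their original labels.
print(sample.index.tolist())
```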

Reset dataframe index (case 1)

To reset the DataFrame index, one solution is to use pandas.DataFrame.reset_index:

df = df.reset_index()

Output

   index   A   B
0     15  34  14
1     17  11  63
2      8  68  44
3     11  46   4
4      0  56  43

Note that reset_index() also creates a new column called index that stores the previous index.
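This behaviour can be checked on a small, deterministic example. The sketch below uses an illustrative DataFrame (df_demo, with hand-picked labels) rather than a random sample:

```python
import pandas as pd

# Illustrative frame with a non-sequential index, as after sampling.
df_demo = pd.DataFrame({'A': [34, 11, 68], 'B': [14, 63, 44]},
                       index=[15, 17, 8])

# reset_index() returns a new DataFrame; df_demo itself is unchanged.
out = df_demo.reset_index()

print(out.columns.tolist())   # ['index', 'A', 'B']
print(out['index'].tolist())  # [15, 17, 8]
print(out.index.tolist())     # [0, 1, 2]
```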

Reset dataframe index (case 2)

If you do not want this extra column, use the drop=True option. First, re-create a new sample:

data = np.random.randint(100, size=(20,2))

df = pd.DataFrame(data=data,columns=['A','B'])

df = df.sample(5)

Output

     A   B
17  88  80
14  10  25
19  23  26
8   32  45
4   40  51

To reset the index of the DataFrame df to its default values and discard the old index instead of keeping it as a column, use df.reset_index(drop=True):

df = df.reset_index(drop=True)

Output:

    A   B
0  88  80
1  10  25
2  23  26
3  32  45
4  40  51
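Sampling and resetting can also be chained in a single expression. This is a minimal sketch on illustrative data (df_full and the seed 0 are assumptions made for the example):

```python
import numpy as np
import pandas as pd

# Illustrative data with a default 0..19 index.
df_full = pd.DataFrame({'A': np.arange(20), 'B': np.arange(20) * 2})

# Draw 5 rows, then discard their original labels in one step.
sample = df_full.sample(5, random_state=0).reset_index(drop=True)

print(sample.index.tolist())    # [0, 1, 2, 3, 4]
print(sample.columns.tolist())  # ['A', 'B']
```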

Reset dataframe index (case 3)

To start the index at 1 instead of 0, one solution is:

df.index = df.index + 1

print(df)

Output

    A   B
1  88  80
2  10  25
3  23  26
4  32  45
5  40  51

or to shift the index by any other offset:

df.index = df.index + 200

print(df)

Output

      A   B
201  88  80
202  10  25
203  23  26
204  32  45
205  40  51
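Adding an offset shifts whatever labels are already there, so it only gives a clean sequence when the index has been reset first. As an alternative sketch, the index can be assigned directly from a pandas.RangeIndex, whatever the current labels are (df_demo and start are illustrative names):

```python
import pandas as pd

# Illustrative frame whose index is not sequential.
df_demo = pd.DataFrame({'A': [88, 10, 23], 'B': [80, 25, 26]},
                       index=[17, 14, 19])

start = 1  # hypothetical first label

# Replace the index with start, start + 1, ... regardless of the old labels.
df_demo.index = pd.RangeIndex(start=start, stop=start + len(df_demo))

print(df_demo.index.tolist())  # [1, 2, 3]
```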

Apply a function to dataframe index

To apply a function to the DataFrame index, one solution is:

df = df.reset_index()

df['index'] = df['index'].apply(np.sqrt)

df.index = df['index']

df.drop(['index'], axis=1, inplace=True)

Output

            A   B
index            
14.177447  88  80
14.212670  10  25
14.247807  23  26
14.282857  32  45
14.317821  40  51
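A shorter alternative, shown here as a sketch rather than as the method used above, is Index.map, which applies a function to each label without going through a temporary column. The example uses an illustrative DataFrame whose labels are perfect squares:

```python
import numpy as np
import pandas as pd

# Illustrative frame; the labels are perfect squares so the result is exact.
df_demo = pd.DataFrame({'A': [88, 10, 23], 'B': [80, 25, 26]},
                       index=[1, 4, 9])

# Apply np.sqrt to every index label and assign the result back.
df_demo.index = df_demo.index.map(np.sqrt)

print(df_demo.index.tolist())  # [1.0, 2.0, 3.0]
```

Unlike the reset_index() approach, this leaves the index unnamed, and the columns are left untouched.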

Use an existing dataframe column as index

Another option is to use an existing DataFrame column as the index with pandas.DataFrame.set_index:

data = np.random.randint(100, size=(20,2))

df = pd.DataFrame(data=data,columns=['A','B'])

df = df.set_index('A')

returns for example

     B
A     
30  35
73  48
27  64
39  99
26  43
98  99
5   86
36  75
41  86
32  59
15  84
29  12
99  58
58   7
87  86
12  90
18   8
46  82
2   28
60  17
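set_index and reset_index are inverse operations in this case: calling reset_index() on the result moves A back into a regular column. A minimal sketch with illustrative values:

```python
import pandas as pd

# Illustrative frame with a default index.
df_demo = pd.DataFrame({'A': [30, 73, 27], 'B': [35, 48, 64]})

# Move column A into the index.
indexed = df_demo.set_index('A')
print(indexed.index.name)        # A
print(indexed.columns.tolist())  # ['B']

# reset_index() turns the named index back into a column.
restored = indexed.reset_index()
print(restored.columns.tolist())  # ['A', 'B']
```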

References

Links Site
pandas.DataFrame.reset_index pandas.pydata.org
pandas.DataFrame.set_index pandas.pydata.org
pandas.DataFrame.drop pandas.pydata.org