A common task when working with pandas DataFrames is resetting the index. This can be done with the reset_index() method.
Create a dataframe
To start, let's generate a DataFrame using synthetic data:
import pandas as pd
import numpy as np

data = np.random.randint(100, size=(20,2))
df = pd.DataFrame(data=data,columns=['A','B'])
Example of output:
     A   B
0   56  43
1   52  38
2   78  33
3   57  79
4   80  13
5   14  20
6   79  27
7   11  49
8   68  44
9    7  67
10  61  39
11  46   4
12  94  78
13  38   2
14  29  29
15  34  14
16  18  66
17  11  63
18  30  85
19   3  21
Note that the DataFrame index (row labels) appears in the first column of the output above.
Now let's draw a random sample of 5 rows, for example:
df = df.sample(5)
which gives, for example:
     A   B
15  34  14
17  11  63
8   68  44
11  46   4
0   56  43
As expected, the sample keeps the original row labels, so its index is no longer ordered or contiguous.
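Because both the data and the sample are random, the values shown here will differ on each run. A minimal sketch of how the example could be made reproducible, assuming a seeded NumPy generator and the random_state argument of sample() (the seed values 0 and 42 are arbitrary choices, not part of the original example):

```python
import numpy as np
import pandas as pd

# Seeded generator so the synthetic data is identical on every run
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(20, 2)), columns=["A", "B"])

# random_state fixes which rows are drawn by sample()
sample_1 = df.sample(5, random_state=42)
sample_2 = df.sample(5, random_state=42)

print(sample_1)
print(sample_1.equals(sample_2))
```

Both samples contain the same rows with the same original labels, which makes the later steps easier to compare between runs.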
Reset dataframe index (case 1)
To reset the DataFrame index, one solution is to use pandas.DataFrame.reset_index:
df = df.reset_index()
Output
   index   A   B
0     15  34  14
1     17  11  63
2      8  68  44
3     11  46   4
4      0  56  43
Note that reset_index() also creates a new column called index that stores the previous index.
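The behaviour described above can be checked on a small hand-built frame. This is only an illustrative sketch: the values mimic the sample shown earlier, and the column name original_row is a hypothetical label chosen for the example.

```python
import pandas as pd

# Small frame with a non-default index, mimicking a sampled DataFrame
df = pd.DataFrame({"A": [34, 11, 68], "B": [14, 63, 44]}, index=[15, 17, 8])

out = df.reset_index()

print(out.columns.tolist())   # ['index', 'A', 'B']
print(out["index"].tolist())  # [15, 17, 8]
print(out.index.tolist())     # [0, 1, 2]

# Optionally give the saved index a more descriptive name
renamed = out.rename(columns={"index": "original_row"})
print(renamed.columns.tolist())  # ['original_row', 'A', 'B']
```

Keeping the old labels in a column can be useful when the sampled rows need to be traced back to the original DataFrame.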
Reset dataframe index (case 2)
If you do not want this extra column, the old index can be dropped instead. First, re-create a sample to start again from a DataFrame with a shuffled index:
data = np.random.randint(100, size=(20,2))
df = pd.DataFrame(data=data,columns=['A','B'])
df = df.sample(5)
Output
     A   B
17  88  80
14  10  25
19  23  26
8   32  45
4   40  51
To reset the index of df to the default integer labels and discard the old index instead of keeping it as a column, pass drop=True:
df = df.reset_index(drop=True)
Output:
    A   B
0  88  80
1  10  25
2  23  26
3  32  45
4  40  51
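Sampling and resetting can also be chained in a single expression. A short sketch under the same setup as above (the variable name sampled is introduced here for illustration):

```python
import numpy as np
import pandas as pd

data = np.random.randint(100, size=(20, 2))
df = pd.DataFrame(data=data, columns=["A", "B"])

# Draw 5 random rows, then discard their original labels
sampled = df.sample(5).reset_index(drop=True)

print(sampled.index.tolist())    # [0, 1, 2, 3, 4]
print(sampled.columns.tolist())  # ['A', 'B']
```

With drop=True no index column is added, so the result has only the original A and B columns.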
Reset dataframe index (case 3)
To start the index at 1 instead of 0, one solution is to add 1 to the index once it has been reset:
df.index = df.index + 1
print(df)
Output
    A   B
1  88  80
2  10  25
3  23  26
4  32  45
5  40  51
The same approach shifts the index by any offset. For example, adding 200 to the index obtained above (1 to 5):
df.index = df.index + 200
print(df)
Output
      A   B
201  88  80
202  10  25
203  23  26
204  32  45
205  40  51
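Adding an offset only yields 1, 2, 3, ... when the index has already been reset to 0, 1, 2, .... As an alternative sketch, a fresh 1-based index can be assigned directly with pd.RangeIndex, whatever the current labels are (the small frame below is made up for the example):

```python
import pandas as pd

# Frame with arbitrary labels, as obtained after sampling
df = pd.DataFrame({"A": [88, 10, 23], "B": [80, 25, 26]}, index=[17, 14, 19])

# Build a new index running from 1 to len(df), ignoring the old labels
df.index = pd.RangeIndex(start=1, stop=len(df) + 1)

print(df.index.tolist())  # [1, 2, 3]
```

The stop value is exclusive, hence len(df) + 1.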
Apply a function to dataframe index
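To apply a function to the DataFrame index, one solution is to move the index into a column with reset_index(), transform that column, and set it back as the index. Here np.sqrt is applied to the index obtained above (201 to 205):
df = df.reset_index()
df['index'] = df['index'].apply(np.sqrt)
df.index = df['index']
df.drop(['index'], axis=1, inplace=True)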
Output
            A   B
index
14.177447  88  80
14.212670  10  25
14.247807  23  26
14.282857  32  45
14.317821  40  51
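A shorter alternative, sketched here rather than taken from the steps above, is to transform the index in place with Index.map, which avoids the round trip through a temporary column. The index values 1, 4 and 9 are chosen only so that the square roots are easy to read:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [88, 10, 23], "B": [80, 25, 26]}, index=[1, 4, 9])

# Apply the function to every index label and assign the result back
df.index = df.index.map(np.sqrt)

print(df.index.tolist())    # [1.0, 2.0, 3.0]
print(df.columns.tolist())  # ['A', 'B']
```

Unlike the column-based version, this leaves the index unnamed instead of naming it index.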
Use an existing dataframe column as index
Another option is to use an existing DataFrame column as the index with pandas.DataFrame.set_index:
data = np.random.randint(100, size=(20,2))
df = pd.DataFrame(data=data,columns=['A','B'])
df = df.set_index('A')
which returns, for example:
     B
A
30  35
73  48
27  64
39  99
26  43
98   9
95  86
36  75
41  86
32  59
15  84
29  12
99  58
58   7
87  86
12  90
18  84
6   82
2   28
60  17
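set_index and reset_index are inverse operations in the simple case, which the following sketch illustrates on a small made-up frame (the names indexed and restored are introduced for the example):

```python
import pandas as pd

df = pd.DataFrame({"A": [30, 73, 27], "B": [35, 48, 64]})

# Column A becomes the index and is removed from the columns
indexed = df.set_index("A")
print(indexed.index.name)        # A
print(indexed.columns.tolist())  # ['B']

# reset_index() moves A back into a regular column
restored = indexed.reset_index()
print(restored.columns.tolist())  # ['A', 'B']
print(restored.index.tolist())    # [0, 1, 2]
```

Because the index is named A, reset_index() restores the column under that name rather than calling it index.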
References
| Links | Site |
|---|---|
| pandas.DataFrame.reset_index | pandas.pydata.org |
| pandas.DataFrame.set_index | pandas.pydata.org |
| pandas.DataFrame.drop | pandas.pydata.org |
