An example of how to replace missing (NaN) values in a DataFrame with the values from the previous row using pandas:
Create a dataframe with NaN values
Let's first create a pandas DataFrame that contains missing values:
import pandas as pd
import numpy as np
data = np.random.randint(100, size=(10,3))
df = pd.DataFrame(data=data,columns=['A','B','C'])
df.iloc[2,0:2] = np.nan
gives
      A     B   C
0  16.0   4.0  90
1  78.0  16.0   1
2   NaN   NaN  94
3   1.0  49.0   8
4  88.0  13.0  68
5  56.0   4.0  40
6  36.0  27.0  82
7  34.0  37.0  64
8   6.0  38.0  55
9  98.0  32.0  39
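Because the data above are random, the values change on every run. Here is a minimal sketch of a reproducible variant, assuming NumPy's `default_rng` with a fixed seed (the seed 0 is an arbitrary choice for illustration, not part of the original example). It also uses `isna()` to check where the missing values are:

```python
import numpy as np
import pandas as pd

# Seeded generator so the random integers are the same on every run
# (the seed 0 is an arbitrary choice for illustration).
rng = np.random.default_rng(0)
data = rng.integers(0, 100, size=(10, 3))

# Use a float dtype up front, since NaN cannot be stored in an integer column.
df = pd.DataFrame(data=data, columns=['A', 'B', 'C'], dtype=float)

# Set columns A and B of the third row (position 2) to NaN.
df.iloc[2, 0:2] = np.nan

# Count the missing values per column.
nan_counts = df.isna().sum()
print(nan_counts)
```

Here `nan_counts` should report one missing value in each of columns A and B and none in column C.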
Replacing missing values using ffill
To fill the missing (NaN) values of a row with the values from the previous row, a solution is to use pandas.DataFrame.ffill (forward fill):
df.ffill(inplace=True)
gives
      A     B   C
0  16.0   4.0  90
1  78.0  16.0   1
2  78.0  16.0  94
3   1.0  49.0   8
4  88.0  13.0  68
5  56.0   4.0  40
6  36.0  27.0  82
7  34.0  37.0  64
8   6.0  38.0  55
9  98.0  32.0  39
Note that the missing values in row 2 have been replaced by the values from the row just above.
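To make the behaviour easy to verify, here is a small self-contained sketch that reuses the first four rows of the output above as hand-written data, so the result does not depend on random numbers:

```python
import numpy as np
import pandas as pd

# Hand-written frame based on the first four rows of the example above.
df = pd.DataFrame({
    'A': [16.0, 78.0, np.nan, 1.0],
    'B': [4.0, 16.0, np.nan, 49.0],
    'C': [90, 1, 94, 8],
})

# Forward fill: each NaN takes the last valid value above it in the same column.
df.ffill(inplace=True)

print(df)
```

Row 2 of columns A and B now holds 78.0 and 16.0, copied from row 1, while column C is unchanged because it had no missing values.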
Replacing multiple consecutive rows with missing values
Another example, this time with several consecutive rows that contain missing values:
import pandas as pd
import numpy as np
data = np.random.randint(100, size=(10,3))
df = pd.DataFrame(data=data,columns=['A','B','C'])
df.iloc[2,0:2] = np.nan
df.iloc[3,1:2] = np.nan
df.iloc[4,0:2] = np.nan
df.iloc[5,1:3] = np.nan
gives
      A     B     C
0  83.0   0.0  50.0
1  27.0  29.0  18.0
2   NaN   NaN  89.0
3  82.0   NaN  37.0
4   NaN   NaN  76.0
5  42.0   NaN   NaN
6   0.0  78.0  80.0
7  38.0  50.0  69.0
8  31.0  93.0  77.0
9  36.0  74.0  83.0
Then
df.ffill(inplace=True)
gives
      A     B     C
0  83.0   0.0  50.0
1  27.0  29.0  18.0
2  27.0  29.0  89.0
3  82.0  29.0  37.0
4  82.0  29.0  76.0
5  42.0  29.0  76.0
6   0.0  78.0  80.0
7  38.0  50.0  69.0
8  31.0  93.0  77.0
9  36.0  74.0  83.0
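In column B above, the value 29.0 is carried down through four consecutive missing rows. The sketch below, using a hypothetical single-column frame, illustrates two related points: a NaN in the very first row stays NaN because there is no previous row to copy from, and the `limit` parameter of `ffill` caps how many consecutive NaN values are filled.

```python
import numpy as np
import pandas as pd

# Hypothetical column with a leading NaN and a run of three consecutive NaNs.
df = pd.DataFrame({'B': [np.nan, 29.0, np.nan, np.nan, np.nan, 78.0]})

# Default: the whole run is filled; the leading NaN stays, as nothing precedes it.
filled_all = df.ffill()

# limit=1: only the first NaN of each consecutive run is filled.
filled_one = df.ffill(limit=1)

print(filled_all)
print(filled_one)
```

If the leading NaN values also need a value, one option is to follow up with a backward fill (`bfill`) or a constant via `fillna`, depending on what makes sense for the data.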
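Replacing missing values using DataFrame.fillna()
Note: ffill() is a synonym for DataFrame.fillna() with method='ffill'. The method argument of fillna() has been deprecated since pandas 2.1, so on recent versions the call below may emit a warning or fail, and df.ffill() is the preferred form.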
df.fillna(method='ffill')
also gives
      A     B   C
0  16.0   4.0  90
1  78.0  16.0   1
2  78.0  16.0  94
3   1.0  49.0   8
4  88.0  13.0  68
5  56.0   4.0  40
6  36.0  27.0  82
7  34.0  37.0  64
8   6.0  38.0  55
9  98.0  32.0  39
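Unlike the earlier calls with inplace=True, the fillna() call above returns a new DataFrame and leaves df itself unchanged. The same holds for `ffill()` when `inplace` is not set. A small sketch with hypothetical values shows this, along with forward-filling only one chosen column:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing row.
df = pd.DataFrame({
    'A': [1.0, np.nan, 3.0],
    'B': [4.0, np.nan, 6.0],
})

# Without inplace=True, a filled copy is returned and df keeps its NaN values.
filled = df.ffill()

# Forward-fill column A only; column B is left as it was.
partial = df.copy()
partial['A'] = partial['A'].ffill()

print(filled)
print(partial)
```

Assigning the result to a new name, as done here, keeps the original data available for comparison.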
