Examples of how to create a datetime column from year, month and day columns in pandas:
Create a dataframe
Let's first create a dataframe in pandas with 3 columns "Year", "Month" and "Day":
import pandas as pd
import random
import itertools
year_list = []
month_list = []
day_list = []
for combination in itertools.product([2015], [1, 2, 3, 4], [5, 15, 25]):
    # each combination is a (year, month, day) tuple
    year_list.append(combination[0])
    month_list.append(combination[1])
    day_list.append(combination[2])
data = {'Year': year_list,
        'Month': month_list,
        'Day': day_list,
        'Random Integer': [random.randrange(1, 10) for i in range(len(year_list))]}
df = pd.DataFrame(data=data)
print(df)
returns, for example (the last column is random, so your values will differ):
    Year  Month  Day  Random Integer
0   2015      1    5               7
1   2015      1   15               5
2   2015      1   25               6
3   2015      2    5               4
4   2015      2   15               1
5   2015      2   25               2
6   2015      3    5               4
7   2015      3   15               9
8   2015      3   25               3
9   2015      4    5               1
10  2015      4   15               3
11  2015      4   25               7
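The same dataframe can also be built without the three intermediate lists, by passing the tuples produced by itertools.product() directly to the DataFrame constructor. This is only a sketch of an alternative, assuming the same column names as above; the random.seed() call is an addition that makes the random column reproducible:

```python
import itertools
import random

import pandas as pd

random.seed(0)  # assumption: fixed seed so the example is reproducible

# Each tuple produced by itertools.product is one (year, month, day) row
rows = list(itertools.product([2015], [1, 2, 3, 4], [5, 15, 25]))
df = pd.DataFrame(rows, columns=['Year', 'Month', 'Day'])

# Add one random integer between 1 and 9 per row
df['Random Integer'] = [random.randrange(1, 10) for _ in range(len(df))]

print(df.shape)  # (12, 4)
```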
Create a datetime column
To create a new datetime column from the 'Year', 'Month' and 'Day' columns, pass those three columns to pd.to_datetime():
df['Datetime'] = pd.to_datetime( df[['Year', 'Month', 'Day']])
returns
    Year  Month  Day  Random Integer    Datetime
0   2015      1    5               7  2015-01-05
1   2015      1   15               5  2015-01-15
2   2015      1   25               6  2015-01-25
3   2015      2    5               4  2015-02-05
4   2015      2   15               1  2015-02-15
5   2015      2   25               2  2015-02-25
6   2015      3    5               4  2015-03-05
7   2015      3   15               9  2015-03-15
8   2015      3   25               3  2015-03-25
9   2015      4    5               1  2015-04-05
10  2015      4   15               3  2015-04-15
11  2015      4   25               7  2015-04-25
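If some rows do not form a valid date, to_datetime() raises an error by default. One possible way to handle this is the errors='coerce' option, which marks such rows as NaT instead. A minimal sketch, using a small made-up dataframe (the February 30 row is a deliberate, hypothetical bad value):

```python
import pandas as pd

# Hypothetical data: the second row (February 30) is not a valid date
df = pd.DataFrame({'Year': [2015, 2015, 2015],
                   'Month': [1, 2, 3],
                   'Day': [5, 30, 25]})

# errors='coerce' replaces dates that cannot be built with NaT
df['Datetime'] = pd.to_datetime(df[['Year', 'Month', 'Day']], errors='coerce')

print(df['Datetime'].isna().tolist())  # [False, True, False]
```

The NaT rows can then be inspected or dropped, for example with df.dropna(subset=['Datetime']).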
Set datetime column as the dataframe index
Now, if you want to set the datetime column as the dataframe index, a solution is to use set_index():
df.set_index('Datetime', inplace=True)
returns
            Year  Month  Day  Random Integer
Datetime
2015-01-05  2015      1    5               7
2015-01-15  2015      1   15               5
2015-01-25  2015      1   25               6
2015-02-05  2015      2    5               4
2015-02-15  2015      2   15               1
2015-02-25  2015      2   25               2
2015-03-05  2015      3    5               4
2015-03-15  2015      3   15               9
2015-03-25  2015      3   25               3
2015-04-05  2015      4    5               1
2015-04-15  2015      4   15               3
2015-04-25  2015      4   25               7
With the datetime column as the index, the dataframe can be used to plot a time series:
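A possible way to draw such a plot is sketched below, assuming matplotlib is installed. The small dataframe and its values are made up for illustration, and the 'Agg' backend is chosen only so that the code runs without a display:

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Small hypothetical dataframe with a datetime index, built as above
df = pd.DataFrame({'Year': [2015] * 4,
                   'Month': [1, 2, 3, 4],
                   'Day': [5] * 4,
                   'Random Integer': [7, 4, 4, 1]})
df['Datetime'] = pd.to_datetime(df[['Year', 'Month', 'Day']])
df.set_index('Datetime', inplace=True)

# pandas uses the datetime index for the x-axis
fig, ax = plt.subplots()
df['Random Integer'].plot(ax=ax, marker='o')
ax.set_ylabel('Random Integer')
# To view the result, call fig.savefig(...) or, with an interactive backend, plt.show()
```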
Note: to turn the datetime index back into a regular column, use reset_index():
df.reset_index(inplace=True)
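Both set_index() and reset_index() can also be used without inplace=True, in which case they return a new dataframe and leave the original untouched. A short sketch of the round trip, on a made-up two-row dataframe (the names indexed and restored are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Year': [2015, 2015], 'Month': [1, 2], 'Day': [5, 15]})
df['Datetime'] = pd.to_datetime(df[['Year', 'Month', 'Day']])

indexed = df.set_index('Datetime')  # returns a new dataframe, df is unchanged
restored = indexed.reset_index()    # 'Datetime' becomes a regular column again

print(restored.columns.tolist())  # ['Datetime', 'Year', 'Month', 'Day']
```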