How to Create and Plot Time Series in Python: A Step-by-Step Guide with the Mauna Loa CO2 Dataset

Published: January 02, 2024

Updated: January 03, 2025

Tags: Python; Matplotlib; Pandas;

Introduction

A time series is a sequence of data points recorded at successive time intervals. Analyzing time series data is crucial for uncovering patterns, trends, and anomalies that can inform decision-making across various fields, such as climate science, economics, and engineering.

In this tutorial, we’ll use Python to create, process, and visualize a time series of atmospheric CO₂ levels using the Mauna Loa dataset. This guide is designed for beginners and intermediate users interested in applying Python for time series analysis.
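As a minimal sketch of the idea, a time series can be represented in pandas as a Series whose index holds timestamps. The five values below are the first monthly averages that appear in the dataset output later in this guide; the variable names are illustrative only.

```python
import pandas as pd

# Monthly timestamps starting in March 1958 ("MS" = month start)
dates = pd.date_range(start="1958-03-01", periods=5, freq="MS")

# First five monthly CO2 averages (ppm) from the dataset output in this guide
values = [315.71, 317.45, 317.51, 317.27, 315.87]

# A time series: values indexed by time
co2_series = pd.Series(values, index=dates, name="average")

print(co2_series)
```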

Dataset Overview

The Mauna Loa CO2 dataset is a globally recognized resource for studying atmospheric carbon dioxide concentrations. Managed by NOAA, it has provided continuous monthly CO2 measurements since 1958, offering invaluable insights into long-term climate trends. The dataset includes fields such as year, month, average CO2 concentration, and deseasonalized CO2 levels.

You can download the dataset from NOAA's Global Monitoring Laboratory website (gml.noaa.gov).

Import Libraries and Load the Data

First, we import the necessary libraries and load the dataset into a pandas DataFrame.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import datetime
import warnings

warnings.filterwarnings('ignore')

# Load CO₂ data
co2_data = pd.read_csv(
    "./inputs/co2_mm_mlo.csv",  # Replace with your file path
    skiprows=40,
    na_values=["-99.99"])

print(co2_data)

Output:

     year  month  decimal date  average  deseasonalized  ndays  sdev   unc
0    1958      3     1958.2027   315.71          314.44     -1 -9.99 -0.99
1    1958      4     1958.2877   317.45          315.16     -1 -9.99 -0.99
2    1958      5     1958.3699   317.51          314.69     -1 -9.99 -0.99
3    1958      6     1958.4548   317.27          315.15     -1 -9.99 -0.99
4    1958      7     1958.5370   315.87          315.20     -1 -9.99 -0.99
..    ...    ...           ...      ...             ...    ...   ...   ...
796  2024      7     2024.5417   425.55          425.11     24  0.69  0.27
797  2024      8     2024.6250   422.99          424.83     22  1.08  0.44
798  2024      9     2024.7083   422.03          425.44     18  0.41  0.18
799  2024     10     2024.7917   422.38          425.63     22  0.35  0.14
800  2024     11     2024.8750   423.85          425.84     24  0.33  0.13

[801 rows x 8 columns]
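In the output above, the early rows carry negative placeholder values (-1 in ndays, -9.99 in sdev, -0.99 in unc), which na_values=["-99.99"] does not match. The sketch below shows one possible way to mark them as missing. It assumes that a negative value in these three columns always means "not available", which is inferred only from the rows printed here and not from the NOAA documentation. The small frame reuses three rows from the output.

```python
import pandas as pd

# Three rows copied from the output above (two early rows, one recent row)
sample = pd.DataFrame({
    "year": [1958, 1958, 2024],
    "month": [3, 4, 11],
    "average": [315.71, 317.45, 423.85],
    "ndays": [-1, -1, 24],
    "sdev": [-9.99, -9.99, 0.33],
    "unc": [-0.99, -0.99, 0.13],
})

# Assumption: a negative value in these columns is a "missing" placeholder
placeholder_cols = ["ndays", "sdev", "unc"]
for col in placeholder_cols:
    # Keep non-negative values, replace the rest with NaN
    sample[col] = sample[col].where(sample[col] >= 0)

print(sample)
print(sample[placeholder_cols].isna().sum())
```

If placeholders are converted this way, a plain dropna() would discard the early rows entirely, so it may be preferable to drop missing values only in the plotted column, for example with dropna(subset=["average"]).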

Process and Format the Data

To analyze the dataset effectively, we clean it by removing missing values and adding a datetime column for easier indexing and visualization.

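# Drop rows with missing values; .copy() avoids a possible SettingWithCopyWarning below
co2_data = co2_data.dropna().copy()

# Build a datetime column from year and month, using the first day of each month
co2_data['date'] = pd.to_datetime(co2_data[['year', 'month']].assign(day=1))

print(co2_data.dtypes)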

Output:

year                       int64
month                      int64
decimal date             float64
average                  float64
deseasonalized           float64
ndays                      int64
sdev                     float64
unc                      float64
date              datetime64[ns]
dtype: object
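With a datetime column available, one option is to use it as the DataFrame index through DataFrame.set_index, which allows rows to be selected by date strings. The sketch below uses a small hand-built frame (values copied from the printed output) instead of the full file; df and monthly_co2 are illustrative names.

```python
import pandas as pd

# Small stand-in for co2_data, with values copied from the printed output
df = pd.DataFrame({
    "year": [1958, 1958, 1958, 2024, 2024],
    "month": [3, 4, 5, 10, 11],
    "average": [315.71, 317.45, 317.51, 422.38, 423.85],
})
df["date"] = pd.to_datetime(df[["year", "month"]].assign(day=1))

# Use the datetime column as the index
monthly_co2 = df.set_index("date")

# Partial-string indexing: every row that falls in 1958
rows_1958 = monthly_co2.loc["1958"]
print(rows_1958)

# Label-based slice between two dates (both endpoints are included)
rows_between = monthly_co2.loc["1958-04-01":"2024-10-01"]
print(rows_between)
```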

Visualize the Time Series

Extracting the date column values:

co2_data['date']

Output:

0     1958-03-01
1     1958-04-01
2     1958-05-01
3     1958-06-01
4     1958-07-01
         ...    
796   2024-07-01
797   2024-08-01
798   2024-09-01
799   2024-10-01
800   2024-11-01
Name: date, Length: 801, dtype: datetime64[ns]

Extracting the CO2 concentration column values:

co2_data['average']

Output:

0      315.71
1      317.45
2      317.51
3      317.27
4      315.87
        ...  
796    425.55
797    422.99
798    422.03
799    422.38
800    423.85
Name: average, Length: 801, dtype: float64

Initial Visualization

The first step in time series analysis is visualization. Here, we plot the CO₂ levels over time.

x_values = co2_data['date']
y_values = co2_data['average']

plt.plot(x_values, y_values)

plt.title("Mauna Loa CO2", fontsize=11)

plt.savefig('./outputs/time_series_01.png', dpi=100, bbox_inches='tight')

plt.show()

Output: a line plot of the monthly CO₂ averages against date (saved above as time_series_01.png).

Zooming into a Specific Period

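We can focus on specific time periods by filtering the dataset. Here, we examine CO₂ levels between January 1980 and October 1984. Both bounds are exclusive, so the selection runs from February 1980 to September 1984.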

start_date = datetime.datetime(1980, 1, 1)
end_date = datetime.datetime(1984, 10, 1)

# Boolean mask selecting rows strictly between the two dates
in_period = (co2_data['date'] > start_date) & (co2_data['date'] < end_date)

x_sub_values = co2_data.loc[in_period, 'date']
y_sub_values = co2_data.loc[in_period, 'average']
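An alternative sketch of the same filter replaces the two comparisons with Series.between. The inclusive="neither" argument (available in pandas 1.3 and later) reproduces the strict inequalities used above. The frame below is only a stand-in for co2_data, with placeholder values instead of real CO2 measurements.

```python
import datetime
import pandas as pd

# Stand-in for co2_data: monthly dates with placeholder values (not real CO2 data)
dates = pd.date_range("1979-01-01", "1985-12-01", freq="MS")
demo = pd.DataFrame({"date": dates, "average": list(range(len(dates)))})

start_date = datetime.datetime(1980, 1, 1)
end_date = datetime.datetime(1984, 10, 1)

# "neither" excludes both endpoints, like the > and < comparisons
in_period = demo["date"].between(start_date, end_date, inclusive="neither")

x_sub_values = demo.loc[in_period, "date"]
y_sub_values = demo.loc[in_period, "average"]

print(x_sub_values.min(), x_sub_values.max())
```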

We then plot the filtered values:

plt.plot(x_sub_values, y_sub_values)

plt.title("Mauna Loa CO2", fontsize=11)

plt.savefig('./outputs/time_series_02.png', dpi=100, bbox_inches='tight')

plt.show()

Customizing the Time Series Plot

Customizing plots enhances their interpretability. Below are some common customizations.

Increasing Figure Size

A larger figure makes trends and fluctuations in the time series easier to see, which helps in presentations and reports. The figsize argument of plt.subplots sets the figure width and height in inches:

fig, ax = plt.subplots(figsize=(16,4))

plt.plot(x_values, y_values)

plt.title("Mauna Loa CO2", fontsize=11)

plt.savefig('./outputs/time_series_03.png', dpi=100, bbox_inches='tight')

plt.show()

Figure Size: Set to (16, 4) for a wide, panoramic view of the time series, making it easier to observe long-term trends.
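Because figsize is expressed in inches, the pixel dimensions of the figure follow from multiplying by the resolution in dots per inch. The sketch below works this out for the settings used in this guide (figsize=(16, 4) and dpi=100); the Agg backend is selected only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(16, 4), dpi=100)

# figsize is in inches; pixel size = inches * dpi
width_in, height_in = fig.get_size_inches()
width_px = width_in * fig.dpi
height_px = height_in * fig.dpi

print(width_px, height_px)

plt.close(fig)
```

Note that saving with bbox_inches='tight' crops or extends the canvas around the drawn elements, so the saved file can differ somewhat from these nominal dimensions.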

Adding Axis Titles

Axis titles are essential for clearly conveying the meaning of the data plotted on each axis. This helps viewers understand the context of the visualization without additional explanation.

fig, ax = plt.subplots(figsize=(16,4))

plt.plot(x_values, y_values)

plt.title("Mauna Loa CO2", fontsize=12)

# Label each axis with the quantity it shows
plt.xlabel('Date', fontsize=11)
plt.ylabel('CO2 concentration (ppm)', fontsize=11)

plt.savefig('./outputs/time_series_04.png', dpi=100, bbox_inches='tight')

plt.show()

The labelpad parameter is a useful feature in Matplotlib that adds extra space between the axis labels and the axis itself. This improves readability and prevents the labels from appearing too close to the plot area, especially in complex or dense visualizations.

fig, ax = plt.subplots(figsize=(16,4))

plt.plot(x_values, y_values)

plt.title("Mauna Loa CO2", fontsize=12)

# labelpad adds space (in points) between each axis and its label
plt.xlabel('Date', fontsize=11, labelpad=20)
plt.ylabel('CO2 concentration (ppm)', fontsize=11, labelpad=20)

plt.savefig('./outputs/time_series_05.png', dpi=100, bbox_inches='tight')

plt.show()
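The same padding can also be set through the Axes object returned by plt.subplots instead of the pyplot functions. The following is a minimal sketch of that variant; the label texts are illustrative and no data is plotted.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(16, 4))

# labelpad is given in points
ax.set_xlabel("Date", fontsize=11, labelpad=20)
ax.set_ylabel("CO2 concentration (ppm)", fontsize=11, labelpad=20)

# The padding is stored on each axis object
print(ax.xaxis.labelpad, ax.yaxis.labelpad)

plt.close(fig)
```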

Adding Custom Tick Marks

Customizing tick marks makes plots more readable and informative, especially with time series data. In this example, we place the x-axis ticks at January of every tenth year (1960, 1970, and so on), which keeps the axis uncluttered.

fig, ax = plt.subplots(figsize=(16,4))

# Tick positions: January of every year divisible by 10
x_tick_positions = [d for d in co2_data['date'] if d.year % 10 == 0 and d.month == 1]

# Use the timestamps themselves as labels for now
x_tick_labels = x_tick_positions

plt.plot(x_values, y_values)

plt.title("Mauna Loa CO2", fontsize=12)

plt.xlabel('Date', fontsize=11, labelpad=20)
plt.ylabel('CO2 concentration (ppm)', fontsize=11, labelpad=20)

plt.xticks(x_tick_positions, x_tick_labels, rotation=90, fontsize=11)

plt.savefig('./outputs/time_series_07.png', dpi=100, bbox_inches='tight')

plt.show()

Formatting the tick labels with strftime (see the Python datetime reference):

# Tick positions: January of every year divisible by 10
x_tick_positions = [d for d in co2_data['date'] if d.year % 10 == 0 and d.month == 1]

# Format each tick label as YYYY-MM-DD
x_tick_labels = [d.strftime("%Y-%m-%d") for d in x_tick_positions]

fig, ax = plt.subplots(figsize=(16,4))

plt.plot(x_values, y_values)

plt.title("Mauna Loa CO2", fontsize=12)

plt.xlabel('Date', fontsize=11, labelpad=20)
plt.ylabel('CO2 concentration (ppm)', fontsize=11, labelpad=20)

plt.xticks(x_tick_positions, x_tick_labels, rotation=90, fontsize=11)

plt.savefig('./outputs/time_series_08.png', dpi=100, bbox_inches='tight')

plt.show()
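As an alternative to building tick lists by hand, Matplotlib's matplotlib.dates module provides locators and formatters for date axes. The sketch below is one possible equivalent of the example above, using YearLocator and DateFormatter. It plots placeholder values over the same date range as the dataset, not the real CO2 measurements.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in series: monthly dates with placeholder values (not real CO2 data)
dates = pd.date_range("1958-03-01", "2024-11-01", freq="MS")
values = list(range(len(dates)))

fig, ax = plt.subplots(figsize=(16, 4))
ax.plot(dates, values)

# One major tick every tenth year, labelled as YYYY-MM-DD
ax.xaxis.set_major_locator(mdates.YearLocator(base=10))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
ax.tick_params(axis="x", labelrotation=90, labelsize=11)

# Draw once so that the tick labels are generated
fig.canvas.draw()
tick_labels = [label.get_text() for label in ax.get_xticklabels()]
print(tick_labels)

plt.close(fig)
```

With a locator, the ticks are recomputed from the axis limits, so they stay sensible if the plotted period changes.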

Customizing Matplotlib Axis Colors

Changing the color of axis elements can draw attention to part of a plot. Matplotlib lets you color the axis spines, tick marks, and labels. In the example below, the color argument of plt.xticks turns the x-axis tick labels red.

fig, ax = plt.subplots(figsize=(16,4))

plt.plot(x_values, y_values)

plt.title("Mauna Loa CO2", fontsize=12)

plt.xlabel('Date', fontsize=11, labelpad=20)
plt.ylabel('CO2 concentration (ppm)', fontsize=11, labelpad=20)

plt.xticks(x_tick_positions, x_tick_labels, rotation=90, fontsize=11, color='red')

plt.savefig('./outputs/time_series_09.png', dpi=100, bbox_inches='tight')

plt.show()
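The example above colors only the tick labels. As a rough sketch of how the other x-axis elements could be colored as well, the snippet below sets the bottom spine, the tick marks, and the axis title through the Axes object. The plotted points are placeholders, not CO2 data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(16, 4))
ax.plot([1960, 1990, 2020], [1, 2, 3])  # placeholder data

# Color the bottom spine (the x-axis line itself)
ax.spines["bottom"].set_color("red")

# Color the x tick marks and their labels
ax.tick_params(axis="x", colors="red")

# Color the x-axis title
ax.set_xlabel("Date", fontsize=11, labelpad=20)
ax.xaxis.label.set_color("red")

plt.close(fig)
```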

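Increasing Padding Between Tick Labels and the Axis

The pad argument of ax.tick_params sets the distance, in points, between the tick marks and their labels.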

fig, ax = plt.subplots(figsize=(16,4))

plt.plot(x_values, y_values)

plt.title("Mauna Loa CO2", fontsize=12)

plt.xlabel('Date', fontsize=11, labelpad=20)
plt.ylabel('CO2 concentration (ppm)', fontsize=11, labelpad=20)

plt.xticks(x_tick_positions, x_tick_labels, rotation=90, fontsize=11, color='red')

ax.tick_params(axis='x', which='major', pad=15)

plt.savefig('./outputs/time_series_10.png', dpi=100, bbox_inches='tight')

plt.show()

Using Seaborn to Enhance the Plot

Seaborn is a Python visualization library built on top of Matplotlib. It provides a high-level interface for statistical graphics and ships with themes and color palettes that change the default look of Matplotlib figures.

In the example below, calling sns.set() applies Seaborn's default theme, so the existing Matplotlib plotting code picks up the new style without further changes. Seaborn's own plotting functions also accept pandas DataFrames directly.

import seaborn as sns

# Apply seaborn's default theme to all subsequent Matplotlib figures
sns.set()

fig, ax = plt.subplots(figsize=(16,4))

plt.plot(x_values, y_values)

plt.title("Mauna Loa CO2", fontsize=12)

plt.xlabel('Date', fontsize=11, labelpad=20)
plt.ylabel('CO2 concentration (ppm)', fontsize=11, labelpad=20)

plt.savefig('./outputs/time_series_11.png', dpi=100, bbox_inches='tight')

plt.show()

Advanced Visualization: Dual Axes Plot

Visualizing Data with Two Y-Axes

In this plot, we combine two data sets (atmospheric CO₂ concentrations and global temperature anomalies) on one figure using dual y-axes. This technique lets us show two series with different units or scales side by side. We use Matplotlib's Axes.twinx() method, which creates a second y-axis that shares the same x-axis.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load CO₂ data
co2_data = pd.read_csv(
    "./inputs/co2_mm_mlo.csv",  # Replace with your file path
    skiprows=40,
    na_values=["-99.99"])

co2_data = co2_data.dropna()
co2_data['date'] = pd.to_datetime(co2_data[['year', 'month']].assign(day=1))

# Load temperature anomaly data
temp_data = pd.read_csv(
    "./inputs/GLB.Ts+dSST.csv",  # Replace with your file path
    skiprows=1
)

temp_data = temp_data.rename(columns={"Year": "year", "J-D": "anomaly"})
temp_data = temp_data[['year', 'anomaly']]
temp_data['date'] = pd.to_datetime(temp_data['year'].astype(str) + '-07-01')  # Mid-year for annual data

start_date = max(co2_data['date'].min(), temp_data['date'].min())
end_date = min(co2_data['date'].max(), temp_data['date'].max())
co2_filtered = co2_data[(co2_data['date'] >= start_date) & (co2_data['date'] <= end_date)]
# .copy() so the 'anomaly' column can be modified below without a SettingWithCopyWarning
temp_filtered = temp_data[(temp_data['date'] >= start_date) & (temp_data['date'] <= end_date)].copy()

# Ensure 'anomaly' column is numeric
temp_filtered['anomaly'] = pd.to_numeric(temp_filtered['anomaly'], errors='coerce')

# Drop rows with NaN values in 'anomaly' after conversion
temp_filtered = temp_filtered.dropna(subset=['anomaly'])

# Separate data into positive and negative anomalies
positive_anomalies = temp_filtered[temp_filtered['anomaly'] > 0]
negative_anomalies = temp_filtered[temp_filtered['anomaly'] <= 0]

# Plotting
fig, ax1 = plt.subplots(figsize=(10, 6))

# Plot CO₂ data against the decimal date so each monthly value gets its own x position
ax1.plot(co2_filtered['decimal date'], co2_filtered['average'], 'g-', label='CO2 (ppm)')
ax1.set_xlabel('Year')
ax1.set_ylabel('CO2 Concentration (ppm)', color='g')
ax1.tick_params(axis='y', colors='g')
ax1.legend(loc='upper left')

# Add a second axis for temperature anomalies
ax2 = ax1.twinx()

# Plot bars with different colors
ax2.bar(positive_anomalies['year'], positive_anomalies['anomaly'], color='r', alpha=0.7, label='Positive Anomaly')
ax2.bar(negative_anomalies['year'], negative_anomalies['anomaly'], color='b', alpha=0.7, label='Negative Anomaly')

# Set secondary y-axis label and formatting
ax2.set_ylabel('Temperature Anomaly (°C)', color='k')
ax2.tick_params(axis='y', colors='k')
ax2.legend(loc='upper right')

# Add title and grid
plt.title('Atmospheric CO2 and Global Temperature Anomalies')
plt.grid()
plt.tight_layout()

plt.savefig('./outputs/time_series_12.png', dpi=100, bbox_inches='tight')

plt.show()

The twinx() function allows the creation of two y-axes: one for CO2 concentrations (ppm) and another for temperature anomalies (°C). This is particularly useful when comparing two data sets with different units and scales.
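Reduced to its essentials, the twinx() pattern looks like the sketch below. The numbers are placeholders chosen only to show two different scales, not real measurements, and merging both legends into one box is an optional design choice rather than part of the example above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Placeholder values chosen only to illustrate two different scales
years = [2000, 2005, 2010, 2015, 2020]
series_a = [370, 380, 390, 400, 414]   # large-valued series (left axis)
series_b = [0.4, 0.7, 0.7, 0.9, 1.0]   # small-valued series (right axis)

fig, ax1 = plt.subplots(figsize=(10, 6))
ax1.plot(years, series_a, "g-", label="Series A")
ax1.set_ylabel("Series A", color="g")

# Second y-axis sharing the same x-axis
ax2 = ax1.twinx()
ax2.bar(years, series_b, width=3, color="r", alpha=0.7, label="Series B")
ax2.set_ylabel("Series B", color="r")

# Merge the legend entries of both axes into a single legend
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
ax2.legend(handles1 + handles2, labels1 + labels2, loc="upper left")

plt.close(fig)
```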

Visualizing Data with Two X-Axes

A dual x-axis plot is useful when comparing two datasets with different time intervals. For example, you might have observations from different sensors that record data at different frequencies. This visualization technique allows you to plot both datasets on the same graph with separate x-axes, providing a clear comparison.

fig, ax = plt.subplots(figsize=(16,4))

#-------------------------------------------------------------#

# Bottom axis ticks: January of every year divisible by 10
x_tick_positions = [d for d in co2_data['date'] if d.year % 10 == 0 and d.month == 1]
x_tick_labels = [d.strftime("%Y-%m-%d") for d in x_tick_positions]

plt.plot(x_values, y_values)

plt.xlabel('Obs 1', fontsize=11, labelpad=20)
plt.ylabel('Values', fontsize=11, labelpad=20)

plt.xticks(x_tick_positions, x_tick_labels, rotation=90, fontsize=11)

#-------------------------------------------------------------#

# Top axis ticks: January of every year divisible by 5
x_tick_positions = [d for d in co2_data['date'] if d.year % 5 == 0 and d.month == 1]
x_tick_labels = [d.strftime("%Y-%m-%d") for d in x_tick_positions]

def identity_func(x):
    return x

ax_secondary = ax.secondary_xaxis('top', functions=(identity_func, identity_func))

ax_secondary.tick_params(axis='x', colors='black', labelsize=11) 
ax_secondary.set_xlabel('Obs 2', color='black', fontsize=11, labelpad=20)

ax_secondary.set_xticks(x_tick_positions)
ax_secondary.set_xticklabels(x_tick_labels, rotation=90, fontsize=11)

#-------------------------------------------------------------#

plt.savefig('./outputs/time_series_13.png', dpi=100, bbox_inches='tight')

plt.show()
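In the code above, both conversion functions are the identity, so the top axis shows the same values as the bottom one with a different tick spacing. The functions argument of secondary_xaxis can also convert between units. The sketch below is a hypothetical variant that plots against the decimal date and shows "years since 1958" on the top axis; the helper names and START_YEAR are illustrative, and the four data points are copied from the printed dataset output.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

START_YEAR = 1958  # first year of the record shown in this guide

def year_to_elapsed(year):
    """Convert a calendar year to years elapsed since START_YEAR."""
    return year - START_YEAR

def elapsed_to_year(elapsed):
    """Inverse conversion: years elapsed back to a calendar year."""
    return elapsed + START_YEAR

# Decimal dates and averages copied from the printed dataset output
decimal_dates = [1958.2027, 1958.2877, 2024.7917, 2024.8750]
averages = [315.71, 317.45, 422.38, 423.85]

fig, ax = plt.subplots(figsize=(16, 4))
ax.plot(decimal_dates, averages)
ax.set_xlabel("Year")

# Top axis shows the same positions in different units
ax_top = ax.secondary_xaxis("top", functions=(year_to_elapsed, elapsed_to_year))
ax_top.set_xlabel("Years since 1958")

plt.close(fig)
```

The two functions must be inverses of each other so that Matplotlib can map positions in both directions.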

References

Links                        Site
pandas.DataFrame.set_index   pandas.pydata.org
pandas.DataFrame.index       pandas.pydata.org
Python datetime              www.w3schools.com
CO2 NOAA Dataset             gml.noaa.gov