How to Download Integrated Water Vapor from ERA5 with Python

Overview

Integrated water vapor, also called total column water vapor or precipitable water vapor, is the vertically integrated amount of water vapor in the atmospheric column. It is an important variable for atmospheric correction, cloud studies, weather analysis, and satellite remote-sensing applications.

In this tutorial, we show how to download ERA5 total column water vapor using Python, read the data with xarray, and create publication-style maps over the continental United States.

ERA5 is ECMWF’s fifth-generation atmospheric reanalysis. The Copernicus Climate Data Store provides ERA5 hourly data on single levels from 1940 onward. The variable used here is total_column_water_vapour, available in units of kg m-2. (Climate Data Store)

A useful interpretation is:

1 kg m-2 ≈ 1 mm of precipitable water

So if ERA5 reports:

tcwv = 25 kg m-2

that corresponds approximately to:

25 mm = 2.5 cm of precipitable water
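
The equivalence follows from dividing the column mass by the density of liquid water. The short derivation below is a sketch that assumes a liquid-water density of 1000 kg m-3 (the symbol \rho_w is introduced here for illustration and does not appear in the ERA5 documentation):

```latex
\mathrm{PW} = \frac{\mathrm{TCWV}}{\rho_w}
            = \frac{25\ \mathrm{kg\,m^{-2}}}{1000\ \mathrm{kg\,m^{-3}}}
            = 0.025\ \mathrm{m}
            = 25\ \mathrm{mm}
            = 2.5\ \mathrm{cm}
```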

What is total column water vapor?

Total column water vapor, often abbreviated as TCWV, represents the amount of water vapor contained in a vertical atmospheric column from the surface to the top of the atmosphere. It is also commonly referred to as:

Name   Meaning
TCWV   Total column water vapor
IWV    Integrated water vapor
PWV    Precipitable water vapor
PW     Precipitable water

ERA5 provides this quantity as a gridded variable named:

total_column_water_vapour

In the downloaded NetCDF file, the short variable name is usually:

tcwv

The units are:

kg m-2

The Copernicus documentation lists total column water vapour as the total amount of water vapour in a column extending from the surface to the top of the atmosphere, with units of kg m-2. (Climate Data Store)

Why use ERA5?

ERA5 is useful because it provides globally complete, hourly atmospheric fields. For satellite remote-sensing applications, TCWV can be used as a first-order indicator of atmospheric absorption, especially in infrared spectral regions affected by water vapor.

For example, in fire remote sensing, water vapor affects the atmospheric transmittance near the 3.7–4.0 µm fire-sensitive window. TCWV alone is not a full radiative-transfer solution, but it is a useful first variable to explore when building an atmospheric correction workflow.

Requirements

Install the required Python packages (scipy is included because xarray's interp, used later for point extraction, depends on it):

pip install cdsapi xarray netCDF4 scipy matplotlib cartopy

Or with conda:

conda install -c conda-forge cdsapi xarray netcdf4 scipy matplotlib cartopy

You also need a Copernicus Climate Data Store account.


After creating your account, configure your CDS API credentials (see the CDS API setup page) in:

~/.cdsapirc

The file should look similar to:

url: https://cds.climate.copernicus.eu/api
key: YOUR_PERSONAL_ACCESS_TOKEN

You must also accept the license terms for the ERA5 single-level dataset in the CDS web interface. Otherwise, Python requests may fail with an error such as:

required licences not accepted
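
Before sending a request, it can help to confirm that the credentials file exists and contains both entries. The sketch below is only a format check under the assumption that the file uses the two-line "name: value" layout shown above. The helper names read_cdsapirc and check_cdsapirc are hypothetical, this is not cdsapi's own parser, and it cannot tell whether the token is valid or whether the licence has been accepted.

```python
from pathlib import Path


def read_cdsapirc(text):
    """Parse 'name: value' lines into a dict (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so the URL's own colons survive.
        name, value = line.strip().split(":", 1)
        settings[name.strip()] = value.strip()
    return settings


def check_cdsapirc(path=Path.home() / ".cdsapirc"):
    """Return a list of problems; an empty list means the file looks complete."""
    path = Path(path)
    if not path.is_file():
        return [f"{path} does not exist"]
    settings = read_cdsapirc(path.read_text())
    return [
        f"missing '{name}' entry"
        for name in ("url", "key")
        if not settings.get(name)
    ]


print(check_cdsapirc())
```

An empty list printed here only means the file is present and has a url and a key entry.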


Download ERA5 total column water vapor for one full day

The following example downloads hourly total column water vapor over an approximate CONUS domain for August 8, 2019.

import cdsapi

dataset = "reanalysis-era5-single-levels"

request = {
    "product_type": ["reanalysis"],
    "variable": ["total_column_water_vapour"],
    "year": ["2019"],
    "month": ["08"],
    "day": ["08"],
    "time": [
        "00:00", "01:00", "02:00", "03:00",
        "04:00", "05:00", "06:00", "07:00",
        "08:00", "09:00", "10:00", "11:00",
        "12:00", "13:00", "14:00", "15:00",
        "16:00", "17:00", "18:00", "19:00",
        "20:00", "21:00", "22:00", "23:00",
    ],
    "data_format": "netcdf",
    "download_format": "unarchived",

    # Bounding box: North, West, South, East
    "area": [50.0, -130.0, 24.0, -65.0],
}

client = cdsapi.Client()
client.retrieve(dataset, request).download("era5_tcwv_conus_20190808.nc")

The area field uses the order:

[North, West, South, East]

For this example:

[50.0, -130.0, 24.0, -65.0]

means:

Boundary   Value
North      50.0°N
West       130.0°W
South      24.0°N
East       65.0°W

This covers the continental United States with some buffer.
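
Because the [North, West, South, East] order is easy to get wrong, the request can be built by a small function that checks the box and generates the 24 hourly time strings. This is a minimal sketch: build_tcwv_request is a hypothetical helper, and the west < east check assumes the domain does not cross the antimeridian.

```python
from datetime import date


def build_tcwv_request(day, north, west, south, east):
    """Build a one-day hourly TCWV request dict (hypothetical helper)."""
    if not (-90.0 <= south < north <= 90.0):
        raise ValueError("expected -90 <= south < north <= 90")
    # Assumption: the box does not cross the antimeridian.
    if not west < east:
        raise ValueError("expected west < east")
    return {
        "product_type": ["reanalysis"],
        "variable": ["total_column_water_vapour"],
        "year": [f"{day.year:04d}"],
        "month": [f"{day.month:02d}"],
        "day": [f"{day.day:02d}"],
        # "00:00", "01:00", ..., "23:00"
        "time": [f"{hour:02d}:00" for hour in range(24)],
        "data_format": "netcdf",
        "download_format": "unarchived",
        # CDS order: North, West, South, East
        "area": [north, west, south, east],
    }


request = build_tcwv_request(date(2019, 8, 8), 50.0, -130.0, 24.0, -65.0)
print(request["area"], len(request["time"]))
```

The returned dictionary has the same keys as the request shown earlier, so it can be passed to client.retrieve in the same way.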

Open the ERA5 NetCDF file with xarray

import xarray as xr

ds = xr.open_dataset("era5_tcwv_conus_20190808.nc")

print(ds)


The TCWV variable is usually named:

tcwv = ds["tcwv"]

Inspect the variable:

tcwv = ds["tcwv"]

print(tcwv)
print(tcwv.attrs)
print(tcwv.coords)


Depending on the CDS output, the time coordinate may be named either valid_time or time. The following code detects it automatically:

time_name = "valid_time" if "valid_time" in tcwv.coords else "time"

print("Time coordinate:", time_name)
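
The same idea can be extended to the spatial coordinates, in case a file uses lat/lon instead of latitude/longitude. The sketch below uses a hypothetical helper, first_present, and a stand-in list of names where list(tcwv.coords) would normally go. Unlike the one-liner above, it raises an error instead of silently falling back when no candidate is found.

```python
def first_present(candidates, available):
    """Return the first candidate name found in available (hypothetical helper)."""
    available = set(available)
    for name in candidates:
        if name in available:
            return name
    raise KeyError(f"none of {list(candidates)} found in {sorted(available)}")


# Stand-in for list(tcwv.coords); replace with the real coordinate names.
demo_coords = ["valid_time", "latitude", "longitude"]

time_key = first_present(["valid_time", "time"], demo_coords)
lat_key = first_present(["latitude", "lat"], demo_coords)
lon_key = first_present(["longitude", "lon"], demo_coords)

print(time_key, lat_key, lon_key)
```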

Plot one hourly TCWV map

import matplotlib.pyplot as plt

tcwv0 = tcwv.isel({time_name: 0})

plt.figure(figsize=(10, 6))

tcwv0.plot(
    cmap="viridis",
    cbar_kwargs={"label": "TCWV (kg m$^{-2}$ / mm)"}
)

plt.title(f"ERA5 Total Column Water Vapour over CONUS\n{tcwv0[time_name].values}")
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.show()

This gives a quick map of total column water vapor for the first downloaded hour.


Publication-style map with coastlines, borders, and states

For cleaner maps, use cartopy.

import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature

tcwv0 = tcwv.isel({time_name: 0})

fig = plt.figure(figsize=(11, 6.5))

ax = fig.add_axes(
    [0.07, 0.12, 0.78, 0.76],
    projection=ccrs.PlateCarree()
)

cax = fig.add_axes([0.88, 0.18, 0.025, 0.62])

p = tcwv0.plot(
    ax=ax,
    transform=ccrs.PlateCarree(),
    cmap="viridis",
    add_colorbar=False
)

cbar = fig.colorbar(p, cax=cax)
cbar.set_label("Total column water vapour (kg m$^{-2}$ / mm)")

ax.add_feature(cfeature.COASTLINE, linewidth=0.8)
ax.add_feature(cfeature.BORDERS, linewidth=0.8)
ax.add_feature(cfeature.STATES, linewidth=0.35, edgecolor="white", alpha=0.8)

ax.set_extent([-130, -65, 24, 50], crs=ccrs.PlateCarree())

gl = ax.gridlines(
    draw_labels=True,
    linewidth=0.3,
    alpha=0.4,
    linestyle="--"
)

gl.top_labels = False
gl.right_labels = False

ax.set_title(
    f"ERA5 Total Column Water Vapour over CONUS\n{tcwv0[time_name].values}",
    fontsize=13
)

plt.show()


The colorbar position is controlled manually with:

cax = fig.add_axes([0.88, 0.18, 0.025, 0.62])

The values mean:

[left, bottom, width, height]

So to make the colorbar shorter, reduce the height:

cax = fig.add_axes([0.88, 0.22, 0.025, 0.50])
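
Instead of tuning the four numbers by hand, the colorbar rectangle can be derived from the map rectangle. The following is a sketch with a hypothetical helper, colorbar_rect. Its pad, width, and shrink defaults are illustrative choices, and it assumes the map fills the rectangle passed to fig.add_axes (a cartopy axes with a fixed aspect ratio may occupy less of it).

```python
def colorbar_rect(map_rect, pad=0.03, width=0.025, shrink=0.8):
    """Return [left, bottom, width, height] for a colorbar right of map_rect.

    All values are figure fractions, as used by fig.add_axes.
    The colorbar is centered vertically on the map rectangle.
    """
    left, bottom, map_width, map_height = map_rect
    height = map_height * shrink
    cbar_left = left + map_width + pad
    cbar_bottom = bottom + (map_height - height) / 2.0
    return [cbar_left, cbar_bottom, width, height]


rect = colorbar_rect([0.07, 0.12, 0.78, 0.76])
print([round(value, 3) for value in rect])
```

For the map rectangle used in this tutorial, the result is close to the manually chosen values. Lowering shrink shortens the colorbar while keeping it centered.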

Plot the daily mean TCWV

Since the file contains 24 hourly fields covering one UTC day, we can calculate the daily mean:

tcwv_daily_mean = tcwv.mean(dim=time_name)

plt.figure(figsize=(10, 6))

tcwv_daily_mean.plot(
    cmap="viridis",
    cbar_kwargs={"label": "Daily mean TCWV (kg m$^{-2}$ / mm)"}
)

plt.title("ERA5 Daily Mean Total Column Water Vapour over CONUS\n2019-08-08")
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.show()


A publication-style version:

fig = plt.figure(figsize=(11, 6.5))

ax = fig.add_axes(
    [0.07, 0.12, 0.78, 0.76],
    projection=ccrs.PlateCarree()
)

cax = fig.add_axes([0.88, 0.18, 0.025, 0.62])

p = tcwv_daily_mean.plot(
    ax=ax,
    transform=ccrs.PlateCarree(),
    cmap="viridis",
    add_colorbar=False
)

cbar = fig.colorbar(p, cax=cax)
cbar.set_label("Daily mean TCWV (kg m$^{-2}$ / mm)")

ax.add_feature(cfeature.COASTLINE, linewidth=0.8)
ax.add_feature(cfeature.BORDERS, linewidth=0.8)
ax.add_feature(cfeature.STATES, linewidth=0.35, edgecolor="white", alpha=0.8)

ax.set_extent([-130, -65, 24, 50], crs=ccrs.PlateCarree())

gl = ax.gridlines(
    draw_labels=True,
    linewidth=0.3,
    alpha=0.4,
    linestyle="--"
)

gl.top_labels = False
gl.right_labels = False

ax.set_title(
    "ERA5 Daily Mean Total Column Water Vapour over CONUS\n2019-08-08",
    fontsize=13
)

plt.show()


Convert TCWV to precipitable water in centimeters

ERA5 TCWV is in kg m-2, which is numerically equivalent to millimeters of precipitable water.

Therefore:

tcwv_mm = tcwv
tcwv_cm = tcwv / 10.0

You can add metadata:

tcwv_cm = tcwv / 10.0
tcwv_cm.attrs["long_name"] = "Total column water vapour"
tcwv_cm.attrs["units"] = "cm"

Then plot:

tcwv0_cm = tcwv_cm.isel({time_name: 0})

plt.figure(figsize=(10, 6))

tcwv0_cm.plot(
    cmap="viridis",
    cbar_kwargs={"label": "TCWV (cm)"}
)

plt.title(f"ERA5 Total Column Water Vapour over CONUS\n{tcwv0_cm[time_name].values}")
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.show()

Plot a domain-mean time series

To see the hourly evolution over the domain:

tcwv_mean = tcwv.mean(dim=["latitude", "longitude"])

plt.figure(figsize=(8, 4))

tcwv_mean.plot(marker="o")

plt.title("Domain-Mean ERA5 Total Column Water Vapour")
plt.ylabel("TCWV (kg m$^{-2}$ / mm)")
plt.xlabel("Time")
plt.grid(True, alpha=0.3)
plt.show()
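
The plain mean above gives every grid cell the same weight, although cells on a regular latitude-longitude grid cover less area toward the poles. A common refinement is to weight by the cosine of latitude. The sketch below uses xarray's weighted API on a small synthetic array; area_weighted_mean is a hypothetical helper, and the demo values are made up purely to show the difference.

```python
import numpy as np
import xarray as xr


def area_weighted_mean(field, lat_name="latitude", lon_name="longitude"):
    """Mean over latitude and longitude, weighted by cos(latitude)."""
    weights = np.cos(np.deg2rad(field[lat_name]))
    weights.name = "weights"
    return field.weighted(weights).mean(dim=[lat_name, lon_name])


# Synthetic 2 x 2 field: one row at the equator, one at 60°N.
demo = xr.DataArray(
    [[1.0, 1.0], [4.0, 4.0]],
    dims=("latitude", "longitude"),
    coords={"latitude": [0.0, 60.0], "longitude": [-120.0, -119.75]},
)

print(float(demo.mean()), float(area_weighted_mean(demo)))
```

Applied to the ERA5 array, area_weighted_mean(tcwv) keeps the time dimension, so it can be plotted as a time series in the same way as the unweighted mean.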


You can also plot minimum, mean, and maximum:

tcwv_min = tcwv.min(dim=["latitude", "longitude"])
tcwv_mean = tcwv.mean(dim=["latitude", "longitude"])
tcwv_max = tcwv.max(dim=["latitude", "longitude"])

plt.figure(figsize=(8, 4))

plt.plot(tcwv[time_name], tcwv_min, marker="o", label="min")
plt.plot(tcwv[time_name], tcwv_mean, marker="o", label="mean")
plt.plot(tcwv[time_name], tcwv_max, marker="o", label="max")

plt.title("ERA5 TCWV Domain Statistics")
plt.ylabel("TCWV (kg m$^{-2}$ / mm)")
plt.xlabel("Time")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()


Interpolate TCWV to a specific latitude and longitude

For satellite applications, we often need the TCWV value at a fire detection, cloud pixel, or validation point.

Example point:

lat = 45.8326
lon = -120.5

Define the observation time:

import pandas as pd

target_time = pd.Timestamp("2019-08-08T20:15:00")
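
ERA5 fields are valid on the hour, so an observation at 20:15 is matched to the 20:00 field. The standard-library sketch below shows that rounding explicitly; nearest_hour is a hypothetical helper, and it rounds exact half-hour ties up, which may differ from how a nearest-neighbor lookup in xarray resolves ties.

```python
from datetime import datetime, timedelta


def nearest_hour(moment):
    """Round a datetime to the nearest full hour (half-hour ties round up)."""
    floor = moment.replace(minute=0, second=0, microsecond=0)
    if moment - floor >= timedelta(minutes=30):
        return floor + timedelta(hours=1)
    return floor


print(nearest_hour(datetime(2019, 8, 8, 20, 15)))
print(nearest_hour(datetime(2019, 8, 8, 23, 40)))
```

The second call returns midnight of the next day. A one-day file does not contain that field, so a nearest-neighbor selection on the file would fall back to 23:00. Observations late in the day may therefore need the following day's file as well.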

Then select the nearest ERA5 hour and interpolate linearly in latitude and longitude (interp requires scipy):

tcwv_at_time = tcwv.sel({time_name: target_time}, method="nearest")

tcwv_point = tcwv_at_time.interp(
    latitude=lat,
    longitude=lon
)

print(float(tcwv_point.values))
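
To make the spatial step less of a black box, the sketch below implements bilinear interpolation on a regular grid with NumPy only. It is meant as an illustration of what linear interpolation in latitude and longitude computes, not as a replacement for xarray's interp. bilinear_at_point is a hypothetical helper, the 2 x 2 field is made up, and the grid is assumed to be regular with longitudes in ascending order.

```python
import numpy as np


def bilinear_at_point(field, lats, lons, lat, lon):
    """Bilinear interpolation of a 2-D field (nlat, nlon) at one point."""
    field = np.asarray(field, dtype=float)
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)

    # ERA5 latitudes usually run north to south; flip to ascending order.
    if lats[0] > lats[-1]:
        lats = lats[::-1]
        field = field[::-1, :]

    # Points outside the grid are not extrapolated.
    if not (lats[0] <= lat <= lats[-1] and lons[0] <= lon <= lons[-1]):
        return float("nan")

    # Index of the lower-left corner of the enclosing grid cell.
    i = min(int(np.searchsorted(lats, lat, side="right")) - 1, len(lats) - 2)
    j = min(int(np.searchsorted(lons, lon, side="right")) - 1, len(lons) - 2)

    # Fractional position inside the cell, between 0 and 1.
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])

    south = (1 - tx) * field[i, j] + tx * field[i, j + 1]
    north = (1 - tx) * field[i + 1, j] + tx * field[i + 1, j + 1]
    return float((1 - ty) * south + ty * north)


demo_lats = [50.0, 49.75]          # north to south, 0.25° spacing
demo_lons = [-120.5, -120.25]
demo_field = [[10.0, 20.0], [30.0, 40.0]]

print(bilinear_at_point(demo_field, demo_lats, demo_lons, 49.875, -120.375))
```

At the center of the cell, the result is the average of the four corner values. At a grid node, it returns the node value itself.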

For many points stored in a pandas DataFrame:

import pandas as pd
import xarray as xr

# Example DataFrame
df = pd.DataFrame({
    "latitude": [40.0, 42.5, 45.0],
    "longitude": [-120.0, -115.0, -110.0],
    "acq_date_time": [
        "2019-08-08T18:20:00",
        "2019-08-08T19:45:00",
        "2019-08-08T20:10:00",
    ],
})

df["acq_date_time"] = pd.to_datetime(df["acq_date_time"])

tcwv_values = []

for _, row in df.iterrows():
    tcwv_at_time = tcwv.sel(
        {time_name: row["acq_date_time"]},
        method="nearest"
    )

    value = tcwv_at_time.interp(
        latitude=row["latitude"],
        longitude=row["longitude"]
    )

    tcwv_values.append(float(value.values))

df["tcwv_kg_m2"] = tcwv_values
df["tcwv_cm"] = df["tcwv_kg_m2"] / 10.0

print(df)

Optional: create a first-order 3.9 µm transmittance proxy

TCWV is not the same as atmospheric transmittance. However, for satellite fire applications, it can be used to build a simple first-order proxy.

A basic Beer-Lambert-style approximation is:

import numpy as np

def clear_sky_transmittance_3p9_from_tcwv(
    tcwv_kg_m2,
    view_zenith_deg,
    a=0.04,
    b=0.06
):
    """
    First-order approximate 3.9 µm clear-sky transmittance
    from ERA5 total column water vapour.

    Parameters
    ----------
    tcwv_kg_m2 : float or array-like
        ERA5 total column water vapour in kg m-2.
        Numerically equivalent to mm precipitable water.

    view_zenith_deg : float or array-like
        Satellite view zenith angle in degrees.

    a, b : float
        Empirical coefficients. These should eventually be tuned
        or replaced by a radiative-transfer lookup table.

    Returns
    -------
    T : float or array-like
        Approximate atmospheric transmittance between 0 and 1.
    """

    tcwv_cm = tcwv_kg_m2 / 10.0
    mu = np.cos(np.deg2rad(view_zenith_deg))

    tau = a + b * tcwv_cm

    return np.exp(-tau / mu)
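
Written out, the function computes the following. The worked numbers use the placeholder coefficients a = 0.04 and b = 0.06 from the function signature, a column of 25 kg m-2 (W = 2.5 cm), and a view zenith angle of 35°. The symbols W and \theta_v are introduced here for illustration only, and the result is rounded.

```latex
\tau = a + b\,W, \qquad
\mu = \cos\theta_v, \qquad
T = \exp\!\left(-\frac{\tau}{\mu}\right)

\tau = 0.04 + 0.06 \times 2.5 = 0.19, \qquad
\mu = \cos 35^{\circ} \approx 0.819

T = \exp\!\left(-\frac{0.19}{0.819}\right) \approx \exp(-0.232) \approx 0.79
```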

Example:

view_zenith_deg = 35.0

T_clear = clear_sky_transmittance_3p9_from_tcwv(
    tcwv_kg_m2=tcwv,
    view_zenith_deg=view_zenith_deg
)

T_clear.attrs["long_name"] = "Approximate clear-sky 3.9 µm transmittance"
T_clear.attrs["units"] = "1"

Plot the first time step:

T0 = T_clear.isel({time_name: 0})

plt.figure(figsize=(10, 6))

T0.plot(
    cmap="viridis",
    vmin=0,
    vmax=1,
    cbar_kwargs={"label": "Approximate 3.9 µm transmittance"}
)

plt.title(f"Approximate Clear-Sky 3.9 µm Transmittance\nVZA = {view_zenith_deg}°")
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.show()

This is only a first-order proxy. A physically stronger approach would use RTTOV, MODTRAN, or libRadtran with vertical profiles, surface elevation, viewing geometry, and the sensor spectral response function.
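
To get a feel for how the proxy behaves, it helps to tabulate it for a few dry-to-moist columns and viewing angles. The sketch below restates the same formula for scalars using only the standard library. transmittance_proxy is a hypothetical scalar stand-in for the function defined earlier, and a = 0.04, b = 0.06 are the same untuned placeholder coefficients, so the numbers indicate sensitivity only, not real transmittances.

```python
import math


def transmittance_proxy(tcwv_kg_m2, view_zenith_deg, a=0.04, b=0.06):
    """Scalar version of the first-order proxy: exp(-(a + b * W_cm) / cos(VZA))."""
    tcwv_cm = tcwv_kg_m2 / 10.0
    mu = math.cos(math.radians(view_zenith_deg))
    return math.exp(-(a + b * tcwv_cm) / mu)


# Rows: TCWV in kg m-2; columns: view zenith angles of 0, 35, and 60 degrees.
for tcwv_value in (5.0, 25.0, 50.0):
    row = [
        round(transmittance_proxy(tcwv_value, vza), 3)
        for vza in (0.0, 35.0, 60.0)
    ]
    print(tcwv_value, row)
```

The proxy decreases with both increasing TCWV and increasing view zenith angle. Because of the constant a, it stays below 1 even for a completely dry column.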

Important notes

ERA5 TCWV is useful, but it should not be confused with full atmospheric transmittance.

Quantity                  Meaning
TCWV                      Amount of water vapor in the atmospheric column
Transmittance             Fraction of radiation transmitted through the atmosphere
Cloud optical thickness   Optical attenuation by clouds
Aerosol optical depth     Optical attenuation by aerosols or smoke

For fire remote sensing, especially near 3.7–4.0 µm, water vapor is only one part of the atmospheric correction problem. A full correction also depends on the vertical humidity and temperature profiles, viewing angle, pressure, elevation, aerosols, and the sensor spectral response function.

Conclusion

In this tutorial, we downloaded ERA5 total column water vapor for a full day over CONUS, opened the data with xarray, plotted hourly and daily maps, and extracted TCWV values at specific locations.

This is a useful first step toward atmospheric correction workflows. For satellite fire applications, TCWV can help explain spatial and temporal variations in water-vapor absorption and can be used as a first-order input for estimating clear-sky 3.9 µm transmittance.

The next step is to combine TCWV with satellite viewing geometry and compare the resulting approximate transmittance against a reference product or replace the empirical approximation with a radiative-transfer model.

References

Link                                                     Site
ERA5 hourly data on single levels from 1940 to present   Copernicus Climate Data Store
ERA5 daily statistics on single levels                   Copernicus Climate Data Store
CDS API setup and documentation                          Copernicus Climate Data Store
xarray documentation                                     xarray
Cartopy documentation                                    Cartopy
Matplotlib documentation                                 Matplotlib
ERA5 documentation                                       ECMWF Confluence
ERA5 parameter database: total column water vapour       ECMWF Parameter Database