How to add metadata to a data frame with pandas in Python?

Published: October 22, 2019


Examples of how to add metadata to a data frame with pandas in Python:

Create a data frame with pandas

Example of how to create a simple data frame with pandas:

import pandas as pd
import numpy as np

data = np.arange(1,13)
data = data.reshape(3,4)

columns = ['Home','Car','Sport','Food']
index = ['Alice','Bob','Emma']

df = pd.DataFrame(data=data,index=index,columns=columns)

Add metadata

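One solution is to attach the metadata as attributes on the data frame object. These are ordinary Python attributes: they are not carried over to new data frames produced by operations such as copy(), and they are not written out when the data frame is saved.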

df.scale = 0.1
df.offset = 15

print(df.scale)
print(df.offset)

returns

0.1
15
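The limitation of plain attributes can be checked directly. The sketch below is not part of the original example: it rebuilds the same data frame, shows that an ad hoc attribute is missing on a copy, and then uses DataFrame.attrs, a dictionary that pandas provides (since version 1.0, documented as experimental) for this kind of metadata. The variable name copied is only illustrative.

```python
import numpy as np
import pandas as pd

# Same data frame as in the example above
df = pd.DataFrame(
    data=np.arange(1, 13).reshape(3, 4),
    index=["Alice", "Bob", "Emma"],
    columns=["Home", "Car", "Sport", "Food"],
)

# Ad hoc attribute: it exists only on this particular object
df.scale = 0.1
copied = df.copy()
print(hasattr(copied, "scale"))  # the copy does not carry the attribute

# DataFrame.attrs: a plain dict attached to the data frame (assumes pandas >= 1.0)
df.attrs["scale"] = 0.1
df.attrs["offset"] = 15
print(df.copy().attrs)  # attrs is propagated to the copy
```

Because attrs is marked experimental, its propagation through other operations may change between pandas versions, so it is worth checking against the version in use.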

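Store in an HDF5 file

Attributes set on a data frame are lost when it is written to disk. To save a pandas data frame together with its metadata, one solution is to use an HDF5 file and store the metadata in the attributes of the saved dataset (see Save additional attributes in Pandas Dataframe). pd.HDFStore requires the PyTables package (tables) to be installed.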

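# Open (or create) the HDF5 file; the with block closes it automatically
with pd.HDFStore('data.hdf5') as store:

    # Write the data frame under the key 'dataset_01'
    store.put('dataset_01', df)

    # Attach the metadata dictionary to the attributes of the stored dataset
    metadata = {'scale': 0.1, 'offset': 15}
    store.get_storer('dataset_01').attrs.metadata = metadata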

Read an HDF5 file using pandas

Example of how to read the data frame and its metadata back from the file using pandas:

import pandas as pd

with pd.HDFStore('data.hdf5') as store:
    data = store['dataset_01']
    metadata = store.get_storer('dataset_01').attrs.metadata

print(data)

print(metadata)

returns

       Home  Car  Sport  Food
Alice     1    2      3     4
Bob       5    6      7     8
Emma      9   10     11    12
{'scale': 0.1, 'offset': 15}
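The write and read steps above can be wrapped in two small helper functions. This is only a sketch: the names save_with_metadata and load_with_metadata are hypothetical, not part of pandas, and the code assumes PyTables is installed. It uses the same HDFStore calls as the examples above, plus getattr with a default so that reading a dataset saved without metadata returns None instead of raising an error.

```python
import pandas as pd


def save_with_metadata(path, key, df, metadata):
    """Write df to an HDF5 file and attach metadata to the same key."""
    with pd.HDFStore(path) as store:
        store.put(key, df)
        store.get_storer(key).attrs.metadata = metadata


def load_with_metadata(path, key):
    """Return (df, metadata); metadata is None if none was stored."""
    with pd.HDFStore(path, mode="r") as store:
        df = store[key]
        # Fall back to None when the dataset has no 'metadata' attribute
        metadata = getattr(store.get_storer(key).attrs, "metadata", None)
    return df, metadata
```

For example, save_with_metadata('data.hdf5', 'dataset_01', df, {'scale': 0.1, 'offset': 15}) followed by load_with_metadata('data.hdf5', 'dataset_01') should give back the data frame and the dictionary shown above.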

References