Examples of how to store (convert/transform) a pandas dataframe into a json file
Create a dataframe with pandas
Let's first create a simple dataframe:
import pandas as pdimport numpy as npdata = np.random.randint(99, size=(3,3))df = pd.DataFrame(data=data,columns=['A','B','C'])print(df)
returns for example
A B C0 78 96 01 4 32 762 72 16 74
Convert the dataframe into a json string representation
To convert a datframe into a into a json, a first step is to use pandas.DataFrame.to_json, here is an example using the opition orient="split" (most commonly used):
res = df.to_json(orient="split")
returns
{"columns":["A","B","C"],"index":[0,1,2],"data":[[78,96,0],[4,32,76],[72,16,74]]}
Note that:
type(res)
is a string represenation.
Examples using different options:
df.to_json()
returns
{"A":{"0":78,"1":4,"2":72},"B":{"0":96,"1":32,"2":16},"C":{"0":0,"1":76,"2":74}}
same as
df.to_json(orient="columns")
returns
{"A":{"0":78,"1":4,"2":72},"B":{"0":96,"1":32,"2":16},"C":{"0":0,"1":76,"2":74}}
Using orient="index"
df.to_json(orient="index")
returns
{"0":{"A":78,"B":96,"C":0},"1":{"A":4,"B":32,"C":76},"2":{"A":72,"B":16,"C":74}}
Using orient="records"
df.to_json(orient="records")
returns
[{"A":78,"B":96,"C":0},{"A":4,"B":32,"C":76},{"A":72,"B":16,"C":74}]
Using orient="values"
df.to_json(orient="values")
returns
[[78,96,0],[4,32,76],[72,16,74]]
Using orient="table"
df.to_json(orient="table")
returns
{"schema":{"fields":[{"name":"index","type":"integer"},{"name":"A","type":"integer"},{"name":"B","type":"integer"},{"name":"C","type":"integer"}],"primaryKey":["index"],"pandas_version":"0.20.0"},"data":[{"index":0,"A":78,"B":96,"C":0},{"index":1,"A":4,"B":32,"C":76},{"index":2,"A":72,"B":16,"C":74}]}
Convert the string to a dictionary
Next step convert the string to a dictionary using the json python module:
import jsonparsed = json.loads(res)print( type(parsed) )print( json.dumps(parsed, indent=4) )
returns
<class 'dict'>
and
{"columns": ["A","B","C"],"index": [0,1,2],"data": [[78,96,0],[4,32,76],[72,16,74]]}
Save the dictionary into a json file
with open('data.json', 'w') as fp:json.dump(res_d, fp)
Try to read the json file with python
Check if it works:
with open('data.json') as json_data:data_dict = json.load(json_data)print(data_dict)
should returns
{'columns': ['A', 'B', 'C'], 'index': [0, 1, 2], 'data': [[78, 96, 0], [4, 32, 76], [72, 16, 74]]}
Another example with multiple indexes
tuples = [('Land', 'Liquid'),('Land', 'Ice'),('Ocean', 'Liquid'),('Ocean', 'Ice')]index = pd.MultiIndex.from_tuples(tuples, names=["Id1", "Id2"])df = pd.DataFrame({'Count A': [12., 70., 30., 20.], 'Count B': [12., 70., 30., 20.]}, index=index)print(df)
returns
Count A Count BId1 Id2Land Liquid 12.0 12.0Ice 70.0 70.0Ocean Liquid 30.0 30.0Ice 20.0 20.0
then
df.to_json(orient="split")
returns
{"columns":["Count A","Count B"],"index":[["Land","Liquid"],["Land","Ice"],["Ocean","Liquid"],["Ocean","Ice"]],"data":[[12.0,12.0],[70.0,70.0],[30.0,30.0],[20.0,20.0]]}
and
import jsonres = df.to_json(orient="split")res_d = json.loads(res)with open('data.json', 'w') as fp:json.dump(res_d, fp)
create a json file.
see also How to save a dictionary in a json file with python ?
