How to get the data type of each column in a Pandas DataFrame?

Published: April 01, 2023

Updated: April 07, 2023

Tags: Python; Pandas; Dataframe;

To determine the data type of each column in a Pandas DataFrame, use the dtypes attribute. Examples:

Get the data type of each column with dtypes

Let's consider the following dataframe:

import pandas as pd

data = { 'c1':[1,2,3,4,5,6],
         'c2':[1.,2.,3.,4.,5.,6.],
         'c3':['a','b','b','d','e','f']
}

df = pd.DataFrame(data)

print(df)

Output

   c1   c2 c3
0   1  1.0  a
1   2  2.0  b
2   3  3.0  b
3   4  4.0  d
4   5  5.0  e
5   6  6.0  f

To retrieve the data type of each column, enter

df.dtypes

which will return a Series containing the data type for each column of the original DataFrame:

c1      int64
c2    float64
c3     object
dtype: object
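Because df.dtypes is an ordinary Series indexed by column name, the type of a single column can be looked up by name. The following is a minimal sketch that rebuilds the same example dataframe; the variable names (col_types, types_as_text, and so on) are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'c1': [1, 2, 3, 4, 5, 6],
    'c2': [1., 2., 3., 4., 5., 6.],
    'c3': ['a', 'b', 'b', 'd', 'e', 'f'],
})

# dtypes is a Series: the index holds column names, the values hold dtype objects
col_types = df.dtypes

# Look up the type of one column by name
c2_type = col_types['c2']

# Equivalent: ask the column (itself a Series) for its dtype
c1_type = df['c1'].dtype

# Turn the result into a plain dict of strings, e.g. for logging
types_as_text = {name: str(t) for name, t in col_types.items()}
print(types_as_text)
```

Note that a single column exposes dtype (singular), while the DataFrame exposes dtypes (plural).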

Using info()

It is also possible to get the data type of each column using the info() method on a DataFrame. This will provide information about all columns in the DataFrame, including the data type.

For example:

df.info()

This prints a summary like the following:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   c1      6 non-null      int64  
 1   c2      6 non-null      float64
 2   c3      6 non-null      object 
dtypes: float64(1), int64(1), object(1)
memory usage: 272.0+ bytes

This output shows that column 'c3' holds object values, 'c2' holds float64 values, and 'c1' holds int64 values.
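One detail worth knowing is that info() prints its report and returns None, so the result cannot be assigned directly to a variable. If the text is needed (for a log file, for instance), a possible approach is to pass a writable buffer through the buf parameter. A short sketch, assuming the same example dataframe as above:

```python
import io

import pandas as pd

df = pd.DataFrame({
    'c1': [1, 2, 3, 4, 5, 6],
    'c2': [1., 2., 3., 4., 5., 6.],
    'c3': ['a', 'b', 'b', 'd', 'e', 'f'],
})

# info() writes to standard output and returns None
result = df.info()

# To keep the report as a string, send it to an in-memory text buffer
buffer = io.StringIO()
df.info(buf=buffer)
report = buffer.getvalue()
```

When the types are needed programmatically rather than as text, df.dtypes is usually the more convenient choice.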

Example of use

Merging two DataFrames

This example shows why checking the data types of a Pandas dataframe can be helpful. Consider two dataframes, df1

            FRP  MASK   Longitude   Latitude
    0       0.0     0 -121.214928  41.868652
    1       0.0     0 -121.214813  41.868549
    2       0.0     0 -121.214699  41.868443
    3       0.0     0 -121.214584  41.868332
    4       0.0     0 -121.214470  41.868225
    ...     ...   ...         ...        ...
    435827  0.0     0 -121.271240  41.782211
    435828  0.0     0 -121.271126  41.782104
    435829  0.0     0 -121.271004  41.781994
    435830  0.0     0 -121.270874  41.781868
    435831  0.0     0 -121.270760  41.781761

and df2

               Longitude   Latitude    A
    0        -120.371639  42.494111 -999
    1        -120.371405  42.493905 -999
    2        -120.371191  42.493716 -999
    3        -120.371054  42.493590 -999
    4        -120.370844  42.493405 -999
    ...              ...        ...  ...
    12422595 -121.414409  41.654469 -999
    12422596 -121.414205  41.654282 -999
    12422597 -121.414020  41.654113 -999
    12422598 -121.413863  41.653969 -999
    12422599 -121.413656  41.653779 -999

that need to be merged on latitude and longitude. However, the merge

pd.merge(df1, df2, on=['Longitude','Latitude'], how='inner')

resulted in an empty dataframe.

The reason is that the latitude and longitude values look alike in both dataframes, but they are stored with different data types:

df1.dtypes

returns

FRP          float32
MASK           int32
Longitude    float32
Latitude     float32
dtype: object

while

df2.dtypes

returns

Longitude    float64
Latitude     float64
A              int64
dtype: object

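The data type for latitude and longitude is float32 in df1 and float64 in df2. Because float32 keeps fewer significant digits, a coordinate stored as float32 is generally not exactly equal to the same coordinate stored as float64, so the merge finds no matching rows.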
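This kind of check can be done before every merge. The sketch below uses a hypothetical helper, compare_key_dtypes (not part of pandas), that lists the type of each key column on both sides; the two one-row dataframes are small stand-ins for df1 and df2:

```python
import pandas as pd


def compare_key_dtypes(left, right, keys):
    """Hypothetical helper: tabulate the dtype of each key column in both frames."""
    rows = {key: (str(left[key].dtype), str(right[key].dtype)) for key in keys}
    table = pd.DataFrame.from_dict(rows, orient='index', columns=['left', 'right'])
    table['match'] = table['left'] == table['right']
    return table


# Stand-in data: same coordinates, stored with different precision
df1 = pd.DataFrame({'Longitude': [-121.214928], 'Latitude': [41.868652]}, dtype='float32')
df2 = pd.DataFrame({'Longitude': [-121.214928], 'Latitude': [41.868652]}, dtype='float64')

report = compare_key_dtypes(df1, df2, ['Longitude', 'Latitude'])
print(report)
```

A False value in the match column signals that the key should be converted on one side before merging.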

One way to convert the data from float64 to float32 is to use astype():

df2['Longitude'] = df2['Longitude'].astype('float32')
df2['Latitude'] = df2['Latitude'].astype('float32')

Now the merge

pd.merge(df1, df2, on=['Longitude','Latitude'], how='inner')

returns

            FRP  MASK   Longitude   Latitude    A
    0       0.0     0 -121.214928  41.868652 -999
    1       0.0     0 -121.214813  41.868549 -999
    2       0.0     0 -121.214699  41.868443 -999
    3       0.0     0 -121.214584  41.868332 -999
    4       0.0     0 -121.214470  41.868225 -999
    ...     ...   ...         ...        ...  ...
    435827  0.0     0 -121.271240  41.782211 -999
    435828  0.0     0 -121.271126  41.782104 -999
    435829  0.0     0 -121.271004  41.781994 -999
    435830  0.0     0 -121.270874  41.781868 -999
    435831  0.0     0 -121.270760  41.781761 -999
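The whole scenario can be reproduced on a small scale. The sketch below is an illustration, not the original data: it builds two-row stand-ins for df1 and df2 from coordinates shown in the tables above, runs the merge before and after the conversion, and compares the number of rows returned.

```python
import pandas as pd

coords = {
    'Longitude': [-121.214928, -121.214813],
    'Latitude': [41.868652, 41.868549],
}
keys = ['Longitude', 'Latitude']

# Stand-in for df1: coordinates stored as float32
df1 = pd.DataFrame(coords).astype('float32')
df1['FRP'] = 0.0

# Stand-in for df2: the same coordinates stored as float64
df2 = pd.DataFrame(coords)
df2['A'] = -999

# Mismatched key types: the rounded float32 values do not equal the float64 ones
before = pd.merge(df1, df2, on=keys, how='inner')

# Convert only the key columns of df2 to float32, then merge again
df2_cast = df2.astype({key: 'float32' for key in keys})
after = pd.merge(df1, df2_cast, on=keys, how='inner')

print(len(before), len(after))
```

Converting the float64 side down works here because both dataframes then carry the same float32 rounding. Converting df1 up to float64 would not help, since the digits lost in float32 cannot be recovered.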

References

Links         Site
dtypes        pandas.pydata.org
astype        pandas.pydata.org
info()        pandas.pydata.org
Data types    numpy.org