Introduction
Creating a GeoDataFrame with Polygon geometry from a plain pandas DataFrame takes a few steps: preparing the coordinate data, building the Polygon geometries with shapely, and constructing the GeoDataFrame with geopandas. Here's a step-by-step guide to accomplishing this:
Create a Pandas DataFrame with coordinates
First, make sure your pandas DataFrame contains the coordinates needed to construct each Polygon. Typically these are stored as separate longitude and latitude columns for each vertex, or as a single column of tuples/lists holding both values. The example below stores four corners (c1 to c4) per city, forming a square of 0.2° per side centered on each city.
import pandas as pd
data = {'city_name':['Paris','London','Moscow', 'Istanbul'],
'longitude_c1':[2.3522-0.1,-0.1276-0.1,37.6173-0.1,28.9784-0.1],
'latitude_c1':[48.8566-0.1,51.5072-0.1,55.7558-0.1,41.0082-0.1],
'longitude_c2':[2.3522-0.1,-0.1276-0.1,37.6173-0.1,28.9784-0.1],
'latitude_c2':[48.8566+0.1,51.5072+0.1,55.7558+0.1,41.0082+0.1],
'longitude_c3':[2.3522+0.1,-0.1276+0.1,37.6173+0.1,28.9784+0.1],
'latitude_c3':[48.8566+0.1,51.5072+0.1,55.7558+0.1,41.0082+0.1],
'longitude_c4':[2.3522+0.1,-0.1276+0.1,37.6173+0.1,28.9784+0.1],
'latitude_c4':[48.8566-0.1,51.5072-0.1,55.7558-0.1,41.0082-0.1]
}
df = pd.DataFrame(data)
print(df)
This will give us the following output:
city_name longitude_c1 latitude_c1 longitude_c2 latitude_c2 \
0 Paris 2.2522 48.7566 2.2522 48.9566
1 London -0.2276 51.4072 -0.2276 51.6072
2 Moscow 37.5173 55.6558 37.5173 55.8558
3 Istanbul 28.8784 40.9082 28.8784 41.1082
longitude_c3 latitude_c3 longitude_c4 latitude_c4
0 2.4522 48.9566 2.4522 48.7566
1 -0.0276 51.6072 -0.0276 51.4072
2 37.7173 55.8558 37.7173 55.6558
3 29.0784 41.1082 29.0784 40.9082
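If the corners are stored the other way mentioned above, as a single column of coordinate tuples, the polygons can be built directly from that column. The following is a minimal sketch under that assumption; the `df_alt` DataFrame and its `corners` column are hypothetical names used only for illustration.

```python
import pandas as pd
from shapely.geometry import Polygon

# Hypothetical layout: one "corners" column holding a list of (lon, lat) tuples per row
df_alt = pd.DataFrame({
    "city_name": ["Paris", "London"],
    "corners": [
        [(2.2522, 48.7566), (2.2522, 48.9566), (2.4522, 48.9566), (2.4522, 48.7566)],
        [(-0.2276, 51.4072), (-0.2276, 51.6072), (-0.0276, 51.6072), (-0.0276, 51.4072)],
    ],
})

# Polygon accepts a sequence of (x, y) pairs, so each list can be passed as-is
df_alt["geometry"] = [Polygon(corners) for corners in df_alt["corners"]]

# bounds returns (min lon, min lat, max lon, max lat)
print(df_alt["geometry"].iloc[0].bounds)
```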
Constructing our Polygon(s)
Now, with this DataFrame in place, we can begin constructing our Polygon(s) using the coordinates provided. We will be using the shapely library for this task, which provides various geometric operations on these coordinates.
Two approaches are shown below. Applying a Python function row by row with apply is the most common, but it can become slow as the dataframe grows. For larger dataframes, converting the coordinates to a numpy array and using the map() function is usually faster.
Approach 1: Creating a Python function
To begin, we write a function that builds a Polygon from one row of our dataframe. Applying it to every row with the apply function gives us the values for a new geometry column.
Here's an example of how to accomplish this using the shapely.geometry module:
from shapely.geometry import Polygon
def create_polygon(x):
    # x is one row of the dataframe; build one [longitude, latitude] pair per corner
    p1 = [x['longitude_c1'],x['latitude_c1']]
    p2 = [x['longitude_c2'],x['latitude_c2']]
    p3 = [x['longitude_c3'],x['latitude_c3']]
    p4 = [x['longitude_c4'],x['latitude_c4']]
    pixel_polygon = Polygon([p1,p2,p3,p4])
    return pixel_polygon
This function takes one row of our dataframe and uses its longitude and latitude columns to build a list of four points, which Polygon interprets as the vertices of the exterior ring. By applying the function to every row,
df.apply(create_polygon, axis=1)
a pandas.core.series.Series object is returned:
0 POLYGON ((2.2522 48.7566, 2.2522 48.9566, 2.45...
1 POLYGON ((-0.2276 51.4072, -0.2276 51.6072, -0...
2 POLYGON ((37.5173 55.6558, 37.5173 55.8558, 37...
3 POLYGON ((28.8784 40.9082, 28.8784 41.1082, 29...
dtype: object
We can now store this Series in a new column of our dataframe, which we name "geometry":
df['geometry'] = df.apply(create_polygon, axis=1)
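The order in which the corners are passed to Polygon matters: they need to follow the boundary of the shape, otherwise the edges cross and the polygon is invalid. The sketch below illustrates this with a single row laid out like the example dataframe; the `corners` helper is a hypothetical name introduced for this illustration, and `is_valid` is the shapely attribute used for the check.

```python
import pandas as pd
from shapely.geometry import Polygon

# One illustrative row with the same column layout as the example dataframe
row = pd.Series({
    "longitude_c1": 2.2522, "latitude_c1": 48.7566,
    "longitude_c2": 2.2522, "latitude_c2": 48.9566,
    "longitude_c3": 2.4522, "latitude_c3": 48.9566,
    "longitude_c4": 2.4522, "latitude_c4": 48.7566,
})

def corners(row, order):
    """Return the (lon, lat) pairs of a row in the given corner order (hypothetical helper)."""
    return [(row[f"longitude_c{i}"], row[f"latitude_c{i}"]) for i in order]

ring_order = Polygon(corners(row, [1, 2, 3, 4]))     # corners follow the boundary
crossed_order = Polygon(corners(row, [1, 3, 2, 4]))  # two edges cross ("bow-tie")

print(ring_order.is_valid)
print(crossed_order.is_valid)
```

Running a check like `df['geometry'].apply(lambda p: p.is_valid)` on the full column is one way to catch misordered corners before going further.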
We can also visualize our Polygon(s) using the matplotlib library, which provides various plotting functions.
import matplotlib.pyplot as plt
plt.figure() # creates a new figure to plot on
for polygon in df['geometry']:
    plt.plot(*polygon.exterior.xy) # plots the exterior coordinates of each Polygon object as a line
plt.title('How to create a GeoDataFrame with Polygon geometry \n from a Pandas DataFrame with coordinates ?')
plt.savefig('geopandas_polygons_01.png', dpi=100, bbox_inches='tight')
plt.show() # displays the plot
The resulting visualization will show us the boundaries of our Polygons, based on the given longitude and latitude coordinates. We can also add other data to our Polygons, such as labels or colors, to further customize the plot.
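As a rough sketch of that kind of customization, the example below fills each polygon with a color and writes the city name at its centroid. It is self-contained and rebuilds two of the squares with `shapely.geometry.box`; the color choices and the `polygons` and `colors` dictionaries are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
from shapely.geometry import box

# Illustrative squares: box(min lon, min lat, max lon, max lat)
polygons = {
    "Paris": box(2.2522, 48.7566, 2.4522, 48.9566),
    "London": box(-0.2276, 51.4072, -0.0276, 51.6072),
}
colors = {"Paris": "tab:blue", "London": "tab:red"}  # arbitrary color choices

fig, ax = plt.subplots()
for name, polygon in polygons.items():
    x, y = polygon.exterior.xy
    ax.fill(x, y, color=colors[name], alpha=0.3)  # filled interior
    ax.plot(x, y, color=colors[name])             # outline
    centroid = polygon.centroid
    ax.annotate(name, (centroid.x, centroid.y), ha="center")  # label at the center
ax.set_xlabel("longitude")
ax.set_ylabel("latitude")
# plt.show() would display the figure in an interactive session
```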
Approach 2: Utilizing the map() function
The previous approach works best for smaller dataframes. Another solution uses Python's built-in map() function to generate the column of polygon geometries. Let's see how this works with our example.
First, let's extract the coordinates from the dataframe and store them in a numpy array. We can achieve this by selecting the columns 'longitude_c1', 'latitude_c1', 'longitude_c2', 'latitude_c2', 'longitude_c3', 'latitude_c3', 'longitude_c4', and 'latitude_c4'. Then, we can convert this selection into a numpy array using the to_numpy() function:
Coords = df[['longitude_c1', 'latitude_c1',
             'longitude_c2', 'latitude_c2',
             'longitude_c3', 'latitude_c3',
             'longitude_c4', 'latitude_c4']].to_numpy()
Displaying Coords gives:
array([[ 2.25220e+00, 4.87566e+01, 2.25220e+00, 4.89566e+01,
2.45220e+00, 4.89566e+01, 2.45220e+00, 4.87566e+01],
[-2.27600e-01, 5.14072e+01, -2.27600e-01, 5.16072e+01,
-2.76000e-02, 5.16072e+01, -2.76000e-02, 5.14072e+01],
[ 3.75173e+01, 5.56558e+01, 3.75173e+01, 5.58558e+01,
3.77173e+01, 5.58558e+01, 3.77173e+01, 5.56558e+01],
[ 2.88784e+01, 4.09082e+01, 2.88784e+01, 4.11082e+01,
2.90784e+01, 4.11082e+01, 2.90784e+01, 4.09082e+01]])
Checking the shape of our matrix with
Coords.shape
shows that it consists of 4 rows of 8 values each:
(4, 8)
However, the shapely Polygon constructor (shapely.Polygon) expects the coordinates of one polygon as a sequence of shape (N,2), that is, N points with 2 coordinates each (see shapely.Polygon). To meet this requirement, we can reshape our matrix so that each row becomes 4 points of 2 coordinates:
Coords = Coords.reshape(4,4,2)
Consequently, our matrix now has a shape of (4,4,2), signifying 4 rows with 4 points, each defined by 2 coordinates representing longitude and latitude.
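The reshape above hard-codes the number of rows. A small variation, sketched below on a two-row array, lets numpy infer the row count with -1 and derives the number of points from the number of columns. The names `flat`, `n_points`, and `coords` are illustrative, and the sketch assumes the columns alternate longitude and latitude as in the example.

```python
import numpy as np

# Two illustrative rows: lon1, lat1, lon2, lat2, lon3, lat3, lon4, lat4
flat = np.array([
    [2.2522, 48.7566, 2.2522, 48.9566, 2.4522, 48.9566, 2.4522, 48.7566],
    [-0.2276, 51.4072, -0.2276, 51.6072, -0.0276, 51.6072, -0.0276, 51.4072],
])

n_points = flat.shape[1] // 2            # 8 values per row -> 4 points
coords = flat.reshape(-1, n_points, 2)   # -1 lets numpy infer the number of rows

print(coords.shape)
print(coords[0])  # the four (lon, lat) points of the first row
```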
To create our polygons, we can utilize the built-in Python function map() as follows:
polygons = list(map(Polygon, Coords.tolist()))
This code generates a list of four polygons:
[
<shapely.geometry.polygon.Polygon at 0x7f7e1014f700>,
<shapely.geometry.polygon.Polygon at 0x7f7e08f37820>,
<shapely.geometry.polygon.Polygon at 0x7f7e1014fe50>,
<shapely.geometry.polygon.Polygon at 0x7f7e1014fd90>
]
We can utilize this list to generate a new column in our dataframe, such as "geometry," which will hold our polygons.
df['geometry'] = polygons
To display the dataframe, we can use the print function:
print(df)
The above code will display
city_name longitude_c1 latitude_c1 longitude_c2 latitude_c2 \
0 Paris 2.2522 48.7566 2.2522 48.9566
1 London -0.2276 51.4072 -0.2276 51.6072
2 Moscow 37.5173 55.6558 37.5173 55.8558
3 Istanbul 28.8784 40.9082 28.8784 41.1082
longitude_c3 latitude_c3 longitude_c4 latitude_c4 \
0 2.4522 48.9566 2.4522 48.7566
1 -0.0276 51.6072 -0.0276 51.4072
2 37.7173 55.8558 37.7173 55.6558
3 29.0784 41.1082 29.0784 40.9082
geometry
0 POLYGON ((2.2522 48.7566, 2.2522 48.9566, 2.45...
1 POLYGON ((-0.2276 51.4072, -0.2276 51.6072, -0...
2 POLYGON ((37.5173 55.6558, 37.5173 55.8558, 37...
3 POLYGON ((28.8784 40.9082, 28.8784 41.1082, 29...
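This approach can be beneficial for larger datasets: the coordinates are extracted once as a numpy array and map() calls Polygon directly on each entry, which avoids the overhead of building a pandas Series for every row, as apply with axis=1 does.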
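A third option, not covered by the two approaches above, is the vectorized `shapely.polygons` function, which builds an array of polygons from a coordinate array in a single call. The sketch below assumes Shapely 2.0 or later is installed and uses a two-row array for brevity; the variable names are illustrative.

```python
import numpy as np
import shapely  # assumes Shapely >= 2.0, which provides shapely.polygons

# Two illustrative rows reshaped to (rows, points, 2)
coords = np.array([
    [2.2522, 48.7566, 2.2522, 48.9566, 2.4522, 48.9566, 2.4522, 48.7566],
    [-0.2276, 51.4072, -0.2276, 51.6072, -0.0276, 51.6072, -0.0276, 51.4072],
]).reshape(-1, 4, 2)

# One call creates all polygons; unclosed rings are closed automatically
polygons = shapely.polygons(coords)

print(len(polygons))
print(polygons[0].wkt)
```

The result is a numpy array of Polygon objects, which can be assigned to a dataframe column in the same way as the list produced by map().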
Overall, there are multiple ways to generate polygons from a dataframe containing longitude and latitude coordinates. The chosen method will depend on the specific project requirements and the size of the dataset.
Converting our Pandas DataFrame into a GeoDataFrame
To convert our pandas DataFrame into a GeoDataFrame, we pass it to the geopandas.GeoDataFrame constructor:
import geopandas
gdf = geopandas.GeoDataFrame(
df,
geometry=df['geometry'],
crs="EPSG:4326"
)
Here, we are specifying the geometry column as "geometry" and assigning it a coordinate reference system (CRS) of EPSG:4326, which is commonly used for latitude and longitude coordinates. This will convert our dataframe into a geodataframe with polygon geometries in the "geometry" column.
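Once the GeoDataFrame exists, a few quick checks can confirm that the conversion worked. The sketch below rebuilds a two-row example so that it runs on its own, then inspects the CRS and geometry type and reprojects to an equal-area CRS to measure area. The choice of EPSG:3035 (ETRS89 LAEA Europe) is an assumption that suits these European cities; other regions would need a different projected CRS.

```python
import geopandas
import pandas as pd
from shapely.geometry import box

# Illustrative two-row dataframe with a ready-made geometry column
df = pd.DataFrame({
    "city_name": ["Paris", "London"],
    "geometry": [
        box(2.2522, 48.7566, 2.4522, 48.9566),
        box(-0.2276, 51.4072, -0.0276, 51.6072),
    ],
})
gdf = geopandas.GeoDataFrame(df, geometry="geometry", crs="EPSG:4326")

print(gdf.crs)           # coordinate reference system of the geometry column
print(gdf.geom_type)     # geometry type of each row
print(gdf.total_bounds)  # [min lon, min lat, max lon, max lat] over all rows

# Areas in degrees are not meaningful; reproject to an equal-area CRS first
area_km2 = gdf.to_crs(epsg=3035).area / 1e6
print(area_km2)
```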
References
Links | Site |
---|---|
shapely.Polygon | shapely.readthedocs.io |
map() | docs.python.org |