geoplot is a geospatial data visualization library designed for data scientists and geospatial analysts who just want to get things done. In this tutorial we will learn the basics of geoplot and see how it is used.
You can run this tutorial code yourself interactively using Binder.
The starting point for geospatial analysis is geospatial data. The standard way of dealing with such data in Python is to use geopandas, a geospatial data parsing library built over the well-known pandas library.
geopandas represents data using a GeoDataFrame, which is just a DataFrame with a special geometry column containing a geometric object describing the physical nature of the record in question: a POINT in space, a POLYGON in the shape of New York, and so on.
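As a minimal sketch of what such a table looks like, the snippet below builds a tiny GeoDataFrame by hand. The two records and their coordinates are made up for illustration and do not come from the tutorial's datasets.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Two hypothetical records: one point and one (very rough) rectangle.
records = gpd.GeoDataFrame(
    {"name": ["a point", "a rectangle"]},
    geometry=[
        Point(-73.99, 40.73),
        Polygon([(-74.3, 40.5), (-73.7, 40.5), (-73.7, 40.9), (-74.3, 40.9)]),
    ],
    crs="EPSG:4326",  # plain longitude/latitude coordinates
)

# The geometry column holds shapely objects next to the ordinary columns.
print(records.geometry.name)       # geometry
print(records.geom_type.tolist())  # ['Point', 'Polygon']
```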
shapely, the library geopandas uses to store its geometries, uses "modern" longitude-latitude (x, y) coordinate order. This differs from the "historical" latitude-longitude (y, x) coordinate order. Datasets "in the wild" may be in either format, so after reading in some data, make sure to verify that your coordinates are in the right order!
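One way to run that check is to test each point against a region you expect the data to fall in. The sketch below is only an illustration: the helper fix_order is hypothetical, and the bounding box for the contiguous United States is a rough approximation chosen for the example.

```python
from shapely.geometry import Point, box

# Rough longitude/latitude bounding box for the contiguous United States.
# The exact numbers are an approximation chosen for this example.
CONTIGUOUS_USA = box(-125.0, 24.0, -66.0, 50.0)  # (minx, miny, maxx, maxy)


def fix_order(point, region=CONTIGUOUS_USA):
    """Return the point in (lon, lat) order, swapping x and y only if
    that is what makes it land inside the expected region."""
    if region.contains(point):
        return point
    swapped = Point(point.y, point.x)
    if region.contains(swapped):
        return swapped
    raise ValueError(f"{point.wkt} is outside the region in either order")


denver = Point(-104.99, 39.74)          # already (lon, lat)
denver_lat_lon = Point(39.74, -104.99)  # "historical" (lat, lon) order

print(fix_order(denver).equals(fix_order(denver_lat_lon)))  # True
```

A region test like this only works when the swapped coordinates fall outside the region, which holds for the United States but not for every part of the world.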
To learn more about manipulating geospatial data, check out the section of the tutorial on Working with Geospatial Data.
If your data consists of a bunch of points, you can display those points using pointplot.
If you have polygonal data instead, you can plot that using a geoplot polyplot.
We can combine these two plots using overplotting. Overplotting is the act of stacking several different plots on top of one another, which is useful for providing additional context for our plots.
You might notice that this map of the United States looks very strange. The Earth, being a sphere, is impossible to portray in two dimensions. Hence, whenever we take data off the sphere and place it onto a map, we are using some kind of projection, or method of flattening the sphere. Plotting data without a projection, or "carte blanche", creates distortion in your map. We can "fix" the distortion by picking a better projection.
The Albers equal area projection is one of the most common in the United States, and it can be used with geoplot.
This looks much better than our first plot! In fact, this is the version of the United States that you'll probably most often see in maps.
To learn more about projections check out the section of the tutorial on Working with Projections.
This map tells us that there are more cities on either coast than there are in and around the Rocky Mountains, but it doesn't tell us anything about the cities themselves. We can make an informative plot by adding more visual parameters to our plot.
We'll start with hue.
This map tells a clear story: that cities in the central United States have a higher ELEV_IN_FT than most other cities in the United States, especially those on the coast. Toggling the legend on helps make this result more interpretable.
Which colors get assigned to which category is controlled by a colormap (or cmap). There are over fifty visually distinct colormaps in matplotlib; it's also possible to create your own on the fly. This plot uses the default viridis, but we can pick a different one that perhaps better suits our data if we so choose.
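Creating a colormap on the fly can be sketched with matplotlib alone. The name elevation_ramp and the two colors below are arbitrary choices made for this example.

```python
from matplotlib.colors import LinearSegmentedColormap

# A hypothetical two-color ramp from a pale sand tone to dark brown.
elevation_cmap = LinearSegmentedColormap.from_list(
    "elevation_ramp", ["#f6e8c3", "#543005"]
)

# A colormap maps numbers in the interval [0, 1] to RGBA colors.
low = elevation_cmap(0.0)
high = elevation_cmap(1.0)
print(low)
print(high)
```

Built-in colormaps are usually selected by name instead, for example by passing cmap="terrain" to a plot in place of the default.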
Next, let's try adding still another visual variable to our plot: scale. We'll also specify two other new parameters: limits, which controls the maximum and minimum sizes of the scaled-out points; and legend_var, which specifies which visual variable (hue) will appear in the legend.
This new plot shows more clearly than the previous one the difference in height between cities in the Rocky Mountain states like Colorado, Utah, and Wyoming, and those elsewhere in the United States.
Ugly maps distract the reader from the story you want to tell. Once you've got the basic outline ready, it's handy to be able to tweak your plot a bit to "prettify" it.
geoplot comes equipped with a variety of visual parameters (many of them from matplotlib) that can be used to adjust the look and feel of the plot.
So far we've worked with pointplot and polyplot. geoplot has a variety of other plot types available as well. We'll take a brief look at just two of them; the full list is covered in detail in the Plot Reference.
A choropleth of population by state shows how much more populous certain coastal states are than their peers in the central United States. A choropleth is the standard-bearer in cartography for showing information about areas because it's easy to make and interpret.
A kdeplot smooths point data out into a heatmap. This makes it easy to spot regional trends in your input data. The clip parameter can be used to clip the resulting plot to the surrounding geometry, in this case the outline of New York City.