Let's pull all of the SF crime data provided by SF OpenData.
Let's load it in and peek at the schema.
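A minimal loading sketch. The filename is a placeholder for whatever the SF OpenData export is called on your machine; the inline sample below just mimics the schema so the snippet runs on its own:

```python
import io
import pandas as pd

# In practice: df = pd.read_csv("sfpd_incidents.csv")  # filename is hypothetical
# Tiny inline stand-in with the same kind of schema, for illustration:
raw = io.StringIO(
    "Category,Descript,Date,PdDistrict\n"
    "VEHICLE THEFT,STOLEN AUTOMOBILE,01/01/2003,MISSION\n"
    "DRUG/NARCOTIC,POSSESSION OF BASE/ROCK COCAINE,06/15/2010,TENDERLOIN\n"
)
df = pd.read_csv(raw, parse_dates=["Date"])
print(df.dtypes)
print(df.head())
```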
The first recorded report is 2003-01-01, and the most recent is:
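The date range falls out of a min/max on the parsed date column; a toy stand-in for the column:

```python
import pandas as pd

# Toy date column; the real data runs from 2003-01-01 to the latest export.
dates = pd.to_datetime(["2003-01-01", "2010-06-15", "2015-05-13"])
print(dates.min().date(), dates.max().date())
```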
Let's get a more detailed view by examining Descript, the field giving the particular crime type.
There are 912 different crime types. Let's slice by percentile and peek at the top types of crime for each.
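The percentile slice is just a threshold on the value counts. A sketch on a toy Descript column (the cutoff percentile here is arbitrary):

```python
import pandas as pd

# Toy Descript column; the real one has ~912 distinct values.
descript = pd.Series(
    ["STOLEN AUTOMOBILE"] * 5
    + ["POSSESSION OF MARIJUANA"] * 3
    + ["PETTY THEFT"] * 1
)
counts = descript.value_counts()
# Keep only crime types at or above the 50th percentile of report counts.
top = counts[counts >= counts.quantile(0.5)]
print(top)
```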
Cluster the non-normalized data across the top-percentile reports and each district.
Normalize vertically across districts.
Normalize horizontally across crime types.
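The two normalizations are divisions of the district-by-crime-type count matrix along each axis; a sketch on toy numbers:

```python
import pandas as pd

# Toy district-by-crime-type count matrix.
mat = pd.DataFrame(
    [[10, 2], [4, 8]],
    index=["MISSION", "TENDERLOIN"],
    columns=["STOLEN AUTOMOBILE", "POSSESSION OF NARCOTICS"],
)
# Vertical: each crime type's column sums to 1 (its share per district).
vert = mat / mat.sum(axis=0)
# Horizontal: each district's row sums to 1 (its mix of crime types).
horiz = mat.div(mat.sum(axis=1), axis=0)
```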
(1) GTA is the most common crime in most districts.
(2) For the distribution of crime across areas:
Let's re-examine the crime types.
I'm interested in the drug-related crimes in particular.
We can reuse the machinery from above; we simply slice the input data on a category first.
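The slice is a boolean filter on the Category column before anything else runs; self-contained sketch on a toy frame:

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["DRUG/NARCOTIC", "VEHICLE THEFT", "DRUG/NARCOTIC"],
    "Descript": ["POSSESSION OF MARIJUANA", "STOLEN AUTOMOBILE", "SALE OF HEROIN"],
})
# Keep only drug reports, then feed `drugs` into the earlier pipeline.
drugs = df[df["Category"] == "DRUG/NARCOTIC"]
print(len(drugs))
```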
Nice. We could study these for a while.
But, here's the point:
I think we can simplify this by compressing the different drug types into groups.
Then we can examine both temporal and spatial profiles.
We'll use a 30-day window.
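One common way to apply such a window is a 30-day rolling mean over daily report counts; a sketch on synthetic counts:

```python
import numpy as np
import pandas as pd

# Toy daily report counts indexed by date.
idx = pd.date_range("2003-01-01", periods=120, freq="D")
daily = pd.Series(np.random.default_rng(0).poisson(5.0, len(idx)), index=idx)
# 30-day rolling mean smooths day-to-day noise.
smoothed = daily.rolling(window=30).mean()
```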
Let's group the drug categories to make this easier to examine.
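A hypothetical substring-based grouping of Descript into coarse drug classes (the substrings and group names are illustrative; the real Descript strings should drive the table):

```python
# Ordered (substring, group) pairs; order matters because
# "BASE/ROCK COCAINE" would otherwise match the generic cocaine rule.
GROUPS = [
    ("BASE/ROCK", "crack"),
    ("COCAINE", "cocaine"),
    ("MARIJUANA", "marijuana"),
    ("HEROIN", "heroin"),
    ("METH", "meth"),
]

def drug_group(descript: str) -> str:
    for needle, group in GROUPS:
        if needle in descript:
            return group
    return "other"

print(drug_group("POSSESSION OF BASE/ROCK COCAINE"))  # crack
```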
Let's add the real dates.
Let's iterate through each district.
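The per-district iteration is a plain groupby loop; toy example:

```python
import pandas as pd

df = pd.DataFrame({
    "PdDistrict": ["MISSION", "MISSION", "TENDERLOIN"],
    "Descript": ["A", "B", "C"],
})
# One pass per district; each `grp` is that district's slice of the data.
for district, grp in df.groupby("PdDistrict"):
    print(district, len(grp))
```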
We can also look at correlations between areas for different drugs.
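With a date-by-district pivot of counts for one drug, the area correlations are one `.corr()` call; sketched on random counts:

```python
import numpy as np
import pandas as pd

# Toy date-by-district matrix of weekly drug-report counts.
rng = np.random.default_rng(1)
idx = pd.date_range("2003-01-05", periods=52, freq="W")
pivot = pd.DataFrame(
    rng.poisson(10, size=(52, 3)),
    index=idx,
    columns=["MISSION", "TENDERLOIN", "RICHMOND"],
)
corr = pivot.corr()  # district-by-district Pearson correlation
```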
With this in mind, we can examine selected time-series data.
Let's redo what we did above, but rescale it.
We can now summarize this data using clustered heatmaps.
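seaborn's `clustermap` draws such a figure in one call; the underlying row reordering via hierarchical clustering can be sketched with scipy (toy matrix, linkage method is an arbitrary choice):

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage

# Toy district-by-drug matrix; rows with similar profiles end up adjacent.
mat = np.array([
    [10.0, 1.0, 2.0],
    [9.0, 2.0, 1.0],
    [1.0, 8.0, 9.0],
])
order = leaves_list(linkage(mat, method="average"))
reordered = mat[order]
```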
Let's isolate all crack-related records.
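Crack shows up in Descript as "BASE/ROCK COCAINE", so the isolation is a substring filter; self-contained sketch:

```python
import pandas as pd

df = pd.DataFrame({
    "Descript": [
        "POSSESSION OF BASE/ROCK COCAINE",
        "SALE OF BASE/ROCK COCAINE",
        "POSSESSION OF MARIJUANA",
    ]
})
crack = df[df["Descript"].str.contains("BASE/ROCK", regex=False)]
print(len(crack))
```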
Plot the crack regimes.
Compute the fold-difference in mean between the two regimes.
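The fold-difference is just the ratio of the two regime means; sketched with toy monthly counts and an arbitrary changepoint:

```python
import pandas as pd

# Toy monthly crack-report counts with a changepoint after month 4.
counts = pd.Series(
    [100, 110, 95, 105, 30, 25, 35, 28],
    index=pd.period_range("2008-01", periods=8, freq="M"),
)
early, late = counts.iloc[:4], counts.iloc[4:]
fold = early.mean() / late.mean()
print(round(fold, 2))  # 3.47
```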
We can look at this spatially.
Use a shapefile of SF neighborhoods to overlay the data onto a map.
Basemap can be used to view this; I drew from some nice work at this link:
We can use the