Tuesday, September 17, 2013

Predictive Modeling of S.F. Crime: Trends

So far, my model generates crime predictions by applying statistics to historical data. Calling it predictive is a stretch because it assumes a stationary, homogeneous environment for crime. For example, the daily forecasts do not account for day-of-the-week differences in volume or for long-term crime trends. I explored both a little in the data today and produced some charts to that end.

Long-Term Trends:

I compiled monthly crime volume into a list and plotted it below with a best-fit trendline. The trendline shows that monthly volume has fallen by roughly 23 crimes per month over the last 11 years. Month-to-month variance is significant, but there is definitely a long-term decreasing trend, which could be included in a model forecasting many years out.
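For reference, a best-fit line like this can be computed with ordinary least squares. The sketch below is a minimal version and not necessarily how the chart was produced. The function name `fit_trendline` and the synthetic series are made up for illustration; the starting volume of 13000 is not taken from the real data.

```python
def fit_trendline(monthly_counts):
    """Fit an ordinary least-squares line to (month_index, count) points.

    Returns (slope, intercept), where slope is the change in monthly
    volume per month.
    """
    n = len(monthly_counts)
    if n < 2:
        raise ValueError("need at least two months to fit a line")
    mean_x = (n - 1) / 2  # mean of the indices 0..n-1
    mean_y = sum(monthly_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(monthly_counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, noise-free series: 11 years of months, dropping 23 per month.
# The starting value 13000 is a placeholder, not a real figure.
synthetic = [13000 - 23 * month for month in range(11 * 12)]
slope, intercept = fit_trendline(synthetic)
print(f"slope: {slope:.1f} crimes per month, intercept: {intercept:.0f}")
```

On real data the points scatter widely around the line, so the slope is only a summary of the long-run direction.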
















Day of Week Trends:

Below is a barplot of crime volume by day of the week for each year from 2003 to 2013. Within each year the days appear in alphabetical order, so the first seven bars are the number of crimes in 2003 on Friday, Monday, Saturday, Sunday, Thursday, Tuesday and Wednesday.
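As a rough sketch of the grouping behind a chart like this, the snippet below counts incidents by (year, day name). It assumes each incident has already been reduced to a `datetime.date`. The function name and the four sample dates are made up for illustration and are not from the actual dataset.

```python
from collections import Counter
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def counts_by_year_and_day(incident_dates):
    """Count incidents per (year, day name) pair."""
    return Counter((d.year, DAY_NAMES[d.weekday()]) for d in incident_dates)


# Made-up sample dates standing in for real incident records.
sample = [date(2003, 1, 3), date(2003, 1, 10), date(2003, 1, 5), date(2013, 9, 17)]
counts = counts_by_year_and_day(sample)

# Sorting the keys orders by year, then alphabetically by day name,
# which gives the Friday, Monday, Saturday, ... bar order.
for year, day in sorted(counts):
    print(year, day, counts[(year, day)])
```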






















A more revealing view is to take the mean number of crimes across the days of the week in each year and calculate each day's distance from that mean. Let's say Friday has 1000 crimes, Monday through Thursday have 500 crimes each, and Saturday and Sunday have 700 crimes each. The total is 4400 crimes in the week, so the mean is about 628 per day. Put another way, 628 crimes occurring each day of the week would reach that [cough, rounding] same number (7*628 ~ 4400). The barplot below shows each day's distance from the weekly mean for each year. We see that Friday and Wednesday frequently lie above the mean, while Sunday is uniformly below it.
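The arithmetic in that example can be checked with a few lines of Python. This is only a sketch using the hypothetical numbers from the paragraph above, not real counts, and the function name is made up.

```python
def deviations_from_mean(counts_by_day):
    """Return the daily mean and each day's distance from it."""
    mean = sum(counts_by_day.values()) / len(counts_by_day)
    deviations = {day: count - mean for day, count in counts_by_day.items()}
    return mean, deviations


# Hypothetical weekly counts from the worked example, not real data.
example = {
    "Friday": 1000,
    "Monday": 500,
    "Saturday": 700,
    "Sunday": 700,
    "Thursday": 500,
    "Tuesday": 500,
    "Wednesday": 500,
}
mean, deviations = deviations_from_mean(example)
print(f"mean: {mean:.1f}")
for day, dev in deviations.items():
    print(f"{day}: {dev:+.1f}")
```

The deviations sum to zero by construction, so a day above the mean is always balanced by days below it.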














Takeaways:
1. A long-term decreasing trend would be a useful addition to multi-year forecasts. The trend is not strictly decreasing, and individual months deviate significantly from it.
2. Daily differences matter. Friday tends to have more crime than average and Sunday less. Using this to differentiate a Friday forecast from a Sunday forecast would fix the current problem that the daily forecasts are ambiguous and interchangeable.
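One possible way to act on the second takeaway is to scale a flat daily forecast by a per-day factor, where the factor is that day's volume divided by the mean daily volume. This is a sketch under assumptions, not something the current model does: the multiplicative adjustment is just one option, the function names are made up, the weekly counts are the hypothetical ones from the worked example, and the flat forecast of 100 crimes per day is a placeholder.

```python
def day_of_week_factors(counts_by_day):
    """Ratio of each day's volume to the mean daily volume."""
    mean = sum(counts_by_day.values()) / len(counts_by_day)
    return {day: count / mean for day, count in counts_by_day.items()}


def adjust_forecast(flat_daily_forecast, factors):
    """Scale one undifferentiated daily forecast into per-day forecasts."""
    return {day: flat_daily_forecast * factor for day, factor in factors.items()}


# Hypothetical weekly counts from the worked example, not real data.
example_counts = {
    "Friday": 1000,
    "Monday": 500,
    "Saturday": 700,
    "Sunday": 700,
    "Thursday": 500,
    "Tuesday": 500,
    "Wednesday": 500,
}
factors = day_of_week_factors(example_counts)
adjusted = adjust_forecast(100.0, factors)  # 100.0 is a placeholder forecast
for day, value in adjusted.items():
    print(f"{day}: {value:.1f}")
```

Because the factors average to 1, the weekly total of the adjusted forecasts matches the weekly total of the flat forecast; only the split across days changes.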


