Building Data-Driven Early Intervention Systems for Police Officers
Adverse events between police and the public, such as deadly shootings or instances of racial profiling, can cause serious or deadly harm, damage police legitimacy, and result in costly litigation. Evidence suggests these events can be prevented by targeting interventions based on an Early Intervention System (EIS) that flags police officers who are at a high risk for involvement in such adverse events. But today’s EISs are rarely accurate, let alone early, which results in departments wasting limited resources intervening on the wrong officers while failing to prevent adverse incidents.
There are two types of Early Intervention Systems in use:
- Threshold systems flag officers as high risk if the officers have a minimum count of certain events in a given period of time, such as 3 complaints in 90 days.
- Outlier systems flag officers as high risk if the officers have an unusual number of certain events compared to other officers.
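As a concrete illustration, the two flagging rules above can be sketched in a few lines. This is a minimal sketch; the function names and the two-standard-deviation cutoff are our own illustrative choices, not any department's actual rule:

```python
import statistics

def threshold_flag(event_count, threshold=3):
    """Threshold rule: flag an officer whose event count in the window
    (e.g. complaints in the last 90 days) meets a fixed minimum."""
    return event_count >= threshold

def outlier_flag(event_count, peer_counts, num_sds=2.0):
    """Outlier rule: flag an officer whose event count is unusually high
    compared to peers in similar assignments (here, more than num_sds
    standard deviations above the peer group's mean)."""
    mean = statistics.mean(peer_counts)
    sd = statistics.stdev(peer_counts)
    return event_count > mean + num_sds * sd
```

Both rules reduce an officer's history to a single count and a hand-picked cutoff, which is exactly the loss of context described below.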
Unfortunately, both of these systems have several problems:
- Threshold and outlier systems are less accurate than adaptive, data-driven systems. Because they are so simple, threshold systems eliminate important context. For example, CMPD’s system uses the same thresholds for all officers, whether they work the midnight shift in a high-crime area or the business district at noon. Outlier systems are better because they compare officers in similar assignments, but they also fail to account for important factors, such as the level of risk the group faces (e.g. the lowest-risk officer in a high-risk unit may be at higher risk than the highest-risk officer in a low-risk unit). Moreover, most thresholds and outliers are chosen through expert intuition rather than predictive accuracy.
- Threshold and outlier systems are more difficult to customize and maintain. Threshold and outlier systems give binary, yes/no flags, and intuition-based systems require experts to continually monitor and modify the thresholds and outliers they use. At least one vendor hard-codes thresholds into their systems, making changes difficult and costly — which is good for the vendor but bad for the department.
- Threshold and outlier systems are easily gamed. Multiple departments have raised this concern. Because the thresholds and outliers are so easy to see and understand, officers can modify their bad behaviors slightly to avoid detection.
We use machine learning to build a better EIS. Machine learning is the ability of a computer to learn patterns in data and to use those patterns to make accurate predictions. It has been used over the past 30 years to solve thousands of problems, from driving cars to providing search results on Google. Machine learning can handle complex data from disparate sources, including dispatches, arrests, field interviews, training, demographics, crime, neighborhood features, and police and media narratives. This enables not only more accurate predictions but also a better understanding of what puts officers at risk.
Our system offers more flexibility. It provides continuous risk scores rather than binary flags. Risk scores enable the department to rank all officers by risk, to explicitly choose its tradeoff between more correct flags and more incorrect flags, and to allocate interventions according to available resources rather than a fixed flag count (for example, using the ranking to decide which officers should receive a department-wide training first, rather than only deciding which officers should receive training at all).
Unlike threshold and outlier systems, our machine learning system lets the department choose how far ahead the system should predict, down to the next dispatch. This gives the department the ability to decide, for example, whether to send a lower-risk officer to a call even if that officer is farther away.
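To make the risk-score workflow concrete, here is a minimal sketch of ranking and resource-based flagging. The officer IDs, scores, and function names are made up for illustration; the real system's interface differs:

```python
def rank_by_risk(scores):
    """Rank officer IDs from highest to lowest model risk score.
    `scores` maps officer ID -> risk score in [0, 1]."""
    return sorted(scores, key=scores.get, reverse=True)

def flag_by_budget(scores, budget):
    """Flag the highest-risk officers up to the number of interventions
    the department can afford, instead of using a fixed yes/no rule."""
    return rank_by_risk(scores)[:budget]
```

With binary flags, changing the tradeoff between correct and incorrect flags means re-tuning thresholds; with continuous scores, the department simply moves the budget up or down the ranking.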
We have demonstrated our system’s performance in two departments. The Charlotte-Mecklenburg Police Department (CMPD) and Metropolitan Nashville Police Department (MNPD) gave us data on officer attributes (e.g. demographics, join date), officer activities (e.g. arrests, dispatches, training), and internal affairs investigations (the case and outcome). We supplemented their data with publicly available sources, such as American Community Survey data, weather data, shapefiles, and quality-of-life surveys. We then simulated history by showing how our system would have done if each department had used it.
We used CMPD’s EIS thresholds as the baseline for both departments (MNPD doesn’t have an EIS). Our system correctly flags 10-20% more officers who go on to have adverse incidents while reducing incorrect flags by 50% or more.
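The "simulated history" evaluation described above amounts to rolling temporal validation: for each test year, train only on data the department would actually have had at the time. A minimal sketch, with an illustrative three-year training window:

```python
def temporal_splits(years, window=3):
    """Yield (train_years, test_year) pairs that mimic deploying the
    system historically: each model sees only data from before its
    test year, never from the future it is evaluated on."""
    years = sorted(set(years))
    for i in range(window, len(years)):
        yield years[i - window:i], years[i]
```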
The work for each department focused on slightly different tasks. With CMPD, we predicted all major adverse incidents, ignoring more minor violations such as uniform issues. With MNPD, we predicted sustained complaints and disciplinary actions.
We have engineered our system for scalability. Once we get the department’s data into our format, our code can build models. This makes it faster, easier, and cheaper to implement.
What We’re Doing
- Implementing the system at our partner departments. We are helping CMPD and MNPD integrate our system on their computers.
- Implementing the system at new departments. The Knoxville Police Department (KPD) and Los Angeles Sheriff’s Department (LASD) have agreed to share data. We are integrating KPD’s data while LASD finishes gathering their data.
- Improving the system. Our system continues to improve in ways that would be difficult for a threshold or outlier system, such as the following:
  - Supervisor feedback: Supervisors have information about officers that our system does not, so our web interface will let them give on-the-spot feedback about the quality of predictions, and the system will learn from that feedback.
  - Department-wide risk changes: Officer risk can increase and decrease together, such as after the Dallas police shootings. We are incorporating news stories into the system to account for these shifts.
  - Group predictions: Groups of officers may be higher or lower risk, and intervening with a group may be more successful than intervening with individual officers. Our system will provide department leaders with group-level predictions as well.
Are You Our Next Partner?
We’d like to partner with more departments that have the data, staff, resources, and willingness necessary to adapt and implement the model. To do this project, you will need at least three years of individual-level data:
- Officer ranks and assignments
- Internal affairs investigations and outcomes
- Data on whatever you want to predict (e.g. if you want to predict officer injuries, you need to share officer-injury data)
- Officer department violations
- Traffic stops
- Pedestrian stops
- Firearm use
- Response to resistance/use of force
- Citations written by the officer
- Field interviews
- Raids and searches
- Knock and talk/stop and frisk activities
- District / beat boundaries
- Department policies and procedures
You will get more from the model if you also provide the following:
- EIS flags
- Officer education
- Race / ethnicity
- Marital status
- Age or DOB
- Secondary employment
- Officer criminal history
- Performance evaluations
- Psychological evaluations
- Veteran status
- Driving record
- Courses taken
- Each officer’s training officer
- Sick time, vacation time, overtime
- Claims and lawsuits
- Suspect info (demographics, possession of drugs, mental disorder, etc.)
- Calls for service and clearance
- Gang territory shapefiles
We have put this information and more into a spreadsheet here. We wrote a short note on computational requirements here. We posted directions on how to dump databases here. You can find copies of our standard contracts here.
You can hash officer identities (badge numbers, employee numbers, etc.) to provide another layer of protection while allowing us to match officer records across data sources.
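For example, a keyed hash keeps the digest consistent across data sources, so records for the same officer still join, while remaining irreversible without a secret the department keeps. This is one common approach, sketched below, not a required scheme:

```python
import hashlib
import hmac

def hash_officer_id(raw_id, secret_key):
    """Pseudonymize an officer identifier (badge number, employee
    number, etc.). The same input always yields the same digest, so
    records match across data sources, but the digest cannot be
    reversed without secret_key, which never leaves the department."""
    normalized = str(raw_id).strip().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
```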
Ready to Contact Us?
If you think you fit, please let us know.