LEARNING BY TESTING: Understanding Successful Police Field Experiments


The movement to reform policing through the deployment of scientific approaches – “evidence-based policing” (EBP) – has grown in importance in the last few years (Weisburd and Neyroud, 2011). For some scholars, a central feature of EBP is the importance attached to randomised controlled trials (RCTs). RCTs are controversial with others, who argue that policing is not comparable to medicine and that RCTs are unable to reflect the complexity of the police role and context (Cockbain and Knutsson, 2014). Even those who advocate the use of RCTs recognise that there are significant challenges in achieving the high dosage and high fidelity that a successful experiment requires (Dennis, 1988; Berk, 2005; Goldkamp, 2008).

Yet despite the growth of systematic reviews and the rising interest in RCTs, we have never had a comprehensive list of RCTs involving the police, or an analysis of how successful they have been as experiments.

Having led the National Policing Improvement Agency and been a major sponsor of systematic reviews of the evidence on policing, I have directed my research since leaving the police at filling these gaps. After a search for completed police RCTs, the research has analysed the levels of treatment integrity with a view to building a better understanding of the challenges of conducting and managing successful police field trials.

Searching for Police RCTs

The search started from previous reviews of RCTs in policing and then covered Campbell Systematic Reviews, NCJRS abstracts, grey literature databases and manual searching of key journals. However, even though I am the co-chair of the Campbell Collaboration on Crime and Justice, I recognise that searching sometimes needs to be intuitive and go “off piste”. One RCT – an early test of two different training methods – was found whilst idly combing through discarded library books on sale from a Midwestern US university.

The Progress of Police RCTs 1970-2015

The search revealed that there have been more than 107 RCTs involving the police since 1970. As Figure 1 shows, there has been a small but steady use of RCTs since 1970, with significant growth since 2010. From conference papers, Internet postings and personal contact, the research has also identified more than 25 RCTs “in flight”, suggesting that the growth is continuing and may be accelerating. What is new is the rapid growth of RCTs in which one or more of the principal investigators is a serving police officer and the experiment is part of a partnership between a university and a police force. The most obvious illustration of this is the eight RCTs due to report shortly on the effectiveness of Body Worn Video. Given the operational and political sensitivity around this equipment across the world, proper field-testing replicated across different jurisdictions is a high priority.


Analysing Treatment Integrity

However, it is one thing to experiment, quite another to implement the experiment effectively. Analysis of police innovations has tended to find significant problems with implementation. The level of treatment integrity – the extent to which the treatment intended was assigned and delivered – is a critical measure of the success of an experiment.

In order to arrive at an estimated measure of overall treatment integrity, all 107 RCTs were analysed by looking for data and commentary in the published reports relating to:

    • The percentage of cases (both treatment and control) that were reported as “treated as assigned”.
    • The level of compliance with the treatment upon which the RCT was designed.

There was a wide range of results. Data from 16 RCTs suggest a level of treatment integrity below 60%. This is significant because Durlak and DuPre (2008), in a systematic review of the effect of implementation on program outcomes, suggested that positive results could be obtained from implementation levels above 60% and that stronger results were associated with higher levels of implementation. Emerging results from this research suggest that:


    • There were 26 RCT reports where the data needed for this analysis were either absent from the report or insufficient. This was more common in the earlier RCTs. More recent RCTs have tended to show a CONSORT flow diagram and report levels of treatment compliance.
    • Around 20% of reported Police RCTs fell below the 60% threshold
    • Nearly half of the 107 (53) exceeded 80% treatment integrity
    • The low treatment integrity was not strongly associated with one design or treatment intervention type
    • Attention to tracking, control of randomisation and the strength of the research-practitioner partnership appeared to be important factors
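
As an illustration only, the banding used in the list above could be sketched in Python as follows. The TrialReport record, its field names and the example figures are hypothetical and are not drawn from any of the 107 RCTs. The sketch covers only the “treated as assigned” percentage, not the separate judgement about compliance with the treatment design, and it assumes that “exceeded 80%” means strictly greater than 80.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Thresholds taken from the text: 60% (Durlak and DuPre, 2008) and 80%.
LOW_THRESHOLD = 60.0
HIGH_THRESHOLD = 80.0


@dataclass
class TrialReport:
    """Hypothetical record of what a published RCT report discloses."""
    label: str
    cases_randomised: Optional[int]           # treatment + control; None if not reported
    cases_treated_as_assigned: Optional[int]  # None if not reported


def integrity_pct(report: TrialReport) -> Optional[float]:
    """Percentage of randomised cases treated as assigned, or None if unknown."""
    if not report.cases_randomised or report.cases_treated_as_assigned is None:
        return None
    return 100.0 * report.cases_treated_as_assigned / report.cases_randomised


def integrity_band(pct: Optional[float]) -> str:
    """Place a percentage into one of the bands discussed in the text."""
    if pct is None:
        return "not reported"
    if pct < LOW_THRESHOLD:
        return "below 60%"
    if pct > HIGH_THRESHOLD:
        return "above 80%"
    return "60% to 80%"


def summarise(reports: List[TrialReport]) -> Dict[str, int]:
    """Count how many reports fall into each band."""
    counts = {"not reported": 0, "below 60%": 0, "60% to 80%": 0, "above 80%": 0}
    for report in reports:
        counts[integrity_band(integrity_pct(report))] += 1
    return counts


# Invented example data, for illustration only.
reports = [
    TrialReport("Trial A", 200, 190),
    TrialReport("Trial B", 150, 81),
    TrialReport("Trial C", 120, 90),
    TrialReport("Trial D", None, None),
]
summary = summarise(reports)
print(summary)
```

As a rough cross-check on the figures above, and assuming the 20% figure is calculated over the 81 reports with usable data (107 minus 26), 16 of 81 is about 20%, while 53 of 107 is just under half.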

Implications and next steps

The increasing use of RCT designs to test police practices provides a strong argument for learning from the experience of implementation. The lessons almost certainly extend to other field designs such as quasi-experiments. The work to develop a Global Police Database of police research since the 1950s has identified more than 7000 studies with a control in the design.

The second stage of this study has used a more detailed analysis of an individual police RCT. Operation Turning Point, which ran from 2011 to 2014, tested police prosecution against a treatment centred on a deferred prosecution with conditions. Participants in the RCT have been interviewed using a protocol developed from analysis of the completed RCTs. The aim is to develop and set out a working theory on the conduct of successful experiments (Turning Point achieved a level of treatment integrity of over 90%).


The research has been designed to draw out lessons from nearly 50 years of experimentation in policing. RCTs – and other research designs – are often reported with a strong emphasis on the levels of statistical significance in the outcomes, without enough attention to the challenges encountered in implementing the treatment. More than 40 years ago, Ron Clarke and Derek Cornish (1972) wrote a short research note about the experience of conducting experiments in criminal justice institutions. They highlighted operational, ethical and methodological challenges that have been much quoted by subsequent scholars. The data from this research suggest that many of the risks identified by Clarke and Cornish can be avoided in police experiments by applying the lessons learnt from half a century of testing.

Peter Neyroud (CBE QPM) is an Affiliated Lecturer and Resident Scholar at the Jerry Lee Centre for Experimental Criminology, Institute of Criminology, University of Cambridge, and Co-Chair of the Campbell Collaboration (Crime and Justice). Email: pwn22@cam.ac.uk Twitter: @pwneyroud
