Abstract: Using high-resolution player tracking data made available by the National Football League (NFL) for their 2019 Big Data Bowl competition, we introduce the Expected Hypothetical Completion Probability (EHCP), an objective framework for evaluating plays. At the heart of EHCP is the question “on a given passing play, did the quarterback throw the pass to the receiver who was most likely to catch it?” To answer this question, we first built a Bayesian non-parametric catch probability model that automatically accounts for complex interactions between inputs like the receiver’s speed and distances to the ball and nearest defender. While building such a model is, in principle, straightforward, using it to reason about a hypothetical pass is challenging because many of the model inputs corresponding to a hypothetical are necessarily unobserved. To wit, it is impossible to observe how close an un-targeted receiver would be to his nearest defender had the pass been thrown to him instead of the receiver who was actually targeted. To overcome this fundamental difficulty, we propose imputing the unobservable inputs and averaging our model predictions across these imputations to derive EHCP. In this way, EHCP can track how the completion probability evolves for each receiver over the course of a play in a way that accounts for the uncertainty about missing inputs.
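The impute-then-average idea in the abstract can be illustrated with a small sketch. This is not the code from the paper: the function `catch_prob`, its logistic form and coefficients, and the normal distribution used to impute the defender separation are all hypothetical stand-ins chosen only to make the example run. The paper's catch probability model is Bayesian non-parametric, and its imputation scheme is described in the paper itself.

```python
import math
import random


def catch_prob(speed, dist_to_ball, separation):
    """Hypothetical stand-in for a fitted catch probability model.

    The logistic form and coefficients are made up for illustration;
    they are not estimates from the paper.
    """
    z = 0.5 + 0.4 * separation - 0.08 * dist_to_ball + 0.05 * speed
    return 1.0 / (1.0 + math.exp(-z))


def ehcp(speed, dist_to_ball, separation_draws):
    """Average the model's predictions over imputed values of the
    unobserved input (here, separation from the nearest defender)."""
    probs = [catch_prob(speed, dist_to_ball, s) for s in separation_draws]
    return sum(probs) / len(probs)


# Assumed imputation: draw plausible separations (in yards) for an
# un-targeted receiver, truncated at zero. The distribution is an assumption.
rng = random.Random(0)
separation_draws = [max(0.0, rng.gauss(2.0, 1.0)) for _ in range(1000)]

estimate = ehcp(speed=6.0, dist_to_ball=15.0, separation_draws=separation_draws)
print(round(estimate, 3))
```

Averaging over many imputed values, rather than plugging in a single guess, is what lets the estimate reflect uncertainty about the inputs that cannot be observed for a hypothetical pass.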
This paper contains an analysis that Kathy Evans and I did for the NFL’s 2019 Big Data Bowl. You can find out more about the contest here.
The journal version of the paper is available at DOI: 10.1515/jqas-2019-0050.
A pre-print version of the paper is available at arXiv:1910.12337.
The code that we used for this project is available at this GitHub repo.