Syllabus

Description

Illustrates the use of statistical modeling and data science techniques to derive actionable insights from sports data. Emphasizes not only the technical calculation of advanced metrics but also written and oral communication to other data scientists and to non-technical audiences. Topics may include: deriving team rankings from paired competitions; measuring an individual player’s contribution to their team’s overall success; assessing player performance and team strategy in terms of expected outcomes; forecasting the impact of new rule changes using simulation; and creating new metrics using high-resolution player tracking data.

Learning Outcomes

Throughout the course you will

  1. Implement appropriate statistical methods to assess player and team performance
  2. Work with play-by-play and high-resolution tracking data
  3. Provide constructive and actionable feedback on your peers’ analytic reports
  4. Build a personal portfolio of sports data analyses

Requisites

This course will make extensive use of the R programming language through the RStudio integrated development environment (IDE). Because the formal pre-requisites for this course are STAT 333 or 340, you are expected to have previous experience using the R programming language.

Warning

If you do not meet the formal course prerequisites and/or have not used R in a previous course, this is not the right course for you. Exceptions will not be made to the formal course requisites.

I will assume fluency with basic R functionality (e.g., assignment, writing and executing scripts, saving data objects, setting environments, installing and loading packages), data manipulation with dplyr and other tidyverse packages, and visualization using either base R graphics or ggplot2. I will additionally assume some familiarity with fitting statistical models in R and interpreting their output (e.g., using lm and glm). Here is an example of the type of R coding with which I will assume you are familiar. If you need a refresher on some of the data wrangling in that document, I strongly recommend reviewing your old course notes as well as

Course Staff & Office Hours

Instructor: Sameer Deshpande (sameer.deshpande@wisc.edu). Instructor office hours will take place on:

  • Mondays from 11:00am to 12:00pm in Morgridge Hall 5586 (beginning 9/8/25).
  • Wednesdays from 3:00pm to 4:00pm in Morgridge Hall 5586 (beginning 9/3/25).
  • Fridays from 3:00pm to 4:30pm in Morgridge Hall 5618 (beginning 9/5/25).

Office hours on Mondays and Wednesdays are designed for one-on-one meetings with the instructor. While the main purpose of Monday and Wednesday office hours is to discuss your specific experience and learning in the course, feel free to stop by to chat about your educational and career goals and/or any personal sports analytic projects you might be pursuing.

Friday office hours are intended for collaborative and small group work. During Friday office hours, small groups can step through analyses from lecture, work on the exercises, and work on their team projects with the instructor and other students. Friday office hours are also a great way to brainstorm project ideas with, and get feedback from, other teams.

If you have specific questions about the course content (e.g., parsing a particular bit of syntax or understanding a step in an analysis), I encourage you to ask the question on Piazza and to attend Friday office hours. If you cannot make the Monday or Wednesday office hours, please email me and suggest some times at which you are free.

Teaching Assistant: Zhexuan Liu (zhexuan.liu2@wisc.edu). TA office hours will be held from 9:15am to 10:45am on Tuesdays and Thursdays in Morgridge Hall 2515. TA office hours will begin on 9/16/25.

Assignments & Grading

Your final grade will be based on your performance on three group projects and your overall participation. For each project, groups can earn up to 100 points for their written report and up to 100 points for their recorded presentation. After each project, every student will peer review three reports and three presentations and complete a team accountability survey. You can earn a total of 10 points for each peer review by (i) completing the provided rubric and (ii) leaving a constructive comment that identifies strengths and areas for potential improvement. Students will earn up to 40 points per survey, based on completing the survey and on the feedback from their teammates and their own self-assessment.

We will use Canvas to handle report and presentation submissions, peer reviews, and team accountability surveys.

Group Projects

There will be three group projects. Each project consists of two parts, a written report and a group presentation. The project due dates are October 10, November 7, and December 5.

Written Report

The written report consists of a non-technical executive summary and a technical report. The executive summary, which should not exceed 500 words, should describe the overall goals, analytic approach, and main conclusions in non-technical language. The executive summary should be free from jargon, code listings, figures, tables, and charts. It should be written to be read and understood by a front office executive, coach, player, or fan with little data science experience. The rest of the written report should:

  • Clearly state the problem being studied and provide sufficient background to motivate why the problem is important and interesting.
  • Describe the data and the major steps of the analysis.
  • Present the main results within the context of the relevant sport(s) and support the results with figures, tables, charts, and other statistical software output as appropriate.
  • Discuss the limitations of the analysis and outline concrete steps for further development.

The technical section of the report should contain enough detail and code that another data scientist could replicate your analysis and verify its soundness. Code listings and output (e.g., figures, tables, charts, and numerical summaries) should be tightly integrated with the written exposition. The use of Quarto or RMarkdown is highly recommended for preparing the written report.

Presentation

Each team will also record an 8–10 minute presentation (e.g., using Zoom) that provides an overview of their analysis. Each presentation should include the following elements:

  • Background (2–4 slides): clearly motivate and state the main problem being studied. Explain why it is interesting and important. Present just enough background to motivate the problem, while taking care not to overwhelm the audience with extraneous details. If appropriate, comment on the limitations of existing solutions to the problem or closely related problems.
  • Analysis overview (2–4 slides): present only the main steps of your analysis. Be sure to explain why each step was necessary and how these steps contribute to the overall solution. Focus more on the high-level ideas and motivation for each step rather than the specific implementation or software syntax.
  • Main results (2–3 slides): distill your results into a few key points. Use figures, tables, charts, and other statistical software output to support your findings.
  • Conclusion (1 slide): briefly summarize your analysis and findings and outline between 1 and 3 specific directions for future development, improvement or refinement.

Peer Reviews

After every group project, every student is expected to peer-review the written reports and presentations of three other teams. The primary purpose of this peer review is to practice providing constructive feedback on both technical and non-technical writing to your fellow data scientists. Peer reviews will be completed on Canvas, and a structured evaluation rubric will be provided for each assignment. The peer review process will be single-blinded, so teams will not know the identities of their reviewers. Students can earn a total of 20 points for each peer review by (i) completing the provided rubric and (ii) leaving a constructive comment that identifies strengths and areas for potential improvement. Peer reviews are due one week after the project assignment due dates; that is, they are due on October 17, November 14, and December 12. A penalty will be assessed for submitting late reviews.

Team Accountability Survey

After each project submission, you will fill out a team accountability survey. You can earn up to 40 points per project submission, based on completing the survey and on the feedback from your teammates and your own self-assessment. In total, you can earn up to 120 points based on your level of participation in your project groups.

Participation

An additional 100 points will be awarded based on participation during class, office hours, and online discussions on Piazza.

Final Grades

Your final grade will be based on how many of the 1000 total points you earn during the semester. Over 925 points is at least an A, over 875 points is at least an AB, over 800 points is at least a B, over 700 points is at least a BC, over 600 points is at least a C, and over 500 points is at least a D. Final grade boundaries will be announced no later than December 5, 2025.
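
As a rough check on the arithmetic, the point structure described in this syllabus can be tallied in a few lines. This is only an illustrative sketch (written in Python, although the course itself uses R), not the official gradebook. All variable and function names are hypothetical. It assumes 60 peer-review points per project (three teams at 20 points each) and 40 survey points per project, which are the values that make the components sum to the stated 1000 points. It also returns no grade at or below 500 points, since no guarantee is stated for that range.

```python
# Point structure as described in this syllabus (illustrative sketch only).
N_PROJECTS = 3
POINTS_PER_PROJECT = {
    "written_report": 100,
    "presentation": 100,
    "peer_reviews": 3 * 20,        # assumption: three teams reviewed, 20 points each
    "accountability_survey": 40,   # assumption: consistent with the 120-point total
}
PARTICIPATION_POINTS = 100

# Cutoffs from the Final Grades paragraph: earning strictly more than the
# cutoff guarantees at least the listed letter grade.
GRADE_FLOORS = [
    (925, "A"), (875, "AB"), (800, "B"),
    (700, "BC"), (600, "C"), (500, "D"),
]


def total_available_points():
    """Sum the per-project components over all projects, plus participation."""
    return N_PROJECTS * sum(POINTS_PER_PROJECT.values()) + PARTICIPATION_POINTS


def guaranteed_minimum_grade(points):
    """Return the minimum guaranteed letter grade, or None if no floor applies."""
    for cutoff, letter in GRADE_FLOORS:
        if points > cutoff:
            return letter
    return None  # no guarantee is stated at or below 500 points


print(total_available_points())
print(guaranteed_minimum_grade(880))
```

Because the cutoffs are lower bounds ("at least"), the final grade boundaries announced later in the semester may be more generous than what this sketch returns, but not stricter.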

Canvas & Piazza

We will use Canvas for course announcements, assignment submission, and grading. The Canvas course page is accessible via this link. We will also use Piazza for specific discussions about course content (e.g., sharing additional information/resources related to material discussed in lecture; answering questions about data, methods, and code; etc.) and more general discussions about sports analytics (e.g., sharing job postings, open data competition opportunities, popular press articles, and other analyses you might find online, etc.). The Piazza page is accessible via this link.

Access to Canvas and Piazza

The Canvas and Piazza sites associated with this course are limited to students enrolled in the course and the course teaching staff. Although they can freely access the course notes and code, auditors, UW–Madison students not currently enrolled in the course, and anyone not affiliated with UW–Madison will not have access to the Canvas and Piazza sites. Requests to access these sites will be ignored.

Resources

The course will make extensive use of R. The following are excellent references and I highly encourage you to read and consult them as needed:

You may also find the following websites, blogs, and podcasts useful as sources of inspiration as you develop analyses of your own.

  • The online textbook Analyzing Baseball Data with R and the associated blog, which contains lots of helpful resources for analyzing everything from tabular box score data to high-resolution ball tracking data.
  • Ron Yurko’s Substack “Statistical Thinking in Sports Analytics” (https://statthinksportsanalytics.substack.com).
  • The : the premier venue for academic sports analytics articles. You should have access to all articles through the UW Libraries. Each issue also features a publicly available “Editor’s Choice” article.
  • The Open Source Sports podcast (https://open.spotify.com/show/3vTtH2JJXbjrzOtEfjrqc4): each episode of this podcast focuses on a single academic research paper and features its authors as guests, with discussions about the statistical methodology, relevance, and future directions of the research.
  • Hockey Graphs: a blog that’s developed really innovative public-facing hockey analytics.
  • The Wharton Moneyball Post Game Podcast (https://knowledge.wharton.upenn.edu/shows/moneyball-highlights/): a podcast version of the popular Sirius XM radio show hosted by several Wharton professors and featuring interviews with sports analytics thought leaders.

Academic Integrity

I take academic integrity extremely seriously. While I encourage you to collaborate with your classmates on the problem sets, you are expected to write up your own solutions. You must acknowledge any and all sources that you consulted, whether it be a reference book, online resource, or another person (in the class or otherwise). You may not post, share, or upload any course material, including lecture slides, code, project reports, and presentation recordings, to websites like StackOverflow, Quora, Chegg, and Reddit or to ChatGPT or a similar generative AI service.

Policy on the Use of Generative Artificial Intelligence Models

You have the right to the full benefits of my expertise and engagement in this course. So, I will never use AI to

  • Provide feedback on assignments
  • Prepare any course content (e.g., slides, code, assignments, etc.)
  • Mediate or assist communications with you

In other words, everything you see in this course was created by me without the aid of generative AI.

While the Statistics Department recognizes the potential benefits of AI, its use in academic work can be problematic. Insofar as this course is really about process and not final results, I believe strongly that the use of generative AI is at odds with the learning goals. So, three rules regarding the use of ChatGPT and other generative AI models will be enforced:

  • Passing off AI-generated responses as original student work constitutes plagiarism and is strictly prohibited. Any students found to be engaging in this practice will be cited for academic misconduct.
  • Unless explicitly authorized by the instructor to do so, any use of AI-generated responses as sources of information, even with documentation and attribution, is prohibited.
  • You may not, under any circumstances, upload any course material to a generative AI model or agent. This includes lecture slides and notes and reports or videos uploaded by other student groups.

Peer Review

Respect your classmates enough to review their projects yourself. Uploading someone else’s project report or presentation to a generative AI tool (e.g., for creating summaries) is forbidden and will result in a failing grade.

Additional Information

How credit hours are met

This class meets for two 75-minute class periods each week over the semester and carries the expectation that students will work on course learning activities (reading, writing, problem sets, studying, etc.) for about three hours out of the classroom for every class period.

Regular and substantive student-instructor interaction.

Substantive interaction occurs via two channels: (i) regular class meetings in which students are encouraged to engage in discussion and (ii) weekly office hours.

Ethics

The members of the faculty of the Department of Statistics at UW–Madison uphold the highest ethical standards of teaching, data, and research. We expect our students to uphold the same standards of ethical conduct. The American Statistical Association’s standards for ethical conduct in data analysis and data privacy are available at and include:

  • Use methodology and data that are relevant and appropriate; without favoritism or prejudice; and in a manner intended to produce valid, interpretable, and reproducible results.
  • Be candid about any known or suspected limitations, defects, or biases in the data that may affect the integrity or the reliability of the analysis. Obviously, never modify or falsify data.
  • Protect the privacy and confidentiality of research subjects and data concerning them, whether obtained from the subjects directly, other persons, or existing records.

Accommodations for students with disabilities.

The University of Wisconsin–Madison supports the right of all enrolled students to a full and equal educational opportunity. The Americans with Disabilities Act (ADA), Wisconsin State Statute 36.12, and UW–Madison policy UW-855 require the university to provide reasonable accommodations to students with disabilities to access and participate in its academic programs and educational services. Faculty and students share responsibility in the accommodation process. Students are expected to inform faculty of their need for instructional accommodations during the beginning of the semester, or as soon as possible after being approved for accommodations. Faculty will work either directly with the student or in coordination with the McBurney Resource Center to provide reasonable instructional and course-related accommodations. Disability information, including instructional accommodations as part of a student’s educational record, is confidential and protected under FERPA.

Diversity & inclusion

Diversity is a source of strength, creativity, and innovation for UW–Madison. We value the contributions of each person and respect the profound ways their identity, culture, background, experience, status, abilities, and opinion enrich the university community. We commit ourselves to the pursuit of excellence in teaching, research, outreach, and diversity as inextricably linked goals. The University of Wisconsin–Madison fulfills its public mission by creating a welcoming and inclusive community for people from every background – people who as students, faculty, and staff serve Wisconsin and the world.

Privacy of student records & the use of audio recorded lectures.

Lecture materials and recordings for this course are protected intellectual property at UW–Madison. Students in courses may use the materials and recordings for their personal use related to participation in class. Students may also take notes solely for their personal use. If a lecture is not already recorded, students are not authorized to record lectures without permission unless they are considered by the university to be a qualified student with a disability who has an approved accommodation that includes recording (Regents Policy Document 4-1).

Students may not copy or have lecture materials and recordings outside of class, including posting on internet sites or selling to commercial entities, with the exception of sharing copies of personal notes as a notetaker through the McBurney Disability Resource Center. Students are otherwise prohibited from providing or selling their personal notes to anyone else or being paid for taking notes by any person or commercial firm without the instructor’s express written permission. Unauthorized use of these copyrighted lecture materials and recordings constitutes copyright infringement and may be addressed under the university’s policies, UWS Chapters 14 and 17, governing student academic and non-academic misconduct.

Religious observances

Students are responsible for notifying the instructor within the first two weeks of classes about any need for flexibility due to religious observances.

Further information and policies.

Please visit this link for additional information about student privacy, course evaluations, and student rights and responsibilities.