I figured I'd take the passive approach for a new blog idea and let my attention flow naturally to what's most interesting to me at the moment. With my time (and cash) vested in an office Fantasy Football league, my mind has been fixed on NFL data as of late, despite all the upstanding citizens the NFL employs.
Assault, DUIs, performance enhancing substances, shootings and murder aside, the NFL kicked off its new season last weekend. That means my fandom experience is back in full gear. Sometimes it's great, but most of the time it's hard being a New York Giants fan. I'm referring to the NY Giants football team from the East division of the National Football Conference (NFC) in the National Football League (NFL). I say it can be great because we've managed to pull off (basically steal) 2 Super Bowls in the last 10 years, but most of the time it's really hard due to games like the one I begrudgingly sat through last Monday.
I digress. Last Monday, while watching my beloved NY Giants [compete?] against the Detroit Lions, I saw something that I just had to try to recreate in Tableau. I set out to recreate a drive chart using the play-by-play data from the game.
The play-by-play is a linear view of the game, ordered by quarter (4 total) and time (15 minutes per quarter), and detailed with the down, the yards remaining for a 1st down, the starting field position, and the play detail: the players involved, the play activity (pass/rush/etc.), and other important details (touchdowns/penalties/etc.). Here's an example from ESPN:
A very helpful view to visualize the play-by-play detail of a game is the drive chart. The drive chart shows where, on the field, the team with possession starts out and how far they make it down the field either before scoring or turning the ball over. Below is a drive chart, also from ESPN, for the 1st quarter of the game. As you can see, the Detroit Lions (in blue) dominated with 2 scoring drives while the NY Giants (in grey) were forced to turn the ball over through punting on both drives. Now, please, use your vivid imagination to picture me losing my patience and my interest in the game at this point.
I grabbed the ESPN play-by-play data and dumped it in Excel. I created two columns, one for the down detail and another for the play detail. The data looked like this after I cleaned it up a bit:
Now for the fun part! I took the down detail and play detail, extracted certain parts of each into new columns for a more database-friendly view of the data, and created some additional columns not provided by the play-by-play data to help me build a drive chart in Tableau. The most important new field I had to create is the "Path" field (column D). This field lets Tableau treat each play as a line with a start point and an end point. For example, rows 2 and 3 carry the same start (columns G and H) and end (columns I and J) field positions but are broken out into two separate rows that Tableau can reference. The starting field position (NYG 35 yard line) is assigned the path of "0" and the end field position (DET 0 yard line) is assigned the path of "1" in Tableau.
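If you'd rather script the Path step than build it by hand in Excel, here is a minimal sketch in Python. It is an illustration only, not the workbook described above: the field names (`start_side`, `start_yard`, `end_side`, `end_yard`, `path`) are hypothetical stand-ins for columns G through J and column D.

```python
def add_path_rows(plays):
    """Duplicate each play into two rows that differ only in their path value.

    Path 0 marks the row used for the start of the play and path 1 the row
    used for the end, so a line chart can connect the two points.
    """
    rows = []
    for play in plays:
        for path in (0, 1):
            # Copy every field of the play and attach the path marker.
            rows.append({**play, "path": path})
    return rows


# Hypothetical example play: starts at the NYG 35 and ends at the DET 0.
plays = [
    {"play_id": 1, "start_side": "NYG", "start_yard": 35,
     "end_side": "DET", "end_yard": 0},
]
rows = add_path_rows(plays)
```

Both output rows keep the full start and end position, which matches the layout described above: the rows are identical except for the path value.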
Here's where the magic happens. I have to somehow translate the available data into a start and end position on an X axis for Tableau. Tableau needs to know which side of the field belongs to NY vs Detroit, as well as what happens when the starting field position is on the NY or Detroit side. This is where my fancy calculated field comes into play. It helps Tableau identify which columns to use and what to do with the values when the path is 0 for the starting point on the field, and the same for a path of 1 for the end point on the field. I sat down with a pencil and paper to figure this one out.
Needless to say, I was pleased with the result because it factored in the full 120-yard length of the field (100 yards of playing field plus a 10-yard end zone at each end), and it allowed me to recreate a football field in Tableau using a bunch of reference lines and bands, with colors, line sizes, and shapes to indicate different activities in the play-by-play!
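The exact calculated field isn't reproduced here, but the idea can be sketched in Python under a few assumptions of my own: the NY end zone sits at the left edge of the axis (x from 0 to 10), the Detroit end zone at the right edge (x from 110 to 120), and the field names match the hypothetical ones used for the Path rows. Treat it as one plausible way to do the mapping, not the actual Tableau formula.

```python
FIELD_LENGTH = 120  # 100 yards of playing field plus two 10-yard end zones
END_ZONE = 10


def x_position(row, left_team="NYG"):
    """Map a row to an x coordinate on a 0-120 axis.

    Path 0 reads the starting field position, path 1 reads the ending one.
    Assumes left_team defends the end zone at x = 0 to 10.
    """
    if row["path"] == 0:
        side, yard = row["start_side"], row["start_yard"]
    else:
        side, yard = row["end_side"], row["end_yard"]

    if side == left_team:
        # Yard lines on the left team's side count up from its goal line.
        return END_ZONE + yard
    # Yard lines on the other side count down toward the far goal line.
    return FIELD_LENGTH - END_ZONE - yard


# Hypothetical play: starts at the NYG 35, ends at the DET 0 (the goal line).
play = {"start_side": "NYG", "start_yard": 35, "end_side": "DET", "end_yard": 0}
start_x = x_position({**play, "path": 0})  # 45
end_x = x_position({**play, "path": 1})    # 110
```

With this layout the 50 yard line lands at x = 60 whichever side it is recorded on, so the two halves of the field meet cleanly in the middle.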
Now that I had that item tackled, no pun intended, I created many more calculations based on the activity/play detail/etc. Adding these proved to be difficult, especially when trying to tie my numbers with those reported by ESPN. You may see some slight differences here and there, but the general idea remains the same - the Giants got their asses handed to them by Matt Stafford and Megatron. I also added a filter across the entire dashboard for quarter (mostly to filter on the second quarter and give myself some relief that the Giants at least dominated in score, though not even offensive yards...) and used dashboard actions to make every data point interactive across all applicable charts. Here's the final product... enjoy!