09 June 2016

Forecasting Single-Game Attendance

I've explained before how season-long attendance can be forecast with a handful of different variables, all of which relate to the quality of the team. Since the team is the on-field product that fans buy when they attend games, it makes sense that team quality drives attendance. Sure, games are a form of entertainment, but as many Orioles fans can attest, it's a lot more fun to watch a winner.

To continue this practice of forecasting attendance, I decided to take a look at what factors influence single-game attendance at Camden Yards. Perhaps this work would qualify Camden Depot to help the Orioles plan promotional nights or concession inventory (always buy 1 extra bobblehead for me, Orioles)!

The features I decided to build my regression on were pretty straightforward:
  • Day of week
  • Month of year
  • Year
  • Opening Day
  • Day/night game
  • Game-time temperature
  • Weather condition (rain, sun, overcast, drizzle, or dome)
  • Winning percentage (or pythagorean winning percentage)
  • Runs per game, for and against
  • Current streak type (has the team won or lost the last few games they've played?)
  • Games back in the division
  • Opponent
  • Promotional night
  • Kids' promotional night
  • Size of promotion/number of items given away
  • Family night/field trip days (kids run the bases, etc.)
  • Fireworks night
  • Gender-specific promotional night (Mother's and Father's Day, specifically)
These are all pretty top-line features. They're generally easy to come by (though old promo night details are hard to find), nothing is super-SABR, and it makes sense that any of them might affect attendance.
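Several of these features (day of week, weather condition, opponent) are categories rather than numbers, so they have to be turned into 0/1 indicator columns before a regression can use them. Here is a minimal sketch of that step. The game records, field names, and category lists are made up for illustration and are not the actual dataset or code behind this post.

```python
# Hypothetical game records; field names and values are invented for illustration.
games = [
    {"day": "Fri", "weather": "sun", "temp_f": 78.0, "promo": True},
    {"day": "Tue", "weather": "rain", "temp_f": 61.0, "promo": False},
    {"day": "Sat", "weather": "overcast", "temp_f": 70.0, "promo": True},
]

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
WEATHER = ["sun", "overcast", "drizzle", "rain", "dome"]


def one_hot(value, categories):
    """Return 0/1 indicators for every category except the first.

    The first category acts as the baseline (all zeros), which keeps the
    indicator columns from being perfectly collinear with an intercept.
    """
    return [1.0 if value == c else 0.0 for c in categories[1:]]


def encode(game):
    """Turn one game record into a flat list of numeric features."""
    return (
        one_hot(game["day"], DAYS)            # 6 columns, Monday is the baseline
        + one_hot(game["weather"], WEATHER)   # 4 columns, sun is the baseline
        + [game["temp_f"], 1.0 if game["promo"] else 0.0]
    )


rows = [encode(g) for g in games]
for row in rows:
    print(row)
```

Each game becomes a row of 12 numbers here. Opponent would be encoded the same way, which is why a model like this ends up with one coefficient per team.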

I ended up dropping day/night game, runs for/against, family nights, fireworks nights, and gender-specific promotional giveaways because they all tended to be on Fridays, Saturdays, and Sundays, which introduces some complications because day of week was already included. I also dropped the year from the analysis because I was concerned that it was too similar to winning percentage.

I chose to use a lasso regression model because, in short, it shrinks coefficients toward zero and zeroes out the ones that contribute little to predicting attendance. And interestingly, a lot of coefficients were zeroed out: most teams, some months, some weather conditions - and winning percentage. The final list of included and meaningful coefficients is shown here:

As you can see, this is a much shorter list than we started out with! I want to call out the teams that affect attendance at Camden Yards when they come to town:
Within the AL East, the Yankees and the Red Sox cause attendance to spike, while the Rays drive attendance down. That makes sense with our understanding of fan tendencies: Rays fans don't go to games anywhere, much less hundreds of miles from home, and Yankees and Red Sox fans are both everywhere and willing to travel to see their teams at Camden Yards. Since those two teams are the Orioles' primary rivals, I'm sure their presence makes O's fans want to show up as well.

Other nearby teams, the Phillies and the Nationals, also drive attendance up. The Nationals' annual visit to Baltimore is the largest single-game attendance boost from a team.

The White Sox probably don't really drive people away from the ballpark. They simply have the misfortune of being the team that was in town following the unrest in Baltimore City and, as a result, the team that played the Orioles in front of an empty stadium.
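For anyone curious how a lasso ends up with such a short list of teams and factors, here is a minimal sketch of the mechanism, written as plain coordinate descent with soft-thresholding. It is not the code used for this post. The toy data, the feature names (`weekend`, `promo`, `weak`), and the penalty `alpha` are all assumptions chosen so the arithmetic is easy to follow, with attendance measured in thousands.

```python
from itertools import product


def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; anything within +/- alpha becomes 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_fit(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * sum((y - b0 - X.b)^2) + alpha * sum(|b|)
    by cyclic coordinate descent. Returns (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    # Center the columns and the target so the intercept is not penalized.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    resid = [v - y_mean for v in y]  # residual for beta = 0
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            z = sum(Xc[i][j] ** 2 for i in range(n)) / n
            if z == 0.0:
                continue  # constant column carries no information
            # Correlation of feature j with the residual, ignoring its own fit.
            rho = sum(Xc[i][j] * (resid[i] + Xc[i][j] * beta[j]) for i in range(n)) / n
            new_b = soft_threshold(rho, alpha) / z
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new_b
    intercept = y_mean - sum(beta[j] * col_means[j] for j in range(p))
    return intercept, beta


# Toy data (made up): every combination of three 0/1 features.
# Columns: weekend, promo, weak (a factor with only a small true effect).
X = [list(combo) for combo in product([0.0, 1.0], repeat=3)]
y = [20.0 + 8.0 * w + 5.0 * p + 1.0 * k for w, p, k in X]  # thousands of fans

intercept, beta = lasso_fit(X, y, alpha=0.5)
print(intercept, beta)  # 22.5 [6.0, 3.0, 0.0]
```

Two things stand out in the toy result. The weak factor is dropped entirely even though its true effect is not zero, which is the same thing that happened to winning percentage in my model. And the coefficients that survive are shrunk (6 and 3 rather than 8 and 5), so lasso coefficients tend to understate the raw size of an effect.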

The most surprising thing for me was the effect, or lack thereof, of playing winning baseball on attendance. While a good team draws more fans throughout the year, it seems that fans are drawn to the ballpark day to day by how well a game fits their social schedule. Weekends are popular draws, as are the summer months, and people show up for promotions and family nights. Interestingly, attendance also decreases as temperature rises (I suspect that the relationship is parabolic, with a nice warm day drawing more fans than either a cold one or a very hot one). But how good the Orioles are seems less important than how convenient it is to attend a game.
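One way to check the parabola hunch about temperature would be to add a squared temperature term and see where the fitted curve peaks. The sketch below does that with an ordinary least-squares quadratic fit in plain Python. The temperatures and attendance figures are invented (I built them from a curve that peaks at 75 degrees), so this only shows the method and says nothing about the real peak at Camden Yards.

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_quadratic(temps, attendance):
    """Least-squares fit of attendance = a + b*(t - t0) + c*(t - t0)^2,
    where t0 is the mean temperature. Returns (a, b, c, peak_temperature).
    The peak is only meaningful when c is negative (an upside-down parabola)."""
    t0 = sum(temps) / len(temps)
    rows = [[1.0, t - t0, (t - t0) ** 2] for t in temps]
    # Normal equations: (R^T R) coef = R^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * v for r, v in zip(rows, attendance)) for i in range(3)]
    a, b, c = solve3(A, rhs)
    peak = t0 - b / (2.0 * c)
    return a, b, c, peak


# Invented data: attendance (in thousands) built from a curve peaking at 75 F.
temps = [float(t) for t in range(50, 100, 5)]
attendance = [30.0 - 0.02 * (t - 75.0) ** 2 for t in temps]

a, b, c, peak = fit_quadratic(temps, attendance)
print(round(c, 4), round(peak, 2))
```

If the squared term came back negative on the real data, with the peak somewhere in the comfortable range, that would support the guess. A lasso could also be handed the squared term as one more feature and left to decide whether to keep it.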

To be fair, this regression uses data from 2012 to 2015, years in which the Orioles were generally pretty good. Winning percentage, or pythagorean winning percentage, might be a stronger predictor of attendance if some of the more turbulent years were included as well.

So how well does it work?

It does a decent job: predictions usually land within 10,000 attendees of the actual figure, one way or the other. Not terrible, but considering that some games have as few as 15,000 attendees, this might not be the model to base inventory plans on. This regression has an R^2 of 0.635, indicating that this model describes 63.5% of the variation in single-game attendance. Another limitation, as far as long-term inventory planning goes, is that this model uses weather as a predictor of attendance - which it most certainly is! Walk-up ticket sales spike when it's nice out, but the weather probably isn't something concession stands would know far enough in advance to use to their advantage.
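For reference, here is how an R^2 and a typical miss can be computed from a list of actual and predicted attendance figures. The five games below are made-up numbers, not results from my model, so the printed values will not match the 0.635 quoted above.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that the predictions account for."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total
    return 1.0 - ss_res / ss_tot


def mean_abs_error(actual, predicted):
    """Average size of the miss, in attendees, ignoring direction."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical games: actual vs. predicted attendance.
actual = [15000, 22000, 31000, 45000, 27000]
predicted = [21000, 20000, 35000, 38000, 30000]

r2 = r_squared(actual, predicted)
mae = mean_abs_error(actual, predicted)
print(round(r2, 3), mae)  # 0.774 4400.0
```

The second number is arguably the more useful one for planning: an average miss of a few thousand fans matters a lot more on a 15,000 night than on a sellout.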

1 comment:

Boss61 said...

Many years ago in one of his first annual "Baseball Abstract" books, SABR-guru Bill James published a memorable piece on "Factors that Govern Attendance at Seattle Mariners baseball games." The Mariners had been in existence less than 10 years at the time and played indoors at the Kingdome.

James observed that the team lacked the rich history to establish rivalries, and also lacked the parent-child bonding experience of baseball, as Seattle dads did not have the chance to tell their offspring about the Mariners of their youth, there having been no Mariners back then. James concluded that the existence and quality of a Mariners promotion was the best predictor, followed by the weather. Indoor baseball is a better draw when the weather outside is wet than (on days few and remarkable in Seattle) when it is nice out.

I've been an O's partial plan holder for decades and now apply your criteria to myself. My best predictor is whether the O's have assigned me that given game. But I do trade my tickets in for other games too. I tend to dump my April, May and September weeknight games because school is in session. I tend to pick up more for Fireworks Nights (easily the O's best promotion to us) and here is something you missed - when a hot pitcher is starting for the opposing team, because I want to see him. Chris Sale, Justin Verlander, Cole Hamels, Sonny Gray, etc. all have had me steer my ticket selection to see them pitch.

Great article; hope my comments help you.