Not long ago, I found myself late to a discussion about whether a person's favorite team being good when they were a kid makes them more likely to be a fan as an adult. The natural corollary would be that kids growing up around a bad team would be less engaged with the team later in life. This clearly isn't the case, as the Cleveland Browns continue to play in front of dozens of people (shots fired).
But whether people's childhood memories of their favorite team are of joyful success or soul-crushing sadness may affect how big a fan they become - their magnitude of fandom, so to speak. By treating attendance as a proxy for magnitude of fandom in a region, I was able to measure whether childhood memories, good or bad, have an effect on adults.
I expanded a (pretty good!) season-long attendance regression I built while writing for BSL to include combinations of new parameters that measure extreme seasons 20 years in the past. To make sure I captured a chunk of a fan's formative years, I used a 5-year window centered on the season 20 years earlier. Parameters for predicting attendance in the 2014 season would be drawn from the 1992-1996 seasons, for example. The parameters listed below were selected as examples of extreme seasons, both successful and not:
- >= 100 wins
- playoff berth
- World Series berth
- <= 65 wins
- division cellar
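For the curious, here is a rough sketch of how those "memory" parameters could be derived for a given season. It is not the code behind the regression. The `Season` record, the function names, and the flag names are all made up for illustration, and I'm assuming each parameter is a simple yes/no flag for "did this happen at least once in the window" (a count of seasons would be an equally reasonable choice).

```python
from dataclasses import dataclass


@dataclass
class Season:
    # Hypothetical record of one team-season; field names are illustrative.
    year: int
    wins: int
    made_playoffs: bool
    made_world_series: bool
    last_in_division: bool


def memory_window(target_year, lag=20, width=5):
    """Years in a `width`-season window centered `lag` years before target_year."""
    center = target_year - lag
    half = width // 2
    return range(center - half, center + half + 1)


def memory_flags(seasons, target_year):
    """Assumption: each parameter is 1 if the event happened at least once in the window."""
    years = set(memory_window(target_year))
    window = [s for s in seasons if s.year in years]
    return {
        "mem_100_wins": int(any(s.wins >= 100 for s in window)),
        "mem_playoffs": int(any(s.made_playoffs for s in window)),
        "mem_world_series": int(any(s.made_world_series for s in window)),
        "mem_65_wins": int(any(s.wins <= 65 for s in window)),
        "mem_cellar": int(any(s.last_in_division for s in window)),
    }


# The 2014 season looks back at 1992 through 1996.
print(list(memory_window(2014)))
```

Passing a team's list of `Season` records and a target year to `memory_flags` gives one row of extra columns to bolt onto the existing attendance regression.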
I built multiple regressions with varying combinations of these parameters, which all seemed to predict attendance pretty well. The chart below shows the ability of each regression, categorized by parameters used, to forecast attendance:
...but there's a real issue with the inclusion of these parameters. While the scores look great and the extreme memories may make sense to some, the coefficients were generally the opposite of what would be expected. For instance, if you were to believe that a childhood World Series berth makes people more likely to be fans, you would expect it to carry a positive coefficient. That would indicate that it is a positive event that drives attendance among adults. The regression that includes parameters for childhood World Series and cellar-dwelling memories has an R^2 value of nearly 0.98, and it assigns the following coefficients (shown in green against the blue "no memory" control regression) to its parameters:
These coefficients are entirely out of sync with our expectations; they suggest that teams winning the pennant or even making the playoffs as a wild card that year are pushing fans away, and that memories of a World Series berth make people incredibly unlikely to go to games as adults. It doesn't matter what its alleged predictive value is. This model is not grounded in any reasonable hypothesis, and it's overfit to the data added to it.
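To see why a shiny R^2 can't settle the argument on its own, here is a small sketch using entirely synthetic data (not the attendance dataset, and not the actual regression). Attendance is generated from winning percentage alone, and then ten meaningless yes/no flags are thrown into the model. With ordinary least squares, the in-sample R^2 can never go down when columns are added, so the bigger model always looks at least as good on the seasons it was fit to. The fairer comparison is on seasons held out of the fit.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def r_squared(X, y, coef):
    """R^2 of the fitted coefficients on the given data."""
    A = np.column_stack([np.ones(len(X)), X])
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


# Synthetic data: attendance depends only on winning percentage plus noise.
rng = np.random.default_rng(0)
n = 60
win_pct = rng.uniform(0.35, 0.65, size=n)
attendance = 1.0 + 4.0 * win_pct + rng.normal(0.0, 0.3, size=n)  # made-up units

# Ten flags with no relationship to attendance at all.
noise_flags = rng.integers(0, 2, size=(n, 10)).astype(float)

X_base = win_pct.reshape(-1, 1)
X_big = np.column_stack([win_pct, noise_flags])
train, test = slice(0, 40), slice(40, n)

coef_base = fit_ols(X_base[train], attendance[train])
coef_big = fit_ols(X_big[train], attendance[train])

r2_train_base = r_squared(X_base[train], attendance[train], coef_base)
r2_train_big = r_squared(X_big[train], attendance[train], coef_big)
r2_test_base = r_squared(X_base[test], attendance[test], coef_base)
r2_test_big = r_squared(X_big[test], attendance[test], coef_big)

print(f"in-sample R^2:  base={r2_train_base:.3f}  with noise flags={r2_train_big:.3f}")
print(f"held-out R^2:   base={r2_test_base:.3f}  with noise flags={r2_test_big:.3f}")
```

If the extra parameters only help on the seasons used for fitting, and their signs contradict the hypothesis that motivated them, that points toward overfitting and away from a real effect.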
The only regression that appeared to have any semblance of modeling reality is one that only includes memories of the team in the cellar:
No coefficients radically changed direction, although it appears additional emphasis is placed on rostering an All-Star starter and winning a wild card at the expense of winning a pennant. And it seems to predict that high beer and hot dog prices attract more fans. At least memories of being in the cellar push people away from attending.
I tend to favor the original model that doesn't include extreme memories, in particular because I don't believe very young fans know the difference between their team being in 4th or 5th in their division. In fact, I don't think very young fans care very much about the on-field product, and are more likely to harbor positive memories of spending time with their family at the ballpark. Those memories are far more likely to make adults want to return the favor with their own kids.
And anyway, as always, the most important coefficient is how well the team is doing in that season. It makes sense. It's not hard to tell over 162 games whether a team is good, and fans want to see an exciting team that gives them a good chance to go home happy.
In other words, you're all bandwagoners.