27 March 2017

An April Win is Like Any Other Except Maybe the Orioles' April 2016 Wins

"April is the cruelest month, breeding lilacs out of the dead land,
mixing memory and desire, stirring dull roots with spring rain."
T.S. Eliot, The Waste Land

Spring is full of hope. That is certainly true in baseball, where projections and predictions can be forgotten and a vision of success can intoxicate the mind. Wins can be exaggerated in importance and losses can be overlooked. The mind can be protected by a shield of acknowledged small sample sizes or the sword of actual successful outcomes. As months go, April is one of the cruelest for a baseball fan, because it toys most with their ability to judge the talent of their chosen club.

A few weeks back, the site discussed the importance of a successful April slate of games.  The basic conclusion was that if a team wins a lot in April then they will have a great shot at participating in the playoffs.  This was a simple conclusion.  We should expect good teams to win whether in April or May or whenever.  On any given day, the teams that win are likely to be good teams because good teams win.

With that dip into frigid February waters, I wanted to take a different look at the importance of games played by month. I wanted to compare the importance of winning games in April vs. any other month. We know that good teams win, but does a particular month highlight which team is good?

Replacing the analysis I did on Monday, which I think was impacted by toddler brain rot, I went at it a couple different ways with best fit models using league data over the last three seasons.  One, comparing the month record to the total record.  Two, comparing the cumulative month to total record.  Three, comparing the cumulative month to cumulative total of games for the rest of the season.

Month      Month vs. total  Cumulative vs. total  Cumulative vs. rest of season
April      0.26             0.26                  0.07
May        0.32             0.50                  0.15
June       0.39             0.71                  0.23
July       0.20             0.80                  0.24
August     0.56             0.93                  0.18
September  0.42             1.00                  NA

With the monthly data, April has the second weakest relationship to the final season winning percentage; only July is weaker. Cumulatively, we see what we would expect, with certainty increasing as games played increases. The pre vs. post data suggest that much of the certainty we gain comes simply from games already being in the books, rather than from learning much about what will happen next.
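The three comparisons can be sketched in a few lines of Python. The post only says "best fit models," so the squared Pearson correlation used here is an assumption, and the function names and the (wins, losses) input layout are hypothetical, not the original spreadsheet or script.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def win_pct(records):
    """Winning percentage over a list of (wins, losses) pairs."""
    wins = sum(w for w, _ in records)
    losses = sum(l for _, l in records)
    return wins / (wins + losses)


def month_comparisons(teams, month_index):
    """teams: one list per team of (wins, losses) pairs, one pair per month.

    Returns squared correlations (an assumed stand-in for the post's fits):
    month vs. full season, cumulative-to-date vs. full season, and
    cumulative-to-date vs. the rest of the season (None for the last month).
    """
    month, cumulative, rest, total = [], [], [], []
    for months in teams:
        month.append(win_pct(months[month_index:month_index + 1]))
        cumulative.append(win_pct(months[:month_index + 1]))
        total.append(win_pct(months))
        remaining = months[month_index + 1:]
        if remaining:
            rest.append(win_pct(remaining))
    return {
        "month_vs_total": pearson_r(month, total) ** 2,
        "cumulative_vs_total": pearson_r(cumulative, total) ** 2,
        "cumulative_vs_rest": pearson_r(cumulative, rest) ** 2 if rest else None,
    }
```

For the first month the monthly and cumulative figures are the same thing, and for the last month there is no "rest of season" left, which lines up with the matching April values and the NA for September.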

Let's shake it another way.  How does winning percentage in April compare to the season end total?

Season win%  April mean  April SD  April high  April low
>.550        .587        .086      .773        .476
.450-.550    .500        .104      .714        .292
<.450        .423        .108      .583        .217

I used a winning percentage of .550 to represent a likely wild card club or better. That is roughly 90 wins. What we see is that all three batches produced at least one team that looked interesting in April. However, over the past three seasons there have been no terrible April performers who wound up with 90 wins or more.

If we look only at clubs with a .476 winning percentage or greater in April, we cover those 19 high quality clubs, but also 39 lesser teams. What this means is that historically, an April winning percentage of .476 or more has meant about a 33% chance at 90 wins (19 of 58 clubs). On the flip side, no team below .476 reached 90 wins.
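The 33% figure is just a conditional share, and a quick sketch makes the arithmetic explicit. The function name and the (April pct, season pct) input layout are hypothetical; only the counts of 19 and 39 come from the numbers above.

```python
def share_reaching(records, april_floor=0.476, season_floor=0.550):
    """records: list of (april_win_pct, season_win_pct) pairs, one per team.

    Among teams at or above april_floor in April, return the share that
    finished above season_floor. Returns None if no team clears the floor.
    """
    qualifiers = [season for april, season in records if april >= april_floor]
    if not qualifiers:
        return None
    hits = sum(1 for season in qualifiers if season > season_floor)
    return hits / len(qualifiers)


# Using the counts quoted in the post: 19 strong clubs and 39 lesser
# clubs all had an April winning percentage of .476 or better.
post_share = 19 / (19 + 39)
print(f"{post_share:.1%}")
```

Raising `april_floor` would shrink the pool and, presumably, raise the share, which is one way to ask how hot an April has to be before it starts to mean something.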

Another angle to look at would be in-division games. One might expect that some teams are actually affected by who they happen to play. We should think that not all games are created equal. While overall record is what decides who gets into the playoffs, that is largely impacted by the leapfrogging within the division. We might well expect that a division win or loss is more impactful than a win or loss outside of the division.

For the 2016 Baltimore Orioles, in-division competition will account for 20 of 23 games this April. That compares with 12 of 23 games last year. In fact, a schedule this heavy with division rivals is not all that common among teams. Such uncommon schedules may well be spread rather evenly across teams and would be obscured in the exercise above.

Therefore, I wanted to compare in-division vs. out-of-division records against what really matters: games back. In this analysis, I did not look at MLB records in 2016. Instead, my data set contained all AL East teams from 2012 until last season. What I found was that a divisional game was about 29% more important than a non-division game.
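How the 29% was derived is not spelled out here. One plausible reading, offered purely as an assumption, is a least-squares fit of games back on division wins and non-division wins, with the ratio of the two slopes giving the relative weight. The function names below are hypothetical and the sketch is not the original analysis.

```python
def two_predictor_ols(x1, x2, y):
    """Least-squares slopes for y = a + b1*x1 + b2*x2 (intercept dropped).

    Solves the 2x2 normal equations on mean-centered data. Raises
    ZeroDivisionError if the two predictors are perfectly collinear.
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def relative_weight(div_wins, non_div_wins, games_back):
    """Assumed definition: how much larger the division-win slope is than
    the non-division-win slope, e.g. 0.29 for '29% more important'."""
    b_div, b_non = two_predictor_ols(div_wins, non_div_wins, games_back)
    return b_div / b_non - 1.0
```

Both slopes should come out negative (more wins, fewer games back), so the ratio stays positive. With only a handful of AL East team seasons in the sample, an estimate like this would carry a wide margin of error.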

Anyway, chew on that for a while.


Roger said...

I think the straight analysis misses several other things. The relative importance of a month's worth of games changes depending on whether you have been winning or losing in the prior months. For example, take the Braves last year. They were totally out of the competition by about June, so when they began winning in the second half, their games really had no meaning at all. However, turn that around: if you go a little over .500 in the first half, the second half games have a lot of meaning even if you eventually turn into a last place team, at least until you have lost enough to be out of the competition. Further, having a good first half often determines strategies going forward, such as how to act at the trade deadline. Factor in those contextual items and April games may be as important as or more important than September games, which only matter if you are in the mix for a playoff spot. The O's had a great April and first half last year and acted like a first place team. The year they lost 21 straight to begin the season, they were basically a .500 team after that, but no one cared because they were already out of it.

Jon Shepherd said...

That comment makes me think I did not explain the subject well.

Anonymous said...

You have to be smarter than the equipment you are trying to operate, don't sweat it!

Jon Shepherd said...

Roger, let me draw out my comment above. This model obviously is not assessing any chain logic. Do in-season events alter strategies? Yes. Do in-season changes in strategy create a large enough impact to be observed on a population basis? No.

That does not mean nothing meaningful happened mid season, but it may mean that in-season changes rarely impact the fortunes of a club.

Now what may make the first effort cloudier than the second effort is that the first study is comparing games won to overall games won. In other words, are we simply counting for the sake of counting? I could see that redundancy as a potential issue. Another way to look at it would be a simple comparison of month against total. That framework for 2016 would make May, June, August, and September meaningful, April less meaningful, and July a near complete wash.

Roger said...

So far, O's making the most of those important April games..... go O's!!!