01 May 2017

You Are Not Your April Record...or Run Differential

This tweet was brought to my attention.  Let us not presume what Melewski's intent was in writing it and simply answer the question: which is more important in determining what is likely to occur for the rest of the season, April record or April run differential?

I took team performance from 2014-2016 (90 team seasons).  I fit a linear model of rest-of-season winning percentage on April winning percentage.  Similarly, I fit rest-of-season winning percentage on April run differential (expressed as a rate per 20 games).
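Because teams play slightly different numbers of April games, the run differential has to be put on a common scale before it can be compared across teams. A minimal sketch of that scaling, assuming a simple proportional adjustment (the function name and the sample numbers are made up for illustration, not taken from the data set):

```python
def run_diff_per_20(runs_scored, runs_allowed, games):
    """Scale a raw run differential to a per-20-games rate."""
    if games <= 0:
        raise ValueError("games must be positive")
    return (runs_scored - runs_allowed) * 20 / games


# Hypothetical team: +20 runs over 25 April games.
print(run_diff_per_20(120, 100, 25))
```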

Utility as a predictor:
April Wins and Losses - r2 = 0.07
April Run Differential - r2 = 0.21
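
For readers who want to try this at home: with a single predictor and an intercept, the r2 of a linear fit equals the square of the Pearson correlation between the two columns. The sketch below computes it in plain Python. The function name and the toy numbers are hypothetical and are not the 2014-2016 data used above.

```python
def r_squared(x, y):
    """r2 of a one-predictor linear fit (squared Pearson correlation)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return (sxy * sxy) / (sxx * syy)


# Toy inputs only: an April measure (x) and a rest-of-season measure (y).
april = [1, 2, 3, 4]
rest_of_season = [1, 3, 2, 4]
print(round(r_squared(april, rest_of_season), 2))
```

Feeding it April winning percentage in one run and April run differential per 20 games in another, each against rest-of-season winning percentage, would give the two numbers being compared here.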

Both are rather poor predictors of future performance, but in practice the difference matters.  If you were running a population study and saw an r2 of 0.21, then you would likely highlight that variable as one that may play a role in whatever you are looking at.  A variable with an r2 of 0.07 is one that will almost always be tossed out as having little to no predictive value.

OK, but what about very successful April teams?  How do the metrics work for them (n=18)?
April Wins and Losses - r2 = 0.03
April Run Differential - r2 = 0.30

This also seems to affirm that April Run Differential is more meaningful, but it may be overstating or understating things due to a much smaller sample size.

But what about a team like the Yankees with a plus 40 run differential and a team like the Orioles with a neutral run differential?  Do teams like that act differently?

The Orioles are performing at a .147 winning percentage above their Pythagorean win expectation. 
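
For reference, the Pythagorean win expectation estimates winning percentage from runs scored and runs allowed. Here is a minimal sketch, assuming the classic exponent of 2 (some versions use a value closer to 1.83, and the post does not say which was used). The function names and the sample team are hypothetical.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed."""
    if runs_scored < 0 or runs_allowed < 0 or runs_scored + runs_allowed == 0:
        raise ValueError("runs must be non-negative and not both zero")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def over_performance(wins, losses, runs_scored, runs_allowed, exponent=2.0):
    """Actual winning percentage minus the Pythagorean expectation."""
    actual = wins / (wins + losses)
    return actual - pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Hypothetical team: 15-8 record with a neutral run differential.
print(round(over_performance(15, 8, 100, 100), 3))
```

A positive value means a team is winning more than its runs suggest; a negative value means it is winning less.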

For teams outperforming their Pythagorean expectation by .070 or more (n=11):
April Wins and Losses - r2 = 0.15
April Run Differential - r2 = 0.24

For teams underperforming their Pythagorean expectation by .048 to .058 (n=11):
April Wins and Losses - r2 = 0.45
April Run Differential - r2 = 0.45

Amazingly, it looks like these metrics work much better for teams in the Yankees' position.  For the Orioles, Run Differential still looks like the preferred method, but Wins and Losses became more interesting.  All that said, I would prefer to use the r2 values from the whole 90-team data set rather than these sub-divisions because there may be an issue with small sample size.

Regardless, for the Orioles, there does not appear to be a great reason to ignore run differential, or to prefer team record over run differential, when projecting how the team will perform in the future.  Instead, one should probably look to the impact of Chris Tillman and Zach Britton returning, but, of course, that means ignoring the probability of other injuries to key players occurring.


Roger said...

I think you might find a better correlation in measuring how teams do on cross-country road trips. I'm sure the O's eventually have a disaster of a West Coast trip coming soon. Also, there was a lot made here about the O's needing to do well in April against division opponents. The fact that they have done very well would seem to portend a better season overall.

Jon Shepherd said...

A lot to do with divisional opponents here? I don't think you remember my article as clearly as you might think you do. 29% more than a non division game.

Anonymous said...

Another useless stat, don't sweat it!

Roger said...

Jon, I don't understand your comment. I was using your article to say that the O's should be in good shape because they've done well in games that are 29% more important. I was adding to that the fact that teams have trouble with large time zone shifts - the O's record last year against West Coast opponents, of which they've had none so far, is another hurdle that the O's and any other coastal team will share. A fact that I haven't seen brought up here at the Depot in any statistical way.

Jon Shepherd said...

You wrote that a lot was made of the importance of divisional games. A lot was not made. That overstates the weight of that inequality.

Jon Shepherd said...

Regarding West Coast games...I think, and could be wrong, that no East club has ventured out West.

Jon Shepherd said...

I am also thinking that no West team has visited east.