Note: This post was filed before last night's 12-0 drubbing of the Blue Jays. However, I think the point it makes still applies.
Why the Orioles are possibly better than their season run differential
by Matt Perez
One of the common claims about the Orioles this season is that they are defying their peripherals. A common way to gauge the quality of a team is its run differential. A team that has been outscored is expected to win fewer than half of its games. Yet despite being outscored, the Orioles have won considerably more than half of their games, eleven more than expected as of September 4th. One could look at this and decide that the Orioles have simply been lucky so far. Others claim that the Orioles have outplayed their run differential for a variety of reasons, for instance because they've played a lot of close games and have lost many blowouts.
Suppose that a random team plays a three-game series. They lose the first game 16-2 and win the next two games 2-0 each. Judged by run differential alone, a team outscored 16-6 looks like one that should have been swept. This example illustrates the perils of run differential: it gives too much weight to the first game at the expense of the other two. If the Orioles' run differential is skewed because they've lost many blowouts and won many close games, then run differential would not accurately reflect the Orioles' performance, because it would give too much weight to the blowout losses and too little weight to the close wins.
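To put a number on the three-game example, here is a minimal sketch of the usual series-level calculation. It assumes the classic Pythagorean exponent of 2, which the post does not specify, and the function name is purely illustrative.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expectation: RS^x / (RS^x + RA^x).

    The exponent of 2 is an assumption; other exponents are also in use.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# The series from the text: a 16-2 loss followed by two 2-0 wins.
games = [(2, 16), (2, 0), (2, 0)]  # (runs scored, runs allowed)

total_scored = sum(rs for rs, _ in games)    # 6 runs scored in total
total_allowed = sum(ra for _, ra in games)   # 16 runs allowed in total

# Pool all the runs first, then apply the formula once.
series_pct = pythagorean_pct(total_scored, total_allowed)
expected_wins = series_pct * len(games)

print(f"{series_pct:.3f} -> {expected_wins:.2f} expected wins")
```

Under that assumption the pooled calculation credits the team with roughly 0.37 expected wins in a series where it actually won two games.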
To test this hypothesis, I first used data provided free of charge from and copyrighted by Retrosheet (interested parties may contact Retrosheet at "www.retrosheet.org"). Among other stats, the data include the score of every game played from 2000 to 2011, the date each game was played, and the teams playing. With this data, I am able to determine the historical accuracy of any models that I build. If something has held true over that period, it seems plausible that it holds for the current season as well.
Using this data, I was able to create what I call a per-game Pythagorean win percentage. Instead of using the run differential per season, I determined the run differential per game for each team. Doing this ensures that regardless of the number of runs scored in a game, each game carries the same weight. A team that won many low-scoring games while losing many high-scoring games therefore gets proper credit. After I determined the value for each game, I took the mean for each team and season to get the per-game Pythagorean winning percentage. For games in which no runs were scored, a team was given a zero percent chance of winning.
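The per-game idea can be sketched roughly as follows. This is my reading of the description rather than the exact code behind the numbers in this post: it assumes the classic exponent of 2, takes a team's games as a list of (runs scored, runs allowed) pairs, and treats the "no runs" rule as covering a 0-0 line, where the formula would otherwise divide by zero. The function names are illustrative.

```python
def game_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean value for a single game (exponent of 2 is an assumption)."""
    if runs_scored == 0 and runs_allowed == 0:
        # No runs at all: assign a zero percent chance, as described in the text.
        return 0.0
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def per_game_pythagorean(games, exponent=2.0):
    """Mean of the single-game values, so every game carries equal weight."""
    if not games:
        raise ValueError("need at least one game")
    pcts = [game_pct(rs, ra, exponent) for rs, ra in games]
    return sum(pcts) / len(pcts)


# A 16-2 loss followed by two 2-0 wins.
sample = [(2, 16), (2, 0), (2, 0)]
pct = per_game_pythagorean(sample)
print(f"{pct:.3f} -> {pct * len(sample):.2f} expected wins")
```

Because each game is scored separately before averaging, one lopsided loss can drag the average down by at most one game's worth, no matter how many runs were allowed in it.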
Once I have historical data for the metric, it is necessary to check that the metric tracks actual win percentage and that the correlation is significant. If it is less precise than per-season run differential, or if there is no correlation between this stat and actual win percentage, then the proposed metric would tell me little about the Orioles' past performance. Using data from the 2000 to 2011 seasons, I ran a correlation analysis among actual win percent, Pythagorean win percent, and per-game Pythagorean win percent. I discovered that while Pythagorean win percent has a correlation of .93765 with actual win percentage, per-game Pythagorean win percent has a correlation of .97645 with actual win percentage. This is a very strong correlation and indicates that per-game Pythagorean win percent is a more precise metric than per-season Pythagorean win percent.
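For readers who want to run this kind of comparison themselves, here is a minimal sketch of a correlation check. It assumes a Pearson correlation, which the post does not name explicitly, and the three lists are made-up placeholders for illustration only, not the Retrosheet figures behind the numbers above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# PLACEHOLDER values, one entry per team-season. These are invented to show
# the shape of the inputs and are NOT results from the 2000-2011 data.
actual_pct = [0.420, 0.480, 0.500, 0.560, 0.610]
season_pythag_pct = [0.450, 0.470, 0.520, 0.540, 0.590]
per_game_pythag_pct = [0.430, 0.485, 0.505, 0.550, 0.600]

r_season = pearson(actual_pct, season_pythag_pct)
r_per_game = pearson(actual_pct, per_game_pythag_pct)
print(f"season-level r = {r_season:.5f}")
print(f"per-game r     = {r_per_game:.5f}")
```

With real data, each list would hold one value per team per season, and the metric with the correlation closer to 1 is the one that tracks actual win percentage more closely.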
When I determined the Orioles' per-game Pythagorean win percent, I discovered that they should be expected to win 69 (69.2) games. While this doesn't fully explain the Orioles' performance, it does indicate that the Orioles are better than their per-season Pythagorean win percentage suggests and that losses in high-scoring games are making them look worse than they are. If this is in fact the case, then the Orioles should be favored to hold onto the second wild card and make the playoffs.
At the very least, the last series of the Orioles' regular season is against Tampa. I would expect a playoff berth to be at stake.