11 September 2012

Stephen Strasburg and the Verducci Effect

I sometimes think an idea has been firmly shown to be wrong and am then surprised to find backwaters where it still holds.  One of these ideas is the Verducci Effect.  Specifically, the Verducci Effect holds that a pitcher 25 years old or younger who throws at least 30 more innings than he did the previous year will be at greater risk of injury.  The term is often broadened to mean really anything to do with giving a young pitcher too much work before he gets older.  The idea had a lot of traction at first because it made some logical sense.  Young arms are growing arms and young arms are less experienced arms.  Like anyone at the gym or out running, you slowly build up your strength or stamina over time with increasingly great feats.  This makes sense to us.

However, time and time again, the Verducci Effect has been shown to not be real at all, even though Tom Verducci delivers a column or two on it every year (though now it looks like he is transitioning over to injured closer stories).  The earliest study I can find is from David Gassko in 2006, which found that "overworked" pitchers appeared to pitch more, not fewer, innings the next year.  Jeremy Greenhouse wrote a column on injuries and the Effect...once again finding nothing to the notion.  I am sure there are many, many other articles by amazing writers who went on to be employed by Major League Baseball franchises.

That said...the Nationals claim they have evidence supporting the need to end Stephen Strasburg's season even though he is one of their best arms and they will be entering the playoffs.  From my qualitative perspective, it has seemed that this application of the Effect pleases Tom Verducci.  I figured I would give the idea another look and measure the general idea in a slightly different (but incredibly simple) way.  Are the Nats looking at outcomes over several years, not just one?  They have control of Strasburg's rights in 2013, 2014, and 2015.  In a marathon sense, keeping him healthy over that period is more important than the result of four games in September and, arguably, a couple of games in October.  In a Keep-Your-Eyes-on-the-Prize sense, well, he should be pitching.

To test this, I took every pitcher from 1998 to 2007 (a ten-year period) who threw more than 140 IP at age 23 and within his first three years of pitching at the MLB level.  I then sub-divided these players into 20 IP allotments: 140-159, 160-179, 180-199, 200-219, and 220+.  I then compared each pitcher's age 23 season to the accumulation of his next three seasons.  I compared how many innings they pitched and also created a metric for this study, vFIP.  To measure vFIP, you divide a pitcher's age 23 FIP by his cumulative FIP over the next three years and multiply by 100.  A vFIP over 100 shows improvement and a vFIP under 100 shows decline.

The first thing to look at would be injuries.  Half of the ten pitchers in the 140-159 class suffered injuries over the next five years (Ricky Nolasco, Josh Beckett, Roy Oswalt, Daniel Cabrera, and Ken Cloude).  This sounds like a great deal of loss, but every innings group roughly had the same injury effect rate, which would agree with previous studies. 

Perhaps better would be to compare actual workloads over the age 24 to 26 seasons.  In terms of innings pitched, there was no significant difference between the 140-159 group and the 160-179 (p=0.78) and 180-199 (p=0.66) groups.  However, significant differences were found between that group and the two groups of pitchers who threw 200 or more innings (p=0.03 for 200-219 and p=0.04 for 220+).

Age-23 IP group    140-159  160-179  180-199  200-219  220+
IP, ages 24-26     135      128.1    147.1    191.1    194

There is likely a selection bias in play here: if a pitcher was given the opportunity to throw 200 innings in a season, then he is likely to be a very good pitcher (or considered to be one) and to earn or be given the opportunity to pitch a great deal over the next three seasons.  At the very least, it appears that pitching fewer than 200 innings does not change what will happen much at all.

The final aspect to look at is performance.  This is where I will break out the vFIP metric and here is the data set:

140 160 180 200 220
123 118 104 100 108
111 83 103 83 92
87 102 93 113 87
76 104 100 96 122
131 82 98 106 65
120 106 124 83 98
99 90 98 92
130 101 86 114
96 85 94 105
96 129 80

116 106 111



The groups are not significantly different from each other.  However, there appears to be a slight improvement in performance from the 140-159 group.  Again, this is not significant and likely requires a larger data set to see if this trend can be more firmly established, but the 140-159 pitchers improved their performance by 16% as a group.  The other four groups were consistent, with improvement ranging from 1 to 3% over their age 23 seasons.

Again, there may be some selection bias in these groupings because if you are tossing over 200 innings then you have probably pitched very well and it will be difficult to improve upon that.  Here are the raw FIPs for the groups.

FIP         140-159  160-179  180-199  200-219  220+
Age 23      4.64     4.73     4.40     4.24     4.18
Ages 24-26  4.01     4.65     4.28     4.17     4.16

The data suggests that the 140-159 group pitchers saw great improvement in their performances.  However, I would temper those differences with the idea that perhaps a pitcher who throws 140-159 innings at age 23 and proceeds to do poorly is more likely to be replaced in the rotation than a pitcher who tosses more innings.  There may be a prejudice that benefits pitchers who threw more innings during their age 23 season.
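
For anyone who wants to check the group numbers, here is a minimal sketch of the vFIP calculation applied to the raw group FIPs in the table above.  The function name and layout are mine, and it takes the ratio of group FIPs, which is an assumption; averaging each pitcher's individual vFIP could give slightly different values.

```python
# Group FIPs copied from the raw FIP table in this post.
GROUPS = ["140-159", "160-179", "180-199", "200-219", "220+"]
FIP_AGE_23 = [4.64, 4.73, 4.40, 4.24, 4.18]
FIP_AGES_24_26 = [4.01, 4.65, 4.28, 4.17, 4.16]


def vfip(fip_age_23, fip_next_three):
    """vFIP as described in the post: age-23 FIP over later FIP, scaled to 100."""
    return 100.0 * fip_age_23 / fip_next_three


# Assumption: group-level vFIP is the ratio of the group FIPs, rounded.
group_vfip = {
    group: round(vfip(before, after))
    for group, before, after in zip(GROUPS, FIP_AGE_23, FIP_AGES_24_26)
}

for group, value in group_vfip.items():
    print(f"{group:>8}: vFIP {value}")
```

Run this way, the 140-159 group comes out at 116, which lines up with the 16% improvement quoted earlier, while the other groups sit within a few points of 100.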

That all said, I am not sure how this informs us about Stephen Strasburg.  There is no evidence from the above methodology that injury rates decrease.  It appears that pitchers who log more innings during their age 23 season wind up throwing more innings in the future at about the same level of performance.  Pitchers who are worked for 140-159 MLB innings tend to show improvement as a group in terms of performance, but not in innings pitched.  This may be the result of less desirable pitchers being discarded more easily when they have less of a track record.

09 September 2012

Did the Orioles Get Knocked Out of the Playoffs Last Night?

Last night the Orioles took down the Yankees 5-4 to enter into a first place tie for the AL East.  With Tampa's loss, it leaves both teams two up in the Division, tied with Oakland for the first wild card, and two up on Oakland for the second wild card.  The Orioles are in a great place and, for the most part, control their destiny going forward.  However, the Evil Empire's CC Sabathia threw inside to Nick Markakis, breaking his thumb, and worrying many that such a loss may result in the Orioles missing the playoffs.

It made me wonder: with 23 games remaining, how many runs will his absence cost the Orioles, given that his backups are Lew Ford and Xavier Avery?  To find out, I took Ford's and Avery's runs above replacement (RAR) for offense and runs above average (RAA) for defense to determine how valuable they will be over the remainder of the season.  You may wonder why one is measured against replacement level and the other against average.  Both prominent metrics for evaluating a player holistically (b/rWAR and fWAR) compare players to a replacement level player (an abstract someone who should be available at AAA) who is defined as having a lower aptitude for hitting and an average aptitude for fielding.

Additionally, I compared those numbers to what Nick would be projected to do based on his season numbers.  I also compared them to what Nick has done over the past 30 days (27 games).  This should give us an idea how many runs losing Nick will cost the team.

                   oRAR  dRAA  Proj.  Difference
Markakis Season     23    -9     3
Markakis last 30     6    -2     3
                    -2     1    -1    -4
                     1    -2    -1    -4

Another assumption to be made here is that the defensive numbers are transferable from left field to right field for each player.  Avery's arm (his major weakness) is adequate in left, but will be greatly tested in right field.  Ford's arm is average in left field, but his range is below average.  In other words, we should probably expect Ford to perform about the same and Avery worse than what Markakis is currently doing out there.  If you use that qualitative approach, then defensive value should decrease about three runs, leaving both players seven runs in the hole compared to Markakis.

An additional assumption is that these run values over the course of 17 games for Ford and 25 games for Avery are appropriate sample sizes to project performance forward.  We know it takes about two years to get a decent evaluation of defense, so those numbers are probably rather useless.  The offensive sample size is also pretty slight.  It should be noted though that the overall expectation of their performance is replacement level.  They deliver that performance in left field.  The question is whether they can do that in right field.

This year's run environment put a win to every 9.285 runs produced.  With that in mind, the range we have in lost performance (-4 to -7 runs) converts to -0.43 to -0.75 wins.  This is not entirely convenient with the Orioles embroiled in such a tight race.  An option could be for the team to try to improve themselves through a trade, but that player would not be eligible for the playoffs.  In other words, it would be a costly trade, but perhaps a cost that might be needed to be made.  However, there really are no good right fielders playing for teams out of the running on the market who will be free agents after the season.
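
The conversion from runs to wins above is simple division, and a short sketch makes it easy to rerun with a different run environment.  The 9.285 runs-per-win figure and the -4 to -7 run range come from this post; the function name is just for illustration.

```python
# Runs-per-win figure quoted in the post for this year's run environment.
RUNS_PER_WIN = 9.285


def runs_to_wins(runs, runs_per_win=RUNS_PER_WIN):
    """Convert a run total (negative for runs lost) into wins."""
    return runs / runs_per_win


# The two ends of the range discussed above: -4 runs and -7 runs.
best_case = runs_to_wins(-4)
worst_case = runs_to_wins(-7)

print(f"Best case:  {best_case:.2f} wins")
print(f"Worst case: {worst_case:.2f} wins")
```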

So much depends on Lew Ford and Xavier Avery, but it probably does not depend as much as it feels.

08 September 2012

Pitchers Probably Should Wear Helmets, Too

Easton-Bell's Pitcher Helmet
I have never really been a proponent of pitchers wearing helmets.  I have never given the idea much thought and it reminds me of how silly guys like John Olerud looked wearing a helmet.  Olerud suffered a brain aneurysm in college, so it medically made sense.  Otherwise, the practice seems like overprotection.  It also harkens back to Steve Tasker, who wore an extra large helmet.  That seemed silly to me as well, but with the increase of information about potential brain injuries in football...it makes much more sense to better protect the head.

That change of thought about football...keeping an open mind as new data or experiences challenge my perspective...makes me more open to the idea that baseball can be dangerous.  Base coaches wearing helmets had also seemed silly.  Then someone had to die for people to take those risks seriously.  That is how our society operates.  Our rules and regulations require blood before action is taken.  A textile plant had to burn down before sweatshop conditions were dealt with.  One out of every five tunnel workers had to die from silicosis before that was handled.  The Cuyahoga had to burn 13 times before people thought that maybe we should take better care of our drinking water supplies.  And now, we wait for global warming (aka climate change) to somehow show us more explicitly that bad things are happening.  For many, the shrinkage of ice coverage, reduction in coral, movement of planting zones, pH changes in high latitude seas, and increased understanding of how carbon dioxide fits into global ecological/geological cycles are not explicit enough.  It is simply how we are.  We see immediate things that are right in front of us and hard to ignore.  I mean, how long did people ignore cigarettes or leaded gasoline?

This roundabout thought process leads us to pitcher head injuries in baseball.  A pitcher gets hurt at each level every single year.  This year, the notable bull's eye shot happened to Brandon McCarthy.  Just a quick internet search brought up these incidents:
  • In 2002, Kaz Ishii suffered a skull fracture.
  • In 2005, Kyle Denney at AAA suffered a skull fracture.
  • In 2009, Darin Downs at AAA suffered a skull fracture.
  • In 2010, a high school pitcher was placed in a medically induced coma for several weeks due to brain swelling from a skull fracture.
  • In 2012, a high school pitcher suffered a severe skull fracture.
So, it happens and it is a dangerous event to occur.  It is relatively rare.  In the professional ranks maybe a handful of headshots happen each year and only a couple at most result in a serious fracture event.  Similarly, batters were plunked on occasion before helmets were issued with some serious beanings, but it took Ray Chapman to die for anyone to consider doing anything about it and it took a long time for them to actively do something about it.

The positives are clear...in the incredibly unlikely situation a pitcher is unable to defend himself on the mound and is hit in the head with a ball, he is substantially less likely to suffer a severe injury or death.  The negatives are...well, that the pitcher looks silly.  Pitchers simply get hit in the head and get injured.  Something should be done to protect them.  It makes sense, it is cheap to do, it likely has no effect on performance, and the only true cost is people getting over it looking silly.

05 September 2012

Expanded Roster: Why the Orioles are Possibly Better Than Their Season Run Differential

When the Orioles expand their roster, so do we.  Click here to find all of Camden Depot's Expanded Roster entries for 2012.  2011 Expanded Roster items can be found here.  As always, feel free to provide the Depot with suggestions for posts or with your own interest in writing an item or several to be posted here.

note: This post was filed before last night's 12-0 drubbing of the Blue Jays.  However, I think what this post is getting at is still quite applicable.

Why the Orioles are possibly better than their season run differential
by Matt Perez 
One of the common claims about the Orioles this season is that they are defying their peripherals. A common metric to determine the quality of a team uses their run differential. When a team has been outscored, they are expected to win fewer than half of their games. When one looks at the Orioles run differential, we see that despite being outscored the Orioles have won considerably more than half of their games and have won eleven games more than expected as of September 4th. One could look at this and decide that the Orioles have simply been lucky so far. Others claim that the Orioles have outplayed their run differential for a variety of reasons such as because they've played a lot of close games and have lost many blowouts.
Suppose that a random team plays a three game series. They lose the first game 16-2 and win the next two games 2-0 each. According to how run differential is used to determine a team’s performance, one would expect this team to be swept. This example illustrates the perils of run differential because using this method would cause one to inaccurately give too much weight to the first game at the expense of the other two games. If the Orioles' run differential is skewed because they've lost many blowouts and won many close games, then run differential would not give us an accurate picture of the Orioles' performance because it would be giving too much weight to the blowout losses and too little weight to the close wins.
I decided to test this hypothesis. I first used data provided free of charge from and copyrighted by Retrosheet (interested parties may contact Retrosheet at "www.retrosheet.org") that had among other stats the scores of every game played from 2000 to 2011, the date the game was played, and the teams playing. With this data, I am able to determine the historical accuracy of any models that I build. If something has been true for that period of time, it seems plausible that it would be accurate for the current season.
Using this data I was able to create what I call a per-game Pythagorean win percentage. Instead of using the run differential per season, I determined the run differential per game for each team. Doing this ensures that regardless of the number of runs scored in a game, each game still has the same value. This method ensures that a team that won many low scoring games while losing many high scoring games would get proper credit. After I determined the value for each game, I took the mean for each team and season to get the per-game Pythagorean winning percentage. For games in which no runs were scored, a team was given a zero percent chance of winning.
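
As a rough sketch of the difference between the two approaches, the code below runs the three game series from the example above (a 16-2 loss and two 2-0 wins) through both a season-level and a per-game Pythagorean calculation. The function names and the exponent of 2 are assumptions made for illustration; the post does not say which exponent was used.

```python
def pythag(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean win expectation; the exponent of 2 is an assumption."""
    total = runs_scored ** exponent + runs_allowed ** exponent
    if total == 0:
        # No runs scored by either side: treated as a zero percent chance,
        # following the rule stated in the post.
        return 0.0
    return runs_scored ** exponent / total


def season_pythag(games, exponent=2.0):
    """Usual approach: pool all runs scored and allowed, then apply the formula."""
    scored = sum(rs for rs, _ in games)
    allowed = sum(ra for _, ra in games)
    return pythag(scored, allowed, exponent)


def per_game_pythag(games, exponent=2.0):
    """Per-game approach: apply the formula to each game, then take the mean."""
    return sum(pythag(rs, ra, exponent) for rs, ra in games) / len(games)


# (runs scored, runs allowed) for the hypothetical three game series.
series = [(2, 16), (2, 0), (2, 0)]

print(f"Season-level: {season_pythag(series):.3f}")
print(f"Per-game:     {per_game_pythag(series):.3f}")
```

Under these assumptions the pooled figure lands near .123 while the per-game figure lands near .672, which is much closer to the team's actual two wins in three games.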
Once I have historical data for the metric, it is necessary to ensure that the metric effectively measures actual win percentage and that there is a significant correlation. If it is less precise than per season run differential or if there's no correlation between this stat and actual win percentage, then this proposed metric would tell me little about the Orioles' past performance. Using data from the 2000 to 2011 seasons, I did a correlation analysis between actual win percent, Pythagorean win percent and per-game Pythagorean win percent. I discovered that while Pythagorean win percent has a correlation of .93765 to actual win percentage, per-game Pythagorean win percent has a correlation of .97645 to actual win percentage. This is a very strong correlation and indicates that per-game Pythagorean win percent is a more precise metric than per season Pythagorean win percent.
When I determined the Orioles' per-game Pythagorean win percent, I discovered that they should be expected to win 69 (69.2) games. While this doesn't fully explain the Orioles' performance, it does indicate that the Orioles are better than the per season run differential winning percentage suggests and that losses in high scoring games are making them look worse than they are. If this is in fact the case, then the Orioles should be favored to hold onto the second wild card and make it to the playoffs.
At the very least the last series of the Orioles regular season is against Tampa. I would expect a playoff berth to be at stake.

04 September 2012

Indications of the Orioles Winning Over 90 Games

People tend to repeat some variation of Bill Parcells' epic quote: you are what your record says you are.  I think more accurately it is that you were what your record says you were.  Wins and losses are one way to measure the talent of a team.  They are certainly the most definitive way to describe what happened, but for various reasons (e.g., a non-skill based succession of events) they are not a great predictor of the future.  Though they are not bad.

In a previous post about six weeks ago, I illustrated that first half run differential was a slightly better indicator of second half record than first half record was.  The primary issue with using record is that an inflated record can result in a team trying to play out their good hand by adding complementary pieces, increasing their talent level.  First half run differential can also be misleading because a good differential combined with an underperforming record could result in a team selling off their assets.

As we enter September, something new could be happening.  Many teams, such as the Twins or Astros, see this time as a chance to look at their younger players and get a read on them against MLB competition.  These teams still want to win, but they often put out something less than their best lineup.  As a result, September can be a pretty peculiar month.

For today, I will be assuming that a team's record to date is an accurate representation of their performance.  I will be trying to determine how many more wins the Orioles will add to their current 75.  To do this, I used Bill James' log5 method.  We get the following table:

Opponent   Games Left  Win Pct.  ExWins
Blue Jays  6           0.448     3.7
Rays       6           0.548     3.1
Red Sox    6           0.456     3.6
Yankees    4           0.567     2.0
A's        3           0.567     1.5
Mariners   3           0.485     1.7

Log5 puts the Orioles at a 15.6-12.4 record for the remainder of the season.  This puts the expectation for the team at 90 wins (91 if you round up).  Last year, Boston would have been the second wild card and they had 90 wins.  In other words, Baltimore is in a strong position.  Within the AL East, the Yankees have largely the same slate with divisional opponents, along with the A's and Twins as their out of division series.  The Rays have it a bit harder as their out of division foes include the Rangers and White Sox, which are both playoff contending teams.
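
For those curious how the table comes together, here is a minimal sketch of the log5 calculation using the opponent winning percentages and games left listed above.  The Orioles' own winning percentage is not given in this post, so the sketch assumes 75 wins in 134 games (162 minus the 28 games remaining); treat that as my assumption rather than the exact input used.

```python
def log5(p_a, p_b):
    """Bill James' log5: chance that team A beats team B given win percentages."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)


# Assumption: 75 wins to date with 28 games left implies 134 games played.
ORIOLES_PCT = 75 / 134

# (opponent, games left, opponent win pct.) copied from the table in this post.
SCHEDULE = [
    ("Blue Jays", 6, 0.448),
    ("Rays", 6, 0.548),
    ("Red Sox", 6, 0.456),
    ("Yankees", 4, 0.567),
    ("A's", 3, 0.567),
    ("Mariners", 3, 0.485),
]

expected = {
    team: games * log5(ORIOLES_PCT, pct) for team, games, pct in SCHEDULE
}
total_expected = sum(expected.values())

for team, wins in expected.items():
    print(f"{team:<10} {wins:.1f}")
print(f"Total      {total_expected:.1f}")
```

Summing the unrounded values gives roughly 15.5 expected wins; adding up the rounded figures from the table is what produces 15.6.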

It should be an exciting month.

03 September 2012

Expanded Roster: Optimism for Nick Markakis?

Orioles Magic: Optimism for Nick Markakis?
by Albert Lang

Arbitrary end points, selection bias to suit a narrative, et cetera…whatever. See below:

Carl Yastrzemski side-by-side with Nick Markakis through age 26 seasons

*Above, and all subsequent charts, pulled from Fangraphs

At one point in time, a reasonable man could argue that Nick Markakis was on a Hall of Fame trajectory and that his career to date looked incredibly similar to one of the best to ever play the game: Yaz.

That said, certainly expecting any sort of linear or similar development from ballplayers, especially young ones, is faulty. However, it’s human nature to assume a commonality among players with this many plate appearances at such a tender age, yet the difference between Carl Yastrzemski’s age 27 season and Nick Markakis’s could not be wider.

In 1967, Yaz hit 44 HRs and batted .326/.418/.622. Yaz won the Triple Crown and led the league in total bases, runs, hits, OBP and slugging percentage.

In 2011, Nick Markakis hit 15 HRs and batted .284/.351/.406. He posted a worse slugging percentage than Yaz did OBP (wrong denominators, I know).

Unfortunately, given Yaz played a long time ago, it’s hard to see how/why he succeeded while Markakis “failed.” Yaz was a good player through his age 26 season, but he had only averaged 16 HRs a year – a far cry from the 44 he would hit as a 27-year-old. The following season (1968), Yaz hit just 23, before posting back-to-back 40 HR seasons. Following his last 40 HR season, though, Yaz averaged 18 HRs from 1971-1979.

In Yaz’s first six years, he had one 20 HR season; Markakis had two 20 HR seasons in his first three years. It wasn’t as if Yaz was hitting more doubles, either, as he collected just 21 more through age 26 than Markakis did.

Even if you were a pessimistic Orioles fan (like me), you could see a glimmer of hope leading into 2011 with Markakis, especially when looking at his similarities with Yaz. Heck, even if Markakis was half the player at 27 that Yaz was, he’d be a 6 fWAR guy.

That said, a baseball observer without much skin in the game or need for Orioles hope (magic) would have simply cited the decline of Markakis as reason why he’d come nowhere near duplicating Yaz’s tremendous season. There is no denying that Markakis had a three year decline in ISO and went from 12% (2008) to 8% (2009) to 6.1% (2010) HR/FB rates. In addition, after hitting 23, 20 and 18 HRs from 2007-2009, Markakis hit 12 in 2010.

However, Yaz had similar up and down power numbers: after hitting 19 HRs in 1962, Yaz hit 14 and 15 the following years – not quite a rock bottom 12, but still eons away from 40+.

Yaz did smack 20 dingers in 1965, but followed that up with just 16 in 1966. Basically, aside from 1965, his ISO was pretty much perched around .155, i.e., before 1966 it would have been hard to argue that Yaz was a budding slugging juggernaut cresting toward MVP world-beater status.

So, if there wasn’t much difference between their HRs, doubles or bulk power (ISO), how did Yaz succeed while Markakis “failed?”

Looking at the small chart above, the biggest difference in the two players occurs in walk and strike-out rates.

As Yaz matured, he walked more and struck out less. While, in 2008, Markakis showed exceptional walk skills (as Yaz did in his third year), Markakis was unable to maintain anything close to that approach. Furthermore, Markakis paired his walk rate with a much higher K%, while Yaz walked more and struck out less. Yaz took a step back after his third season, with his K% > BB%, but it’d be the last time he struck out at higher rates than he walked. Meanwhile, Markakis nearly halved his walk rate after his break-out plate discipline season, coming nowhere near maintaining that level of success.

Quite simply, while the power, doubles or batted ball rates weren’t all that different, it appears their approach was. Yaz quickly became a walk machine, who didn’t strike out. Markakis didn’t.

That said, there is some underlying optimism with Markakis, albeit in some small sample size mumbo-jumbo.

In just 304 plate appearances this season, he is posting his lowest K%. He’s also posting a tiny 3.9% swinging strike rate (the average is 9% this year and has been around 8.5% since 2006). In addition, his 12.5% HR/FB rate is nearly identical to his 2008 year, when he was a 6.3 fWAR player (i.e., half Yaz) (although a lot of value in that comes from defense). Lastly, his ISO (.173) would be the third best of his career and the first time he bested .160 since 2008.

Will Markakis hit 40 HRs in 2013? Doubtful. But, don’t be shocked with 30. Heck, he could hit 16 this year in 530 plate appearances (ZiPS projections). It’d be a cagey bet to put your money behind Markakis with 20+ HRs in 2013.

So, it seems there might be a bit of magic left in Markakis. But is there Hall of Fame magic?

Through AGE 28 Seasons


This is a pretty interesting cluster of players. Winfield is the only Hall of Famer here, but legitimate (although failing) arguments can be made for Reggie Smith and Harold Baines and, to a lesser extent, Jack Clark being enshrined.

Can Nick Markakis become the next Dave Winfield? He is ahead of Winfield’s hits pace, and this takes into account that Markakis doesn’t have a full age 28 season. Of course, Markakis has nowhere near the speed or power that Winfield possessed. In fact, Markakis is overshadowed on this list when it comes to power by most players. Aside from Carney Lansford, Markakis has the lowest slugging percentage and ISO – and Lansford was an infielder.

That said, Markakis has the best OBP, owing to a decent walk rate and possessing the best batting average.

Production from Age 28 Season to the End of their Careers

After turning 28, this group averaged just over 200 HRs and 1,381 hits. That would give Markakis at least 314 HRs and 2,537 hits for his career. In addition, doing some quick math gets this group to a .283 batting average and .376 OBP. Each one of these players increased their BB% after their age 28 season. So there’s some optimism that Markakis can start walking more, i.e., what Yaz accomplished in the earlier stages of his career.

If Nick Markakis finished his career with a .292 average, .373 OBP, 314 HRs and 2,537 hits, Orioles fans should be pleased and not overly shocked. Even if he comes up a bit short on those numbers, he’ll have been the best player the Orioles developed since Mike Mussina.