13 November 2012

Measuring Bullpen Management: Bucking Up the Bullpen

Many of us believe that the Orioles' success in 2012 depended largely on the bullpen.  An fWAR breakdown shows that the bullpen (6.4 fWAR, 5th) ranked better than either batting (15.3 fWAR, 25th) or starting pitching (10.2 fWAR, 19th).  The bullpen was largely constructed from existing pieces, half of the return from the Guthrie trade, and a small late-winter free agent signing.  A more thorough discussion of the various pieces of the bullpen can be found in an article Jon Bernhardt wrote in August.  This post, however, will look at whether bullpen performance can be attributed to a manager.

Conventional wisdom has it that individual relief pitchers vary widely in performance.  These pitchers put up about 50 innings or so each year, which is not really enough time on the mound to statistically inform an assessment of future performance.  However, it may well be that a manager (perhaps together with his general manager) is able to create a high-performing bullpen on a relatively consistent basis.

A Hardball Times article suggested using WPA - WPA/LI as a measure of bullpen performance that would be useful in assessing bullpen management.  WPA stands for Win Probability Added.  It is calculated as the difference in win expectancy before and after an event.  LI is the Leverage Index.  It measures how consequential a specific scenario is based on the inning, outs, score, and the number and position of baserunners.  By using the two statistics in concert, you arguably have a context-neutral wins-added metric.
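To make the definitions concrete, here is a minimal Python sketch for a single hypothetical relief appearance.  The win expectancy and leverage values are invented for illustration, and the sketch assumes that WPA/LI is totaled by dividing each event's change in win expectancy by that event's LI before summing.

```python
# Hypothetical events: (win expectancy before, win expectancy after, leverage index).
# None of these numbers come from real games.
events = [
    (0.60, 0.68, 2.0),
    (0.68, 0.62, 1.5),
    (0.62, 0.75, 1.0),
]

def wpa(events):
    """Sum of the changes in win expectancy across events."""
    return sum(after - before for before, after, _ in events)

def wpa_li(events):
    """Sum of each event's change in win expectancy divided by its LI
    (assumed aggregation for the context-neutral figure)."""
    return sum((after - before) / li for before, after, li in events)

total_wpa = wpa(events)
context_neutral = wpa_li(events)
leverage_part = total_wpa - context_neutral  # WPA - WPA/LI

print(f"WPA          = {total_wpa:.2f}")
print(f"WPA/LI       = {context_neutral:.2f}")
print(f"WPA - WPA/LI = {leverage_part:.2f}")
```

In this toy appearance the reliever did most of his good work in the low-leverage event, so the gap between WPA and WPA/LI is small.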

To test whether this metric indicates that managers have a skill for using a bullpen well, I took the last five seasons of several recent managers.  I compared each manager's mean for WPA, WPA/LI, and WPA - WPA/LI using ANOVA.  WPA simply tells us whether certain teams wind up with better win outcomes based on reliever usage.  WPA/LI neutralizes all situations and measures reliever ability in general (similar to wOBA).  WPA - WPA/LI is used to tell us whether relievers actually perform better in clutch situations.  If one of these statistics indicates an actual skill, then the numbers associated with a manager should (1) be repeatable and (2) differ from one manager to another.  Of course, this assumes that these things are, in fact, measurable.
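For readers unfamiliar with ANOVA, the sketch below computes a one-way F statistic in plain Python.  It is only an illustration of the general technique, not the exact procedure used for this post: the three managers and their season values are made up, and the function name is my own.  Turning the F statistic into a p-value requires the F distribution, which a library routine such as scipy.stats.f_oneway provides.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a list of samples."""
    k = len(groups)                      # number of managers
    n = sum(len(g) for g in groups)      # total manager-seasons
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of manager means around the overall mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Season-to-season variation within each manager.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical bullpen totals for three managers over five seasons each.
manager_a = [1.0, 2.0, 3.0, 4.0, 5.0]
manager_b = [2.0, 3.0, 4.0, 5.0, 6.0]
manager_c = [5.0, 6.0, 7.0, 8.0, 9.0]

f_stat, df_b, df_w = one_way_anova_f([manager_a, manager_b, manager_c])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A large F means the gaps between managers are big relative to the season-to-season noise within each manager, which is the "repeatable and different" condition described above.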

Manager          WPA     WPA/LI  WPA - WPA/LI
Bob Melvin       3.75    0.83    2.92
Bruce Bochy      2.98    0.13    2.84
Joe Girardi      7.37    4.88    2.49
Buck Showalter   5.61    3.67    1.94
Terry Francona   5.56    3.69    1.88
Mike Scioscia    1.94    0.13    1.81
Ozzie Guillen    1.86    0.31    1.54
Joe Maddon       4.42    3.21    1.22
Ron Gardenhire   2.78    1.59    1.19
Charlie Manuel   2.60    1.49    1.11
Ron Washington   4.35    3.25    1.10
Bud Black        1.74    0.80    0.94
Dusty Baker      2.76    2.19    0.56
Joe Torre        2.81    2.28    0.53
Jim Leyland      1.16    0.90    0.26
Ned Yost         1.00    1.29    -0.29
Tony LaRussa     1.28    1.74    -0.46
Eric Wedge       -0.80   -0.15   -0.66
Bobby Cox        0.31    1.63    -1.31

Interestingly enough, neither WPA (p=0.12) nor WPA - WPA/LI (p=0.45) was found to differ significantly among managers in this study using this data.  However, WPA/LI was found to have significant differences within the population (p < 0.05).

So who typically has a good bullpen based on WPA/LI?

Joe Girardi      123~5~~~~~~
Terry Francona   12345678~~~
Buck Showalter   1234567~~~~
Ron Washington   ~234567~~~~
Joe Maddon       123456789~~
Joe Torre        ~234567890~
Dusty Baker      ~234567890X
Tony LaRussa     ~234567890X
Bobby Cox        ~234567890X
Ron Gardenhire   ~234567890X
Charlie Manuel   ~2~~567890X
Ned Yost         ~234567890X
Jim Leyland      ~~~~567890X
Bob Melvin       ~234567890X
Bud Black        ~234567890X
Ozzie Guillen    ~~~~~67890X
Bruce Bochy      ~~~~567890X
Mike Scioscia    ~~~~~~7890X
Eric Wedge       ~~~~~67890X

The above table shows groups of similar performance.  The bold numbers indicate how managers differ.  Most managers belong to their own specific number, but 7 (Baker, LaRussa, Cox, Gardenhire, Yost, Melvin, and Black), 9 (Leyland and Bochy), and 0 (Guillen and Wedge) are shared.  For instance, every manager with a 1 in his row is not significantly different from Joe Girardi, while Girardi is significantly different from groups 4 and 6 through X.  Eyeballing it, there are essentially three tiers:
Tier 1
Girardi, Francona, Showalter, Washington, and Maddon
All five of these managers get consistently good performance out of their bullpens and, as a whole, have significantly better performance than groups 9 through X.  Joe Girardi's pens have produced very well, making his group significantly better than all of the others except 1, 2, 3, and 5.  This actually provides a potential argument for leaving Washington out of this tier.  However, the relationships of his performance with the other managers' performances appear to fit best in this tier.  Only Showalter had multiple GMs in this grouping.

Tier 2
Torre, Baker, LaRussa, Cox, Gardenhire, Yost, Melvin, and Black
This group appears no different from anyone other than Joe Girardi's bullpens.  Torre, Yost, and Melvin are the managers here who worked under more than one GM.

Tier 3
Manuel, Leyland, Guillen, Bochy, Scioscia, and Wedge
The bullpen performances of this group are notable in that they are not included in some of the upper performing groups.  In fact, their performances are significantly worse than those of everyone in Tier 1 except for Francona and Maddon.  Only Wedge and Guillen managed under different GMs.
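
For anyone who wants to check the grouping table by machine rather than by eye, here is a small Python sketch.  It copies the rows from the WPA/LI grouping table and treats "~" as "not in that group"; the helper names are my own.  It only reproduces the two readings spelled out in the text: who carries a 1 (not significantly different from Girardi), and which groups are missing from Girardi's row.

```python
# Rows copied from the grouping table; "~" marks a group the manager is not in.
ROWS = {
    "Joe Girardi":    "123~5~~~~~~",
    "Terry Francona": "12345678~~~",
    "Buck Showalter": "1234567~~~~",
    "Ron Washington": "~234567~~~~",
    "Joe Maddon":     "123456789~~",
    "Joe Torre":      "~234567890~",
    "Dusty Baker":    "~234567890X",
    "Tony LaRussa":   "~234567890X",
    "Bobby Cox":      "~234567890X",
    "Ron Gardenhire": "~234567890X",
    "Charlie Manuel": "~2~~567890X",
    "Ned Yost":       "~234567890X",
    "Jim Leyland":    "~~~~567890X",
    "Bob Melvin":     "~234567890X",
    "Bud Black":      "~234567890X",
    "Ozzie Guillen":  "~~~~~67890X",
    "Bruce Bochy":    "~~~~567890X",
    "Mike Scioscia":  "~~~~~~7890X",
    "Eric Wedge":     "~~~~~67890X",
}
ALL_GROUPS = "1234567890X"

def groups_of(manager):
    """Set of group symbols that appear in a manager's row."""
    return {symbol for symbol in ROWS[manager] if symbol != "~"}

def members_of(group):
    """Managers whose row contains the given group symbol, in table order."""
    return [manager for manager, row in ROWS.items() if group in row]

ones = members_of("1")
girardi_missing = [g for g in ALL_GROUPS if g not in groups_of("Joe Girardi")]

print("Carry a 1:", ", ".join(ones))
print("Groups missing from Girardi's row:", " ".join(girardi_missing))
```

The output matches the reading given above: four managers carry a 1, and Girardi's row lacks 4 and 6 through X.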

The above study shows that good bullpen performance, when measured in a context-neutral way, does tend to relate to specific managers.  However, it is unclear exactly what leads these managers to have good bullpens.  It does not appear that certain managers are adept at putting their pitchers in certain scenarios to get the most out of them, but that may simply be a limitation in how I tried to measure bullpen management.

