Some of these new approaches required new technology. One such approach is Statcast, which uses multiple cameras to identify elements such as the players, the baseball, and the bat. This is combined with radar data, and the reward is a ton of data. While one can use this data to evaluate pitchers, baserunners, and fielders, we will be using it in this column to discern ability in hitters. Specifically, we will ask whether the exit velocity of a batted ball can be related to the power metric isolated power (ISO), and then whether one or two seasons of exit velocity data can be used to accurately project future performance.
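For readers who have not worked with the metric, isolated power is conventionally defined as slugging percentage minus batting average, which reduces to extra bases per at-bat. A minimal sketch of that arithmetic follows; the function name and the stat line are made up for illustration and are not taken from any player discussed here.

```python
def isolated_power(at_bats, doubles, triples, home_runs):
    """ISO = SLG - AVG, which simplifies to extra bases per at-bat."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    # A double adds one extra base, a triple two, a home run three.
    extra_bases = doubles + 2 * triples + 3 * home_runs
    return extra_bases / at_bats


# Hypothetical stat line, invented purely for illustration.
example_iso = isolated_power(at_bats=500, doubles=30, triples=2, home_runs=25)
print(round(example_iso, 3))
```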
Now, the first step in figuring out how useful these measurements might be is to compare them in season. I only looked at players who had at least 300 plate appearances in both 2015 and 2016. What we find is that average hit distance (forgive the error in the graphic below; it is average hit distance, not home run distance) and barrel rate correlate very strongly with isolated power in the same year (p < 0.01 for both variables). This means that these two measures of velocity and contact quality are related in-season to isolated power. The regression model connecting those measurements to isolated power was also significant (p < 0.01).
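The fitted coefficients are not reported in this column, so the following is only a sketch of how an expected-ISO model of this shape could be fit: an ordinary least squares regression of same-season ISO on average hit distance and barrel rate. The function names are hypothetical, and the toy numbers are generated from an invented linear rule so that the fit can be checked; they are not Statcast data. It does not compute the p-values mentioned above.

```python
def fit_linear_model(rows, targets):
    """Ordinary least squares with an intercept, via the normal equations.

    rows    -- list of predictor tuples, e.g. (avg_hit_distance, barrel_rate)
    targets -- list of observed values, e.g. same-season ISO
    Returns [intercept, coef_1, coef_2, ...].
    """
    design = [[1.0, *row] for row in rows]
    k = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("predictors are collinear; model cannot be fit")
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(coefs, row):
    """Expected ISO for one hitter given the fitted coefficients."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


# Toy data: (average hit distance in feet, barrel rate in percent).
# The ISO values come from an invented linear rule, not from real players.
toy_rows = [(180.0, 4.0), (195.0, 9.0), (210.0, 6.0), (225.0, 14.0), (200.0, 11.0)]
toy_iso = [-0.20 + 0.0015 * d + 0.008 * b for d, b in toy_rows]

coefs = fit_linear_model(toy_rows, toy_iso)
expected_iso = [predict(coefs, row) for row in toy_rows]
```

Because the toy ISO values are exactly linear in the two predictors, the fit recovers the invented coefficients. Real data would leave residuals, which is what the rest of the column examines.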
Below is a graph comparing expected ISO with ISO for 2015 with the accompanying R2 value:
So all of this informs us that hit quality is connected to isolated power. That should be obvious, but it is helpful to be able to see it here. However, what we are really interested in is whether these values are meaningful from one year to the next. In other words, is this simply a descriptive correlation, or is it a predictive one?
We will do a very simple comparison of two R2 values: expected ISO from the model developed on 2015 data against actual 2016 ISO, and actual 2015 ISO against actual 2016 ISO. This will help show whether the formula using Statcast measurements correlates better with next season's values than simply using the actual ISO from the year before. The comparison between 2015 ISO and 2016 ISO is not shown, but its R2 was 0.5026.
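A rough sketch of that comparison is below. It treats R2 as the squared Pearson correlation between a predictor and next season's ISO, which is an assumption about how the values here were computed. The helper names are hypothetical, and the numbers in the usage lines are invented to exercise the functions, not results from this analysis.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def relative_improvement(r2_baseline, r2_candidate):
    """Fractional gain of the candidate predictor over the baseline."""
    return (r2_candidate - r2_baseline) / r2_baseline


# Invented values for illustration only.
prior_iso = [0.120, 0.180, 0.210, 0.150, 0.250]     # last season's actual ISO
expected_iso = [0.130, 0.170, 0.220, 0.160, 0.240]  # model output from last season
next_iso = [0.135, 0.165, 0.225, 0.155, 0.235]      # this season's actual ISO

r2_prior = r_squared(prior_iso, next_iso)
r2_expected = r_squared(expected_iso, next_iso)
gain = relative_improvement(r2_prior, r2_expected)
```

Whether an improvement is quoted as a relative gain, as computed here, or as an absolute difference in R2 changes the headline number, so it is worth stating which one is meant.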
What we find is that the Statcast method improves the predictive capability by about 15%. That is remarkable, but not earth-shattering. If your decision-making process was simply finding the players with strong ISO, then this technique would help, but it might take a decade or so for the benefit to show through the noise. In general, I do not find this to be much of a silver bullet. That said, it may serve as a tentative flag, suggesting which players should be expected to regress downward or upward.
Here is a list of players who the model thinks most underperformed. In other words, who does this model think should have had a bigger 2016 than they actually did?
One name that jumped out to me was Kendrys Morales. He had a solid year last year, but the model thinks it should have been considerably better. If the model better accounts for his talent, then we might see something closer to that expected isolated power. It may well be that playing in Kansas City depressed his value a bit and that some of his hard-hit balls should have fallen in. A different point of view would be that his isolated power was depressed because he is below average at converting singles into doubles. That might explain why Pujols is up here as well.
Here is a list of players who the model thinks most overperformed:
I would have thought that the model would list speedster after speedster, guys who stretch singles into doubles. That does not appear to be the case here. Many of these players are rather plodding. The closest Oriole on this list is Jonathan Schoop, who comes in at -.024, which is not a good thing to hear given how uneven and somewhat underwhelming his season was last year.
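Both lists come down to ranking hitters by the gap between expected and actual ISO. A small sketch of that ranking follows, assuming the gap is expected minus actual, so that a negative value such as Schoop's marks an overperformer; that sign convention is my reading of the numbers and is not stated outright. The player labels and values are placeholders.

```python
def rank_by_gap(players):
    """Rank hitters from most underperformed to most overperformed.

    players -- list of (name, expected_iso, actual_iso)
    Returns a list of (name, gap) with gap = expected - actual, so a
    positive gap means the model expected more power than was produced.
    """
    gaps = [(name, round(expected - actual, 3)) for name, expected, actual in players]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


# Placeholder hitters with invented numbers.
sample = [
    ("Player A", 0.210, 0.170),
    ("Player B", 0.150, 0.190),
    ("Player C", 0.180, 0.175),
]
ranked = rank_by_gap(sample)
underperformers = [name for name, gap in ranked if gap > 0]
overperformers = [name for name, gap in ranked if gap < 0]
```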
This made me wonder, though, about how things change over time. For instance, is over- or underproduction relative to expected batted ball performance a skill? Are over producers always over producers? What was remarkable was that the average difference between 2016's difference and 2014's difference was .011. The greatest difference was .046. This suggests that there is some element that I am missing. The ability to over or under produce appears to be repeatable, and so it likely reflects a skill. The next step is finding that skill.
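One way to check that kind of repeatability is to take each hitter's gap in two seasons and summarize how much it moved. The sketch below assumes the average and greatest differences quoted above are absolute changes per player, which the text does not spell out. The function name and the sample gaps are hypothetical.

```python
def gap_stability(gaps_year_a, gaps_year_b):
    """Summarize how much each hitter's expected-vs-actual gap moved.

    gaps_year_a, gaps_year_b -- dicts mapping player name to that season's gap
    Returns (mean absolute change, largest absolute change) over the
    players who appear in both seasons.
    """
    shared = sorted(set(gaps_year_a) & set(gaps_year_b))
    if not shared:
        raise ValueError("no players appear in both seasons")
    changes = [abs(gaps_year_b[name] - gaps_year_a[name]) for name in shared]
    return sum(changes) / len(changes), max(changes)


# Invented gaps for placeholder hitters; only A and B appear in both seasons.
season_one = {"Player A": 0.020, "Player B": -0.010, "Player C": 0.030}
season_two = {"Player A": 0.030, "Player B": -0.030, "Player D": 0.000}
mean_change, max_change = gap_stability(season_one, season_two)
```

Small changes relative to the spread of the gaps themselves would point toward a repeatable trait. A fuller check would also correlate the two seasons' gaps directly.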