03 July 2014

Revisiting the Inside Edge Data

Earlier this year Fangraphs began sharing some Inside Edge fielding data with the public and wrote this article about it at the time. I hadn't thought much about it since then, so I was a bit surprised to see some interesting Inside Edge data while doing research for my previous post on Manny Machado. The table below shows Manny Machado's Inside Edge fielding results for 2012, 2013 and 2014. The data in this table and all other tables is accurate as of June 16th, 2014.

Manny Machado has played 1,000 fewer innings in 2014 than he did in 2013, but he has seen more impossible plays this year than last year and only one fewer remote play. This made me wonder whether this is a league-wide trend or something unique to Manny.

I downloaded all the Inside Edge data from Fangraphs for 2012 through 2014 and totaled the number of plays in each category by position and year to see whether there are any differences. To ensure that I was comparing apples to apples, I included two rows for 2014. The first row has the current 2014 data, while the second row projects the full-season data assuming that there are 43,500 innings in the 2014 season. Since some games go into extra innings and others are cut short by rain, each season has a different number of innings played.
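The full-season projection described above amounts to a linear scaling of the counts to date. Here is a minimal Python sketch of that arithmetic, assuming the projection is a simple pro-rating by innings. The function name `project_full_season` and the example counts are hypothetical and are not taken from the Fangraphs data; only the 43,500-inning figure comes from the post.

```python
FULL_SEASON_INNINGS = 43_500  # full-season assumption stated in the post


def project_full_season(counts, innings_so_far,
                        full_season_innings=FULL_SEASON_INNINGS):
    """Scale partial-season play counts linearly up to a full season."""
    if innings_so_far <= 0:
        raise ValueError("innings_so_far must be positive")
    factor = full_season_innings / innings_so_far
    # Round to whole plays, since a projected count is a number of chances
    return {category: round(n * factor) for category, n in counts.items()}


# Made-up counts, used only to show the arithmetic
example = project_full_season({"impossible": 100, "remote": 40},
                              innings_so_far=21_750)
print(example)  # {'impossible': 200, 'remote': 80}
```

A linear projection like this assumes that the mix of play difficulty stays the same for the rest of the season, which is itself an open question here.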

The data is somewhat unwieldy. To keep things simple, it makes sense to start with the results divided into infielders and outfielders by year.
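Rolling the per-position totals up into infield and outfield groups could look something like the sketch below. The row layout `(year, position, category, chances)`, the position abbreviations and the `POSITION_GROUP` mapping are all assumptions made for illustration; in particular, the post does not say how pitchers and catchers were classified, so this sketch simply leaves them out.

```python
from collections import defaultdict

# Assumed grouping; pitchers and catchers are deliberately left unmapped
POSITION_GROUP = {
    "1B": "infield", "2B": "infield", "3B": "infield", "SS": "infield",
    "LF": "outfield", "CF": "outfield", "RF": "outfield",
}


def totals_by_group(rows):
    """Sum chances by (year, infield/outfield group, difficulty category)."""
    totals = defaultdict(int)
    for year, position, category, chances in rows:
        group = POSITION_GROUP.get(position)
        if group is None:
            continue  # skip positions with no assumed group, such as P or C
        totals[(year, group, category)] += chances
    return dict(totals)


# Made-up rows, used only to show the grouping
rows = [
    (2014, "3B", "impossible", 5),
    (2014, "SS", "impossible", 7),
    (2014, "CF", "impossible", 2),
    (2014, "P", "impossible", 1),
]
print(totals_by_group(rows))
```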

Infielders have already had more impossible chances in 2014 than they did in 2012 or 2013. At the current pace there will be three times as many impossible plays in 2014 as there were in 2012 or 2013. Infielders are also on pace to have more remote and unlikely plays in 2014 than in either 2012 or 2013. Overall, infielders are on pace to have only 2,000 more chances in 2014 than in either 2012 or 2013.

Outfielders are on pace to have far fewer impossible chances in 2014 than they did in 2012 or 2013. They are on pace to have more remote, unlikely, even and likely chances and fewer certain chances than they did in 2012 or 2013. Outfielders should be expected to have 4,000 fewer chances in 2014 than they did in 2012 or 2013. This indicates that infielders are fielding more plays at the expense of outfielders.

The results from 2012 and 2013 are relatively similar to each other. This makes it all the more curious that the 2014 results look so different.

Here’s how the years stack up without considering position.

The total number of plays is roughly the same from 2012 to 2014. There are far fewer impossible and certain plays in 2014 than there were in 2012/2013, and more remote and unlikely plays in 2014 than there were in 2012/2013. These trends are consistent by year and position. If someone wants to see the chart for each position and year they can access it here.

One reason that may explain these trends could be conversion rates. If fielders are successfully fielding more remote chances then that could indicate that more tough remote chances are being considered impossible while more tough unlikely chances are being considered remote. This could potentially explain why impossible and remote chances have seen an increase. Likewise a decrease in certain chances could be explained by an increase in certain chance conversion.

Here’s a chart by position and year:

These results do not back up my theory. They indicate that fewer remote chances have been successfully fielded by infielders and that there certainly isn’t an increase in the success rate for outfielders. Players are converting certain chances at the same rate in 2014 as they did in 2012 and 2013.

Fangraphs claims that a remote chance should be converted 1-10% of the time, an unlikely chance 11-40% of the time, an even chance 41-60% of the time, a likely chance 61-90% of the time and a certain chance 91-100% of the time.

Given that this is the case, it is curious to see that at many infield positions there were entire seasons where even chances were converted more than 60% of the time. Centerfielders consistently convert between 13% and 15% of all remote chances. Pitchers also convert between 10% and 20% of remote chances as well as 60% to 70% of all even chances. This suggests that there is bias or possible error in the dataset. It could also suggest that fielders are getting better over time and that difficult plays are being converted more routinely.
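The comparison above can be made mechanical: compute the observed conversion rate for a category and check it against the range Fangraphs quotes for that category. The Python sketch below does this under a few assumptions. The function names are hypothetical, the ranges are treated as inclusive at both ends, and the 14-of-100 example is an illustrative figure in the spirit of the centerfield numbers above, not a value from the dataset.

```python
# Expected conversion ranges quoted in the post, as fractions (low, high)
EXPECTED_RANGES = {
    "remote": (0.01, 0.10),
    "unlikely": (0.11, 0.40),
    "even": (0.41, 0.60),
    "likely": (0.61, 0.90),
    "certain": (0.91, 1.00),
}


def conversion_rate(made, chances):
    """Return plays made divided by chances, or None when there are no chances."""
    if chances == 0:
        return None
    return made / chances


def check_bucket(category, made, chances):
    """Report whether the observed rate sits below, within or above its range."""
    rate = conversion_rate(made, chances)
    if rate is None:
        return "no chances"
    low, high = EXPECTED_RANGES[category]
    if rate < low:
        return "below range"
    if rate > high:
        return "above range"
    return "within range"


# Illustrative only: 14 remote chances converted out of 100
print(check_bucket("remote", 14, 100))  # above range
```

A season-long rate outside its range does not by itself show which explanation is right. It only flags the position and category for a closer look.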

If we look at the conversion data via infield/outfield and years then it looks like this.

In 2013, outfielders converted more than 10% of all remote chances.

Infielders are converting considerably fewer remote chances in 2014 than they did in 2012/2013 while converting more even chances. There is no difference in the conversion rates for outfielders.

I admit to being perplexed by these results. The 2014 data seems to have considerable differences from the 2012 and 2013 data. It appears that the data for Manny Machado is not an aberration. It is perhaps possible that as the season progresses there will be more certain chances, fewer impossible and remote chances for infielders and more impossible chances for outfielders.

I wonder whether Inside Edge has changed its guidelines to make infielders responsible for more of the outfield. I would expect many balls hit between the infield and outfield to be impossible for either fielder to defend, which could explain why there are many more infield impossible chances and fewer outfield impossible chances. This is just speculation.

There also appears to be some bias in the ratings. It is relatively simple to use a computer program to determine the likelihood of an event happening. It is far more complicated to have a person give you an accurate percentage. One of the strengths of using a zone-based system to quantify defense is that it minimizes human estimation. A computer formula will return a consistent result each time while even the best trained human eyes may disagree about the difficulty of a play. The fact that there's bias may explain why UZR had a larger year-to-year correlation than the Inside Edge data.

It appears that there are differences between the 2012/2013 and 2014 data. Until they are resolved it may not make sense to compare historical Inside Edge results to current Inside Edge results.


Ryan Solonche said...

Interesting numbers and certainly a weird spike for 2014. Could the increase in league-wide defensive shifts affect this data? Or, does a play go from "remote" to "impossible" due to a batter hitting against the shift, with the out still being recorded?

Matt Perez said...

I suppose it's possible that league-wide defensive shifts affect the data.

Inside Edge uses fielder positioning to determine the chances of making a defensive play. They would take the shift into account: if a hitter hit a ball that normally wouldn't be caught but is routine with the shift in effect, then the play would be rated routine, and vice versa.

That's a reasonable hypothesis but I wouldn't know how to create a test for it.

Berdj J. Rassam said...

The Orioles are off to a good start but need to make improvements - they are middle of the pack offensively, but bottom third in pitching - problems exist with both starting pitching as well as relief.