09 June 2016

CRAP: Collegiate Regression Analysis Projection (2016 Draft)

Live look at how CRAP data is processed.
Baseball is a game for the math obsessed, whether you ponder runs batted in or the spin rate of a slider.  In the past decade, the number of data points available in the public forum has exploded.  Those data are almost exclusively recorded from Major League Baseball games, though some have trickled down to the Minor League sphere.  Amateur baseball, though, has only raised itself up to where MLB was a couple of decades ago: we know the basic counting statistics.

Projecting performance with high resolution data can be troublesome.  Replace that with low resolution collegiate data and the process becomes more problematic.  Combine that with twenty-year-old players whose current package can sit rather far from their final one, and you have a difficult task in front of you.  This led me to experiment, with minimal expectations, to see if we can glean anything of use from a projection model for collegiate players.  The resulting model is named Collegiate Regression Analysis Projection (CRAP).

CRAP is a rather limited model with great uncertainty.  I used player, park, and team data from collegiate seasons 2012-2014 and paired it with professional data from 2013-2015.  Professional data was further translated so that everything was measured against the league average of the Carolina League (High A ball).  That is all this model knows.  It knows players who were drafted, signed, and played in a professional league the year after their collegiate career ended.  It does not know what happens beyond that following season.  All it knows is the current year and the next, and it tries to connect those performance data points for each individual player.
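To make the pairing step a bit more concrete, here is a minimal sketch of how a college season might be matched with the same player's professional season one year later, with the pro number rescaled toward a Carolina League baseline.  The `Season` record, the ratio-of-league-averages rescaling, and every number in it are assumptions for illustration only; this is not the actual CRAP code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Season:
    player: str
    year: int
    league: str  # e.g. "NCAA", "Carolina", "South Atlantic"
    obp: float


# Hypothetical league-average OBPs, placeholders only (not real figures).
LEAGUE_AVG_OBP = {"Carolina": 0.336, "South Atlantic": 0.320}


def to_carolina_scale(pro: Season) -> float:
    """Rescale a pro OBP by the ratio of league averages (an assumed method)."""
    return pro.obp * LEAGUE_AVG_OBP["Carolina"] / LEAGUE_AVG_OBP[pro.league]


def pair_seasons(college, pro):
    """Match each college season with the same player's pro season one year later."""
    pro_by_key = {(s.player, s.year): s for s in pro}
    pairs = []
    for c in college:
        nxt = pro_by_key.get((c.player, c.year + 1))
        if nxt is not None:  # players with no pro season the next year are dropped
            pairs.append((c, to_carolina_scale(nxt)))
    return pairs


if __name__ == "__main__":
    # Made-up players and stat lines, purely to exercise the functions.
    college = [
        Season("Player A", 2013, "NCAA", 0.410),
        Season("Player B", 2014, "NCAA", 0.380),
    ]
    pro = [Season("Player A", 2014, "South Atlantic", 0.300)]
    for c, translated in pair_seasons(college, pro):
        print(c.player, c.obp, round(translated, 3))
```

Note that "Player B" silently falls out of the sample because he has no pro season the following year, which is the same blind spot described for the model: it only ever sees players who signed and played.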

One limitation is that the model never considers players who did not play professionally, for whatever reason.  There is also some limitation on conference data.  This system leans heavily on the ACC, Pac-12, SEC, and the Big 12 or whatever they call themselves this minute.  I also have no knowledge of any player's defensive ability.  The model assumes each player performs as an average defender at his listed position.  Finally, the base model does not consider how a player performs with a wood bat.  To account for that, I did a secondary adjustment for performance at the Cape and for Team USA.  That adjustment is a bit weak, as we are getting pretty far down the rabbit hole from the original data.

Anyway, here is the collegiate position player board, which considered players in Baseball America's top 100 from a few weeks back (it has changed a bit).  If you want a player not mentioned here, let me know and I will try to post an addendum.

CRAP 2016 Draft Board (College Position Players)
Avg/OBP/Slg are 2017 HiA Projections
20-80 scale is unadjusted for wood
20-80+ is adjusted for wood

PLAYER  POS  AVG OBP SLG 20-80 20-80+
Kyle Lewis OF .290 .424 .473 61 69
Will Craig 3B .291 .413 .483 63 59
Zack Collins C .294 .446 .467 69 59
Heath Quinn OF .278 .376 .459 51 58
Sean Murphy C .273 .378 .414 51 54
Matt Thaiss C .295 .392 .428 54 54
Nick Senzel 3B .281 .374 .421 47 53
Corey Ray OF .276 .360 .425 46 51
Chris Okey C .270 .368 .419 51 48
Bryan Reynolds OF .268 .369 .429 48 45
Anfernee Grier OF .265 .349 .403 41 39
Nick Banks OF .255 .324 .391 35 35
Jake Fraley OF .261 .349 .367 34 33
Ryan Boldt OF .249 .310 .362 29 31
Stephen Wrenn OF .246 .314 .361 29 31
Bryson Brigman SS .257 .323 .343 30 30
Buddy Reed OF .233 .318 .354 31 30
Bobby Dalbec 3B .214 .292 .346 28 29
Errol Robinson SS .245 .309 .339 28 28
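For anyone who wants to play with the board, the sketch below parses the rows above, computes how far the wood adjustment moved each grade, and ranks players by the wood-adjusted grade.  It assumes the ranking is simply the board sorted by the 20-80+ column with ties left in board order; that matches the order shown here, but the helper name `parse_board` and the tie handling are my own guesses rather than anything from the model itself.

```python
BOARD = """\
Kyle Lewis OF .290 .424 .473 61 69
Will Craig 3B .291 .413 .483 63 59
Zack Collins C .294 .446 .467 69 59
Heath Quinn OF .278 .376 .459 51 58
Sean Murphy C .273 .378 .414 51 54
Matt Thaiss C .295 .392 .428 54 54
Nick Senzel 3B .281 .374 .421 47 53
Corey Ray OF .276 .360 .425 46 51
Chris Okey C .270 .368 .419 51 48
Bryan Reynolds OF .268 .369 .429 48 45
Anfernee Grier OF .265 .349 .403 41 39
Nick Banks OF .255 .324 .391 35 35
Jake Fraley OF .261 .349 .367 34 33
Ryan Boldt OF .249 .310 .362 29 31
Stephen Wrenn OF .246 .314 .361 29 31
Bryson Brigman SS .257 .323 .343 30 30
Buddy Reed OF .233 .318 .354 31 30
Bobby Dalbec 3B .214 .292 .346 28 29
Errol Robinson SS .245 .309 .339 28 28
"""


def parse_board(text):
    """Turn each board line into a dict; the name is everything before POS."""
    rows = []
    for line in text.strip().splitlines():
        *name, pos, avg, obp, slg, raw, adj = line.split()
        rows.append({
            "player": " ".join(name),
            "pos": pos,
            "avg": float(avg),
            "obp": float(obp),
            "slg": float(slg),
            "raw": int(raw),   # 20-80, unadjusted for wood
            "adj": int(adj),   # 20-80+, adjusted for wood
        })
    return rows


rows = parse_board(BOARD)

# sorted() is stable, so players tied on the adjusted grade keep board order.
ranked = sorted(rows, key=lambda r: -r["adj"])

for rank, r in enumerate(ranked, start=1):
    wood_delta = r["adj"] - r["raw"]
    print(f"{rank:2d} {r['player']:<16} {r['adj']:3d} ({wood_delta:+d} from wood)")
```

The wood delta column is the quickest way to see who the secondary adjustment helped or hurt: Kyle Lewis gains 8 points, while Zack Collins gives back 10.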

Below are the CRAP rankings alongside BA's current rankings of college position players.  I did not include players who have moved into the top 100 since I scraped the list a few weeks ago.

PLAYER  CRAP  BA
Kyle Lewis 1 1
Will Craig 2 8
Zack Collins 3 4
Heath Quinn 4 9
Sean Murphy 5 12
Matt Thaiss 6 5
Nick Senzel 7 2
Corey Ray 8 3
Chris Okey 9 11
Bryan Reynolds 10 6
Anfernee Grier 11 10
Nick Banks 12 16
Jake Fraley 13 13
Ryan Boldt 14 14
Stephen Wrenn 15 19
Bryson Brigman 16 15
Buddy Reed 17 7
Bobby Dalbec 18 17
Errol Robinson 19 18

As it stands, the Orioles are largely connected with Buddy Reed, who stands high in the Baseball America rankings but is highly suspect in the CRAP rankings.
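If you want a single number for how much the two lists agree, one option is Spearman's rank correlation, along with a quick look at which player the lists disagree on most.  The sketch below is my own add-on and not part of the model; the ranks are copied from the CRAP and BA rankings above, and the formula used is the standard one for rankings without ties.

```python
# (player, CRAP rank, BA rank) copied from the rankings above.
RANKS = [
    ("Kyle Lewis", 1, 1), ("Will Craig", 2, 8), ("Zack Collins", 3, 4),
    ("Heath Quinn", 4, 9), ("Sean Murphy", 5, 12), ("Matt Thaiss", 6, 5),
    ("Nick Senzel", 7, 2), ("Corey Ray", 8, 3), ("Chris Okey", 9, 11),
    ("Bryan Reynolds", 10, 6), ("Anfernee Grier", 11, 10), ("Nick Banks", 12, 16),
    ("Jake Fraley", 13, 13), ("Ryan Boldt", 14, 14), ("Stephen Wrenn", 15, 19),
    ("Bryson Brigman", 16, 15), ("Buddy Reed", 17, 7), ("Bobby Dalbec", 18, 17),
    ("Errol Robinson", 19, 18),
]


def spearman_rho(pairs):
    """Spearman correlation for two tie-free rankings: 1 - 6*sum(d^2) / (n(n^2-1))."""
    n = len(pairs)
    d_squared = sum((crap - ba) ** 2 for _, crap, ba in pairs)
    return 1 - 6 * d_squared / (n * (n * n - 1))


rho = spearman_rho(RANKS)

# Positive gap: CRAP ranks the player lower than BA does.
biggest = max(RANKS, key=lambda row: abs(row[1] - row[2]))

print(f"Spearman rho: {rho:.3f}")
print(f"Largest disagreement: {biggest[0]} (CRAP {biggest[1]}, BA {biggest[2]})")
```

On these 19 players the correlation comes out to roughly 0.72, so the two boards broadly agree, and the largest single gap belongs to Buddy Reed at ten spots.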

2 comments:

Roger said...

CRAP seems to have done a bit better than BA at predicting order of draft. Nice work. Wish you had a model for pitchers......

Jon Shepherd said...

Thanks.

I might have a pitcher model together by the end of the summer.