- Small sample size. Think about it: how many plate appearances do you need to feel comfortable about a player's performance? It is probably somewhere in the 500-600 range where you begin to have a decent idea of how good a hitter is. Wieters has seen 916 first pitches in his career. He has seen 424, 397, and 385 pitches in 0-1, 1-0, and 1-1 counts. Every other situation is fewer than 257 pitches. As one goes from 1 pitch to 500 pitches, the data becomes more meaningful. However, you need to be quite open and aware that these are not hard, unwavering indications of ability. As such, it is NOT safe to say Wieters is a poor Major League hitter in 0-1, 0-2, and 1-2 counts. It MAY be safe to say that he has not performed well in those situations. There is a subtle but key point there. Past events correlate with ability, but past events are not ability. The smaller the sample, the weaker the correlation, typically.
- The poster presented Wieters' numbers within the context of eight Orioles and two other players. This is not an ideal comparison population when Baseball Reference provides whole-league data from which to derive an average level of performance. To understand how Wieters has performed, it is necessary to determine how that performance compares to the league average hitter. The point of doing this is to minimize the influence of peculiar collections of data. Additionally, shouldn't we also consider the scarcity of offense at his position? It seems that comparing a catcher's offensive production to that of positions with greater production might be a tad unfair.
- With the poster done with his findings, I think Tony Pente chose to present some statistics in a manner that could be confusing to his audience. Tony is someone who I find to be quite good at scouting players. I regularly check his views on Oriole minor leaguers. I trust what he sees over many in the industry who look at the Orioles with a less focused eye (e.g., John Sickels). However, I don't think Tony fully recognizes what statistics can and cannot do at times. In one paragraph, he notes that the pitch distribution over 29 plate appearances "tells me the book on Wieters is not to try and throw fastballs by him, but to get him out with offspeed stuff." Pitch distribution over 29 plate appearances is not a large enough sample. It can be too affected by pitchers faced, situations, and simple chance. His next paragraph acknowledges the meager nature of the data set with "It's early and the numbers won't be as skewed by year's end, but one thing is certain is that Wieters has been a terrible hitter when down in the count throughout his career and that pitchers are throwing him more and more offspeed pitches..." However, I am not sure what he means here. He writes that he is certain of one thing and then mentions two. First, I'd argue that without true population context, we don't know if he has been terrible or poor or whatever. Second, pitch distribution over 29 plate appearances should not make one certain that pitchers are throwing him "more and more off speed pitches." Wieters has barely seen a cutter. The same logic would dictate that scouting reports have determined that Wieters crushes cutters and no one will throw them to him. Markakis, Reynolds, and Vlad have also seen fewer fastballs. Has the scouting report changed on all of these guys? It would be nice if that data set were robust enough to use, but it is not. Tony might be absolutely correct, but he is not citing anything that supports his notion. This is a case of existing statistics being made to fit a narrative.
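To put a rough number on the sample-size concern raised in the points above, here is a minimal sketch of how the uncertainty around an observed rate shrinks as the number of observations grows. It is not a calculation from the original post: it assumes each observation is an independent trial with a fixed underlying rate and uses the textbook normal approximation for a proportion. The function name `proportion_margin` and the 50% rate are purely illustrative.

```python
import math

def proportion_margin(p, n, z=1.96):
    """Approximate 95% margin of error for an observed proportion p over n trials.

    Assumption: independent trials with a fixed rate (normal approximation).
    Real pitch data will not satisfy this exactly, so treat it as a rough guide.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical rate of 50%, observed over samples of different sizes.
for n in (29, 257, 600, 916):
    print(n, round(proportion_margin(0.5, n), 3))
```

Under these assumptions, an observed rate of 50% over 29 trials carries a margin of roughly 18 percentage points either way, while over 600 trials it falls to roughly 4. That is the difference between "could be almost anything" and "a decent idea."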
After the jump, I will re-explore the notion of Wieters' OPS once he reaches certain counts.
First, it is important to recognize what is league average to gain a proper appreciation of Wieters' performance. The following table shows league average OPS in 2010 for all possible counts:
As we probably expected, the closer you are to striking out, the worse the eventual OPS will be. The opposite is true for balls. It is important to note that not all 0-0 pitches are the same. Batters are selective, and that enables them to pick and choose pitches, so a higher OPS means a greater likelihood of a good pitch to hit. People sometimes misread this table as meaning that you need to be aggressive and avoid 0-1, or that you need to be patient and get to a 1-0 count. It does not mean that at all. The selectiveness of the batter is what should dictate whether to take or swing away.
So how does this compare to Matt Wieters?
Now remember, we are comparing the league average in 2010 to Matt Wieters' career numbers. There are two orders of magnitude more data points for the league population than for Wieters. That needs to be kept in mind. It also needs to be kept in mind that Wieters' .714 on 0-0 pitches is the only grouping for which we have even a slight ability to argue that it is robust. Nothing else. Acknowledging that, the graph shows some extremes from the average in the top left corner and the bottom corners. This can be more easily seen when we compute a crude(*) OPS+ value by dividing Wieters' line by the league average.
As you can see, I took an ugly table and made it uglier. Far uglier. Anyway, I color coded the table to show how counts differ from the league average. Anything more than 15 points away from the average was considered below average (blue) or above average (red). Again, small sample size prevents us from saying anything conclusive here. These are arbitrary designations. However, I think broad, weak assumptions can be made (as long as we are open to the possibility of a great deal of shift in the future). As you can see, his numbers do not nicely fit the expected model of the bottom left corner to the top right corner dictating above to below average performance. The 2-0, 2-1, and 3-1 counts stick out. I'd expect a more robust data set would smooth these out unless Wieters or opposing pitchers use approaches that cause these peculiar results.
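For readers who want to reproduce this kind of table, the ratio and the color coding can be sketched as follows. This is a minimal illustration, not the exact spreadsheet behind the table: it assumes the ratio is scaled so that 100 equals league average, and every OPS value below is a made-up placeholder rather than a figure from the tables. The function names are hypothetical.

```python
def crude_ops_plus(player_ops, league_ops):
    """Simple ratio of player OPS to league OPS, scaled so 100 = league average.

    No park factors or other adjustments are applied.
    """
    return 100 * player_ops / league_ops

def classify(ops_plus, band=15):
    """Label a value using the arbitrary band around 100 described above."""
    if ops_plus > 100 + band:
        return "above average"
    if ops_plus < 100 - band:
        return "below average"
    return "average"

# Placeholder OPS values by count (NOT real data).
player = {"0-2": 0.600, "1-1": 0.800, "3-1": 0.900}
league = {"0-2": 0.800, "1-1": 0.800, "3-1": 0.750}

for count in player:
    value = crude_ops_plus(player[count], league[count])
    print(count, round(value), classify(value))
```

Changing `band` shows how much the picture depends on an arbitrary cutoff, which is one more reason to read the colors loosely.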
How do his numbers look when we compare him only to other catchers?
It looks similar to the other graph, but it shows that his past performance has been rather average, assuming my arbitrary 15-point ranges are suitable. Again, these are small sample sizes, so we should expect some level of peculiar data clustering.
Finally, how does the frequency of Wieters' pitch count situations compare to the league in general?
I kept the color pattern from the catcher OPS+(*). What we see is that in the past, Wieters has found himself in bad count situations at around a league average rate. However, he has found himself in good count situations more often than the league average hitter. I'm not sure what this can definitively say about Wieters' approach at the plate. It appears that he might have an above average eye. He has tended to hold off on balls to a better degree than league average.
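The frequency comparison can be sketched the same way: convert raw pitch totals in each count into shares of all pitches seen, then divide the player's share by the league's. This is an assumed reconstruction of the approach, not the original worksheet, and the pitch totals below are invented placeholders. The function names are hypothetical.

```python
def count_shares(pitches_by_count):
    """Convert raw pitch totals per count into shares of all pitches seen."""
    total = sum(pitches_by_count.values())
    return {count: n / total for count, n in pitches_by_count.items()}

def share_ratio(player_pitches, league_pitches):
    """Player share divided by league share for each count, scaled so 100 = league rate."""
    player = count_shares(player_pitches)
    league = count_shares(league_pitches)
    return {c: 100 * player[c] / league[c] for c in player if c in league}

# Placeholder pitch totals (NOT real data).
player_pitches = {"0-2": 50, "1-0": 120, "3-1": 30}
league_pitches = {"0-2": 300, "1-0": 500, "3-1": 200}

for count, ratio in share_ratio(player_pitches, league_pitches).items():
    print(count, round(ratio, 1))
```

A value above 100 means the hitter reaches that count more often than the league does. Shares in sparsely populated counts will swing the most, so the small sample caveat applies here as well.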
Merely looking at the numbers gives me hope that Wieters will go from being an average offensive catcher to an above average offensive catcher. He appears to show skills that put him into situations where he has a better opportunity to be successful. However, he may have an issue with his approach in situations where he is severely behind in the count. It should be noted that this level of analysis depends to far too great a degree on a handful of numbers that lack robustness. As you can see, my findings are not all that dissimilar to those of the post I linked to. However, I think that post is a bit sensational, and that is a detriment to the data that was presented. Sometimes we have to be comfortable with not having a good answer. This article does not provide one.
This is the first article in a series. In the next installment, I will go beyond comparing Wieters to the average player and actually put him within the context of a population. That means stringing out individual season numbers and establishing levels of significance, because we know he has performed below or above average in certain situations but do not know what that means (beyond the fact that we lack sufficient sample size). Of course, a major problem here will be survivor effect. I'm not sure yet how, or if, I will handle that.
(*) means that this number is a simple ratio. There was no attempt to consider park factors.