01 March 2012

Comparing fWAR with rWAR

Just a short post today.  People often treat rWAR and fWAR as interchangeable because both try to measure the overall value of a player's performance.  They use different means, but try to get to the same place.  That does not entirely make sense to me, though, because it assumes the two metrics have the same statistical spread.  I decided to simply show them side by side below, using statistics from 2011.


Pitchers

Below are the WARs for pitchers who qualified for the ERA title.

For the most part, these two match up well.  rWAR is a bit more extreme at the ends, while fWAR shows a slight bump through the middle, particularly for players below a WAR of 2.  fWAR has a mean of 3.2 and rWAR a mean of 2.9, which amounts to a total difference of about 20 WAR between the two statistics.  As populations, they do not appear to be significantly different (p=0.40).

Position Players

Below are the WARs for position players who qualified for the batting title.

Again, we see something similar to the pitchers: at the extreme ends, rWAR gives higher values at the top and lower values at the bottom, while fWAR gives higher values through the middle.  The difference between these two populations approaches significance (p=0.10).  The average WAR was 3.2 for fWAR and 2.8 for rWAR.  In total WAR, that is a 60 WAR advantage to fWAR.
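For anyone who wants to run this kind of comparison themselves, here is a minimal sketch in Python.  The WAR values are made up for illustration (they are not the 2011 numbers), and the permutation test is only one reasonable way to get a p-value for a difference in means; it is not necessarily the test behind the p-values quoted above.  The function name `permutation_p_value` is my own.

```python
import random

def avg(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def permutation_p_value(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Shuffles the pooled values, splits them back into two groups of the
    original sizes, and counts how often the shuffled gap in means is at
    least as large as the observed gap.
    """
    rng = random.Random(seed)
    observed = abs(avg(a) - avg(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        gap = abs(avg(pooled[:n_a]) - avg(pooled[n_a:]))
        if gap >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)

# Hypothetical WAR values for the same ten players (NOT real 2011 data).
fwar = [7.1, 5.4, 4.0, 3.3, 2.9, 2.5, 2.2, 1.8, 1.1, 0.6]
rwar = [8.5, 6.0, 3.6, 2.8, 2.4, 2.0, 1.7, 1.2, 0.4, -0.5]

mean_gap = avg(fwar) - avg(rwar)    # gap in average WAR per player
total_gap = sum(fwar) - sum(rwar)   # gap in total WAR across all players
p = permutation_p_value(fwar, rwar)

print(f"mean gap:  {mean_gap:.2f}")
print(f"total gap: {total_gap:.2f}")
print(f"p-value:   {p:.3f}")
```

Note that the total gap is just the mean gap multiplied by the number of players, so a small difference in averages adds up quickly over a full list of qualifiers.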

I think the take-home message here is that while you may not be making a grand mistake by doing something like adding them together and dividing by two, you certainly should not think the two statistics are equivalent in magnitude.


  1. Everyone (more or less) agrees on the offense component. Defense, replacement level and pitching (actual runs allowed, or component runs allowed) are up for debate. By averaging you're essentially saying "Let's take the midpoint in this debate; it's close enough."

  2. I think that is an illogical conclusion.

    It is akin to averaging kilometers and miles. Just pick one and go with it or respect the differences of each one. I think it is incorrect to treat them as equal metrics.