The Chart That Launched A Thousand Ships

Much has been made of Sean Forman’s article in the New York Times that took Ryan Howard down a peg. Forman deconstructed Howard’s glamorous RBI totals, illustrating that the statistic is more a function of opportunity than skill. I don’t wish to rehash the arguments about Howard and RBI that I’ve had before, but I made a chart that I found quite interesting.

The chart plots every qualified player’s WAR against his RBI total. As the trend line indicates, there is a positive relationship: the more RBI you have, generally the more valuable you are to your team. The r-squared, or coefficient of determination, is 0.2265. That is to say, roughly 23 percent of the variation in players’ WAR is explained by RBI (and factors relating to RBI).
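For anyone curious about what goes into that number, here is a minimal sketch of how r-squared can be computed for a straight-line fit. The function name and the five toy data points are made up for illustration; they are not the RBI and WAR figures behind the chart.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # For a one-variable linear fit, r-squared is the squared correlation
    return (sxy * sxy) / (sxx * syy)


# Toy data, invented purely for illustration (not real player stats)
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 1, 4, 3, 5]
print(round(r_squared(toy_x, toy_y), 4))
```

For a simple one-variable fit like this, r-squared is just the square of the correlation coefficient, so an r-squared of 0.2265 corresponds to a correlation of about 0.48.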

To traditionalists, that will seem very low; to Saberists, it will seem high. There are a bunch of caveats with this, of course, such as a limited sample (only one year), but it paints a good enough picture. That red diamond you see is Ryan Howard. He is all by himself, with the most RBI but not nearly as much WAR as other players with similar RBI totals. In fact, a lot less.

That Howard is an outlier is enough to make people take one look and swear off Sabermetrics forever. But to immediately discard a theory because it doesn’t match up with your preconceived notions is a fool’s errand. All the progress you see and have seen came about because people set aside what they thought they knew about their world and opened their minds to new possibilities.

In statistics, we accept that in every sample, there are going to be outliers, pretty much no matter what. The 68-95-99.7 rule tells us that in normally distributed data, approximately 68 percent of the data will be found within +/- one standard deviation of the mean, 95 percent within two standard deviations, and 99.7 percent within three. If you look at a table of higher deviations, you’ll see that no finite number of standard deviations ever captures all of the data.
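The percentages in that rule can be reproduced with a few lines of code. This is a small sketch using the standard relationship between the normal distribution and the error function; the function name is my own, and it assumes perfectly normal data, which real baseball stats only approximate.

```python
import math


def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


# Coverage for one through six standard deviations
for k in range(1, 7):
    print(k, round(share_within(k), 10))
```

The share creeps closer and closer to 1 as k grows but never gets there in exact terms (a computer will eventually round it to 1.0 for large k), which is another way of saying that some outliers are always expected.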

To laypeople, an outlier is a sign of failure, that the stat is doing something wrong. Unless your statistic claims to account for all universal factors, how can that make any logical sense? I believe this is the biggest obstacle for laypeople when it comes to accepting Sabermetric principles. They see Howard with an MLB-best 95 RBI and a comparatively low 1.4 WAR and cannot reconcile the two.

Wins Above Replacement is far from a perfect metric, and anyone who tells you otherwise does not understand the statistic. In fact, any self-proclaimed Sabermetrics adherent who tells you that the stats we have now can explain anything and everything is a crazy person. However, Sabermetrics are a cut above traditional stats, such as RBI and won-lost records. Sabermetric stats don’t have to be perfect, or even extremely accurate, for you to discard your older, more familiar but incredibly flawed metrics.

Let’s do some critical analysis of the RBI stat. Runs batted in. What does it tell us? Simply, how many runs the player in question drove home, whether that is a teammate on base or himself on a home run.

Now, what does RBI not tell us? It doesn’t tell us:

  • How often the player in question has other runners on base
  • The base running skill of the runners the player is driving in
  • The scoring opportunities of the player’s hits (i.e. a player who gets a lot of extra-base hits is more likely to drive in runners than a singles hitter)
  • The player’s common spot in the batting order
  • The quality of opposition
  • Effects of ballparks on run-scoring

The Phillies’ number one, two, and three hitters in the batting order have on-base percentages of .336, .348, and .343, respectively. If, instead, Howard had hit fourth in the batting order for the Washington Nationals, whose one-through-three hitters have OBPs of .269, .289, and .352, would we still expect him to have 95 RBI?
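To put a rough number on that difference, here is a back-of-the-envelope sketch. It assumes each hitter reaches base independently at his OBP and ignores everything else (outs ending the inning, runners erased on the bases, hitters who clear the bases themselves), so treat it as an illustration of opportunity rather than an RBI projection. The function name is my own.

```python
def chance_someone_reaches(obps):
    """Probability that at least one hitter reaches base, assuming independence."""
    p_nobody = 1.0
    for obp in obps:
        p_nobody *= 1.0 - obp  # chance this hitter makes an out
    return 1.0 - p_nobody


# OBPs of the first three hitters in each lineup, as quoted above
phillies = [0.336, 0.348, 0.343]
nationals = [0.269, 0.289, 0.352]

print("Phillies:", round(chance_someone_reaches(phillies), 3))
print("Nationals:", round(chance_someone_reaches(nationals), 3))
```

Under those simplifying assumptions, at least one of the three hitters reaches base roughly 72 percent of the time in the Phillies’ lineup and roughly 66 percent of the time in the Nationals’ lineup. Over a full season of plate appearances, a gap like that adds up to a lot of RBI chances.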

In another alternate reality, let’s imagine that the OBP stays constant, but in one lineup Howard has three Jose Reyes clones ahead of him; in the other, three Adam Dunn clones. Each has an OBP of .340. Would we expect Howard to drive in the same number of runs with each team?

Let’s imagine Howard switches over to the AL West. Everything stays constant except the ballparks. Instead of playing at Turner Field, Citi Field, Sun Life Stadium, and Nationals Park, Howard is now hitting in Oakland-Alameda County Coliseum, Safeco Field, Angel Stadium of Anaheim, and the Rangers Ballpark in Arlington. Don’t you think that the more pitcher-friendly parks of the AL West would have an impact on Howard’s RBI total?

If any of the examples above make sense (and I should hope that they do), then the flaws in RBI are apparent. Saberists are often accused of holding up particular, flawed stats as the be-all, end-all of player evaluation. But when the same people making those accusations fall back on RBI, they are holding Sabermetrics to a double standard. You don’t have to accept every tenet of Sabermetrics, or even Sabermetrics at all, to admit that the RBI stat is extremely flawed. All Saberists ask of you is to be consistent when you apply your criticism. I think this is at the crux of the emotional debates that pop up every time Howard and WAR and RBI are mentioned in consecutive sentences.

Be critical of Sabermetrics. It is always good to look at the world from a skeptical point of view; it is a necessary biological trait that has allowed the human species to prosper. But be even-handed when you do so. Don’t hold Sabermetrics to a standard that you would be unwilling or unable to meet yourself.

NLDS Choices: Diamondbacks vs. Giants

As Phillies fans looked towards this three-game set with the Arizona Diamondbacks, there was one suggestion frequently made: the Phillies should tank the series to screw over the San Francisco Giants. The Giants, of course, kicked the Phillies out of the NLCS last year. Additionally, they unnecessarily started a bench-clearing brawl with the Phillies recently, adding to the bad blood between the two teams’ fans. At the moment, the Diamondbacks lead the Giants by two games and would match up with the Phillies in the NLDS if the season ended today. The only way the Phillies wouldn’t face an NL West team is if the winner of the NL Central finished with a worse winning percentage than the winner of the NL West (assuming the Atlanta Braves take the Wild Card).

But are the D-Backs enough of a pushover that the Phillies should want to meet them rather than the Giants in the post-season? I’m not so sure. The D-Backs have a +27 run differential, better than the Giants’ -9. While the Phillies smash both of them at +137, the D-Backs are the tougher match-up simply based on run differential.

Comparing both teams’ starters at each position reinforces this point.

Going by wRC+ (the wOBA-based analogue of OPS+, where 100 is average, higher is above average, and lower is below average), the D-Backs have the better hitter at six of eight positions. Note that the D-Backs have had to use various first basemen and are currently going with Paul Goldschmidt. The Giants have had their share of turnover as well, with Eli Whiteside getting the lion’s share of the playing time at catcher since Buster Posey was railroaded by Scott Cousins in late May. Recent acquisition Carlos Beltran has been sidelined as well and may go on the disabled list soon if he doesn’t see improvement.

The starting pitching comparison uses xFIP-, an indexed version of xFIP on the same kind of scale as wRC+, except that lower is better; 100 is average. It should come as no surprise that the Giants grade out better here, but the D-Backs are no pushovers. Currently, three of their starters are vastly out-performing their xFIP, going by ERA minus xFIP: Ian Kennedy (-0.48), Joe Saunders (-0.56), and Josh Collmenter (-0.59). While Giants pitchers are also out-performing their xFIP, some of it is better explained by batted ball abilities, defense, and park effects. (See my examination of Matt Cain at Baseball Prospectus from February.) On an interesting note, the D-Backs recently had to deal with the injury to Jason Marquis. They have many options to choose from, including Zach Duke and Micah Owings, as well as prospects Jarrod Parker and Wade Miley.
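If the "minus" scale is unfamiliar, the basic idea can be sketched in a couple of lines. This is a deliberately simplified version that just divides a pitcher’s number by the league average; the published figures also apply park and league adjustments that are left out here. The function name and the sample inputs are hypothetical, not real 2011 values.

```python
def minus_index(value, league_average):
    """Simplified 'minus' index: 100 is league average, lower is better.

    Assumption: no park or league adjustment, unlike the published stat.
    """
    return 100.0 * value / league_average


# Hypothetical inputs for illustration only
print(round(minus_index(3.60, 4.00)))  # pitcher better than the league average
print(round(minus_index(4.40, 4.00)))  # pitcher worse than the league average
```

In this simplified form, a pitcher whose xFIP is 10 percent lower than the league’s comes out at 90, and one who is 10 percent higher comes out at 110.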

Again, it is not really a surprise that the Giants lead in the bullpen. However, closer Brian Wilson has been vastly out-performing his xFIP. Compared to the last couple of years, Wilson’s strikeouts are way down and his walks are way up, but he is still getting results. That could have as much to do with the cavernous confines of AT&T Park as anything: Wilson’s road ERA is more than a full run higher than his home ERA. The Giants’ real stud has been Sergio Romo, whose 1.67 ERA is, stunningly, almost exactly in line with his 1.63 xFIP. His strikeout-to-walk ratio is over 13. The D-Backs don’t have nearly as much dominance late in the game, but J.J. Putz has been solid with good peripherals, including a 3.47 xFIP.

From the Phillies’ perspective, choosing between the two teams is a bit of “pick your poison”. While the Phillies would be the overwhelming favorites in either match-up, they would need to muster up some offense against the Giants, or they would have to attempt to completely silence the potent D-Back bats, something few teams have done so far this year. Either way, the Phillies’ biggest opponents in the post-season will be themselves and randomness in the universe. Whether it’s the D-Backs or Giants, the Phillies have to take care of themselves first.