WAR Back in the News

At fellow Sweet Spot blog It Is About the Money, Stupid, Hippeaux has a post up critiquing the Sabermetric statistic Wins Above Replacement and its widespread use (or, in his estimation, misuse). Naturally, this spurred a lot of debate on the Internet. Among many others, Rob Neyer and Tom Tango have rebutted the IIATMS article.

I don’t want to rehash the debate, as most of it has been said before. However, I’d like to share a comment from the Baseball Think Factory thread, written by the user “PreBeaneAsFan”, which I thought was quite good.

I think this is a problem that I see a lot not just in relatively unimportant venues like sports, but also in more important arenas (popular discussions of science, economics, etc.) People correctly point out that we don’t have precise answers and that our best quantifications have error bars that are [larger] than the number of decimal places reported. That’s a valuable insight and worth discussing, but then people take it a step further and use that as an excuse to remain completely agnostic on things. By denigrating the best efforts of others to quantify difficult questions and insisting that “I don’t need all that fancy stuff, just give me the basics and I’ll take my own guess since no one knows” they give themselves a feeling of smugness and superiority to those bookish nerds vainly searching for answers they can’t pin down, but they also throw away valuable information that the effort to quantify those things tells us and in most cases behave as though the uncertainty is much greater than it actually is.


  1. jonny5

    September 07, 2011 07:56 AM

    I have to say, I like the post from PreBeaneAsFan except for his assertion that dismissal is due to smugness and superiority over “the bookish nerds”. I think that’s a really paranoid stance to take, and this guy, although on point about the dismissal of some stats, has a somewhat sad view of things here.

    I’m pretty sure that the willingness of most who dismiss advanced metrics to do so quickly is due more to their lack of understanding of the stats themselves. Do they seem smug and superior because they are dismissing something you believe in and may have pored over for hours on end? Probably. But you have to remember how human nature works when people face things they don’t quite understand. In the modern world we don’t burn witches anymore, but those instincts to fight what you don’t quite understand are still there.

  2. Ryan F.

    September 07, 2011 08:50 AM

    This is a very insightful post. I have noticed this phenomenon many, many times. It brings to mind the time I tried to convince my uncle that global warming was a real thing. To buttress my argument, I showed him a graphic showing the correlation between human carbon emissions and a sharp warming trend. His response was, “nice graph… I could find a graph saying the opposite in two seconds.”

    But the point is, they never do. They lazily rest on the confident belief that data supporting their position are out there somewhere; what they believe is so obviously true that there’s no need to even bother backing it up.

    And it even goes beyond casual dismissal. Not only are you an out-of-touch nerd when you try to use data (imperfect though it may be), but you are an arrogant jerkoff worthy of scorn. I’ve met with outright hostility many times in trying to make arguments on the basis of data. Maybe it’s because, down deep, they know they’ve been outmatched in the debate. Maybe not; maybe it’s a warped anti-intellectual culture. Either way, it’s deeply troubling, and it’s responsible for far worse things than reactionary baseball antiquarianism.

  3. Jay

    September 07, 2011 09:00 AM

    I’m definitely a hater of SABR, and it has nothing to do with a lack of statistics knowledge; I had to take over 6 semesters of stats in my life. It is obvious, though, that Sabermetricians, specifically with the use of WAR, overestimate the importance of defense, or at the very least their ability to properly measure it, and also undervalue sluggers. This is the point of the article. You can have the top OBP players in baseball, but if you do not have a true power hitter in your lineup you cannot win. That is the point of this article, maybe you guys should read it.

  4. Ryan F.

    September 07, 2011 09:18 AM

    Jay, the point of the article isn’t really germane. The quote stands on its own. Plus, none of this is an attack on people like you who have reasoned objections to certain statistics. It’s an attack on people who dismiss them out of hand without any basis for doing so.

  5. Dan

    September 07, 2011 09:27 AM

    Jay, while it’s nice to have a slugger in your lineup, I don’t think you can honestly say you CAN’T win without one.

    A bases-loaded walk still scores a run. A single will usually score two. While a grand slam would be sexy, what matters is getting some points on the board. The Phillies have already won a game this year in which they had no extra-base hits; in fact, they put up 10 runs on Seattle without a single extra-base hit.

    Am I saying it is easier to score without extra-base hits? Absolutely not. Two bases are always better than one, and a home run is the best thing you can hope for offensively. But to say you cannot win without a power hitter is blatantly false.

  6. JB Allen

    September 07, 2011 09:30 AM

    Sort of related: couldn’t WAR be adjusted for baserunner situations (i.e., bases loaded, bases empty, runner on 2nd, runners on 1st and 2nd, etc.)?

    If hitters generally tend to perform better or worse in certain baserunner situations (the way hitters generally tend to perform better or worse in certain ballparks), then wouldn’t you want to adjust a player’s WAR based on the number of times he’s in these situations? For example, if hitters tend to hit better with runners on first and second, and Player X hits in this situation far more than the norm, shouldn’t his WAR be adjusted down?

    Maybe baserunner situations don’t matter much, or there just aren’t any general trends, but my take-away on Hippeaux’s article was that WAR is a little weak on context, or that it incorporates factors in isolation that are too reliant on externalities (like the quality of a player’s teammates).

  7. Dan

    September 07, 2011 09:31 AM

    I meant the Cardinals, not Seattle, sorry.

  8. JoeM

    September 07, 2011 10:13 AM

    I often read that WAR overvalues defense and I often feel the same way, but I am not sure I have ever seen compelling data to back up that claim.

    I do strongly believe that the baseline value on a position by position basis needs to be reevaluated. Aaron Rowand earned +0.7 WAR and would have finished the year around +1 on the terrible pace he was on. What this means is that the WAR metric assumes far too little of a replacement level center fielder.

    Anecdotally, I feel that WAR does not sufficiently punish players who are dead weight offensively. Teams can recover from a terrible bat but 2-3 terrible bats in a row destroy innings.

  9. JoeM

    September 07, 2011 10:21 AM

    All that said, for the most part, WAR seems to consistently pass the sniff test and is obviously a very valuable pursuit.

    But like all things, people have to keep their eyes open for things that do not make sense.

  10. KH

    September 07, 2011 10:48 AM

    I personally think the article was a decent attempt at showing the limitations of WAR, especially when it comes to defense. Sure, it had plenty of crap in it, but if you don’t find the part that covered fielding at least somewhat compelling, you are nothing but a SABR bot imo.

  11. Rob

    September 07, 2011 12:04 PM

    The WAR piece isn’t, as PreBeaneAsFan says, completely agnostic on WAR. It just says that WAR doesn’t yet work in the straightforward way we want it to. I don’t find that to be all that denigrating or dismissive; in fact, with the exception of the slugger bit, the author’s issues with WAR are specific and well-reasoned.

    If readers took it as a validation of their dismissal of ‘all that fancy stuff,’ I think they misread.

  12. jauer

    September 07, 2011 01:07 PM

    “You can have the top OBP players in baseball, but if you do not have a true power hitter in your lineup you cannot win. That is the point of this article, maybe you guys should read it.”

    If that’s the point of the article — thanks. I definitely won’t read it.

  13. jauer

    September 07, 2011 03:24 PM

    Cyd, I don’t think there’s one “SABR-head”, whatever that means, who would call for equivalent plate discipline between those two hypothetical players.

    Although you did help explain why I ended up at FAU this weekend rather than Sun Life Stadium in my attempt to attend this past weekend’s games.

  14. Dave

    September 07, 2011 03:55 PM

    I’m not nearly as smart as you folks, but my only issue with WAR is the ‘perceived’ (see: my opinion. I could be wrong) overemphasis on defense. I would put a lawn gnome in left field if he could give me an OPS over .800.

    Defense seems to be the new, chic undervalued asset these days (see: Seattle/San Diego). I get the importance of run prevention and fielding range – I, too, have seen Jeter play – but at the end of the day, Jack Wilson is still batting 500 times a season.

    I guess my long-winded point is that I think defense/fielding was valued properly. It’s a distant third for me behind OB% and power.

    (Full disclosure: I love Rob Deer, so take this all with a grain of salt).

  15. jauer

    September 07, 2011 04:21 PM

    Except that Votto’s OBP is (and has been) so much higher than Howard’s that there’s no way these “lineup variables” make up for that difference.

    Even acknowledging your lineup variables — strikeouts are always bad and walks are always good. Votto does both significantly better than Howard.

    Perhaps if Howard could come anywhere close to Votto’s OBP, their difference in wOBA could be excused by lineup surroundings.

  16. jauer

    September 07, 2011 05:18 PM

    Definitely didn’t come off as personal, no need for apology.

    I agree with you that applying wOBA to Howard and Votto probably underrates Howard, because of his increased productive (whether it’s skill- or infield defense-based).

    “Applying the same value to the variables in the same formula for the two players, is bad math.”

    I wouldn’t call it “bad” math, but rather “less than ideal.”

    I don’t think there’s a WPA-component to WAR (I could be wrong), but I think WPA is something close to what you’re describing. It accounts for base-out situations and would apply equally to base-out situations.

  17. jauer

    September 07, 2011 05:20 PM

    didn’t finish my second sentence, it was meant as this:

    “because of his increased production (whether it’s skill- or infield defense-based) with men on base”

  18. jauer

    September 07, 2011 05:24 PM

    andddd I messed up my last sentence; I meant apply equally to both players

  19. Bill Baer

    September 07, 2011 05:58 PM

    “What were the weather conditions, the pitching styles, the physical health, the mental health, ballpark dimensions, of each batter at any given point and all given points in time?”

    These tend to wash out over time. They’re very small variables that have no consistency (with the exception of park factors).

    “This, in turn, will make the pitcher appear to be ‘lucky’ in relation to his BABIP.”

    In small samples, yes. But there’s a reason why Roy Halladay’s career BABIP is only .005 higher than Adam Eaton’s — pitchers have very little control over it. There are exceptions (like Matt Cain) but the stat has held up to all of the extensive criticism.

    “The tools are good, but they are not gospel. Sometimes we just have to accept a player is performing outside the ‘norms’ (whether good or bad), and there may be factors we either can’t or don’t measure, that are in play.”

    This is exactly the attitude that is criticized in the OP. People throw their hands up in the air in exasperation, using our lack of 100% precise answers to ramble on with their own preconceived notions.

    There’s a reason why DIPS theory made such headway: it went counter to the “consensus”. That’s because human beings are flawed and biased, and can’t be trusted by themselves.
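For readers unfamiliar with the stat discussed in this comment, BABIP (batting average on balls in play) is conventionally computed as (H - HR) / (AB - K - HR + SF). Below is a minimal Python sketch of that formula. The function name and the sample stat line are made up for illustration and do not belong to any player mentioned here.

```python
def babip(h: int, hr: int, ab: int, k: int, sf: int = 0) -> float:
    """Batting average on balls in play.

    Home runs and strikeouts are removed because the ball never
    reaches a fielder; sacrifice flies are added back to the
    denominator because they are balls in play not counted as at-bats.
    """
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play; BABIP is undefined")
    return (h - hr) / balls_in_play


# Hypothetical stat line (illustrative numbers only).
example = babip(h=150, hr=20, ab=600, k=100, sf=5)
print(round(example, 3))
```

Because the denominator only counts balls in play, a single season usually supplies a few hundred of them per pitcher, which is one reason small-sample BABIP swings wash out over a career.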

  20. Pete

    September 08, 2011 12:41 AM

    Hi Bill,

    I wouldn’t call Neyer’s article a “rebuttal” (which would have addressed the author’s underlying arguments) so much as a “rebuke” (which simply addressed the author’s rhetorical choices). Really, Neyer’s tirade against the use of “we” sounded a little insane. I’m worried for him.

    I think Hippeaux showed what an actual rebuttal looks like in addressing Neyer’s hissy fit: itsaboutthemoney.net/archives/2011/09/07/in-defense-of-the-royal-we/.

  21. hk

    September 08, 2011 06:57 AM

    Other than the title of Hippeaux’s original article comparing WAR to RBI, this whole debate seems to be much ado about nothing. I have read very few people who contend that WAR is either perfect or a conversation-ender. In fact, Keith Law for one frequently notes the inconsistencies of UZR (Hippeaux’s main claim) when discussing WAR. That being said, once we get past the fact that WAR is not perfect and we understand its inconsistencies and figure out how to resolve them in our own heads, it does seem to be one of the best one-stop shopping statistics or calculations available to rate and compare players’ past values, and it seems much better than RBIs in this regard.

  22. SP

    September 08, 2011 08:06 AM

    Speaking of WAR, can someone tell me the difference between Fangraphs and Baseball Reference’s WAR calculations? Because Fangraphs have CC ahead of Verlander 6.7 vs 6.4 which is ridiculously wrong. BR has it 7.8 to 5.8 in Verlander’s favor which is accurate in my opinion since he leads CC in every major pitching category except GB% and HR%. Someone at Fangraphs better fix their equation because they look like idiots right now.

  23. EHW

    September 08, 2011 12:18 PM

    To an extent I feel like the reaction to this article kind of DOES show what’s wrong with WAR, even more so than the article itself.

    The article pointed out that there are some flaws in WAR, mainly with its usage of one-year-samples of UZR. That’s not news to anyone in the SABR community. But I keep seeing people just saying “WAR isn’t perfect, it has its flaws, people are always looking to improve it” as any sort of response to criticisms of the stat. And then…that’s it. The formula for it doesn’t change, and people keep using it the same all-encompassing way, and then when someone points out an error or something that seems off, the same defense comes out, people keep saying that it’s being worked on and improved constantly, etc.

    It’s almost like the flaws in WAR are starting to be used as a safeguard against its criticisms. The SABR community is constantly critiquing WAR itself and is constantly trying to improve it? That’s great and all, but this is the third straight year that fWAR has been out there and available to the public, and unless I’ve missed something (correct me if I have, cause it’s certainly possible) there have been very few actual changes to its formula other than a few minor number tweaks here and there.

    WAR is a good stat. It really is. But I think it was David Murphy, a couple weeks ago in the midst of one of the recent uproars about a Ryan Howard story, who said something along the lines of “WAR is the best attempt out there to quantify something that can’t be quantified”.

    That’s not to say it’s useless. Absolutely not, by any means. But whenever there’s an attack on it talking about the little flaws in it, it’s just frustrating to keep reading “WAR should not be looked at as an all-encompassing stat” from SABR types who then proceed to do exactly that. It’s amusing to see Rob Neyer get so defensive about the all-encompassing nature of WAR when he wrote this piece a month ago comparing Pence and Bourn which centered almost entirely around Bourn’s superior WAR:


    I don’t know. Again, I like WAR as a stat but I know it needs work. And I know that’s what everyone is saying, but it’d be nice to actually see that work come to fruition. Until then it shouldn’t be used as a be-all-end-all sort of trump card, and people should stop using its flaws as means to block criticisms of it if they aren’t actually doing anything to fix said flaws.

  24. Bill Baer

    September 09, 2011 03:58 AM

    “Speaking of WAR, can someone tell me the difference between Fangraphs and Baseball Reference’s WAR calculations?”

    You’d have to speak to someone at FanGraphs and BR respectively to confirm this, but I believe their sources of data are different.

    As for structural differences: for defense, FanGraphs uses UZR; BR uses Total Zone. For pitching, FanGraphs uses FIP; BR uses actual results.

    I also believe they calculate base running differently, but I couldn’t tell you how off of the top of my head.
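To make the FIP-versus-actual-results distinction in this comment concrete, here is a minimal Python sketch of the commonly published form of FIP: (13*HR + 3*(BB + HBP) - 2*K) / IP plus a league constant. The function name, the default constant of 3.10, and the sample numbers are assumptions for illustration; the real constant is recalculated each season, and this is not a reproduction of either site's WAR calculation.

```python
def fip(hr: int, bb: int, hbp: int, k: int, ip: float,
        constant: float = 3.10) -> float:
    """Fielding Independent Pitching, commonly published form.

    Only uses outcomes the defense does not touch: home runs,
    walks, hit batters and strikeouts. `constant` scales the result
    to league ERA; 3.10 is an assumed, typical value, not the figure
    for any specific season.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical pitcher line (illustrative numbers only).
print(round(fip(hr=10, bb=40, hbp=5, k=200, ip=200.0), 3))
```

Hits allowed on balls in play never enter the formula, so two pitchers with the same strikeouts, walks and home runs get the same FIP even if their ERAs differ. That is the kind of gap that can separate a FIP-based WAR from one built on runs actually allowed.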

  25. SP

    September 09, 2011 08:48 AM

    Ah that makes sense Bill. I figured FG HAD to be using FIP/xFIP in their WAR calculations because that is the only thing CC is ahead of JV in (3.3% higher xFIP). However, JV is ahead in SIERA by 5.4% which I consider more accurate than xFIP. Not to mention K:BB ratio, K/9, BB/9, WHIP, etc.

    Also, it doesn’t seem like FG is appropriately rewarding JV for more innings pitched whereas Baseball-ref does. No way in hell should CC be ahead of JV in WAR and I think FG needs to take FIP/xFIP out of the equation, weight it lower, and/or add SIERA if they want us to take their WAR seriously. Maybe you can pass that along for us Bill!

  26. Bill Baer

    September 09, 2011 08:55 AM

    Personally, I like the variety in WAR that various websites offer (including BPro’s WARP). A lot of people cite this as a problem, but I don’t see it that way. There’s never going to be a perfect WAR, and even if there was, people would complain it’s not accounting for the factors in the way they believe to be correct.

    So, yeah, FG and BR have different WAR inputs, but it doesn’t mean one or the other should change.

  27. hk

    September 10, 2011 07:57 AM

    “That’s great and all, but this is the third straight year that fWAR has been out there and available to the public, and unless I’ve missed something (correct me if I have, cause it’s certainly possible) there have been very few actual changes to its formula other than a few minor number tweaks here and there.”

    I may be wrong, but I believe that adding base-running to the equation is a recent change.
