Jim Caple posted some criticism of the Wins Above Replacement statistic over at ESPN this morning. Overall, I thought his criticisms were warranted and well-stated, but I wanted to add some of my own thoughts to clear up any misunderstandings and to add a different perspective to some of the points raised.
My issue is this: I don’t like the increasing over-use of (and over-reliance on) WAR as THE definitive evaluation of a player’s worth.
Certainly this is an issue more apparent with WAR than most other statistics simply by its nature. Smashing offense plus defense plus positional adjustments plus replacement level into one statistic would give one the idea that this one number and one number alone can paint an entire picture of a player’s season or career. However, such reliance on one statistic has certainly been true even before WAR was ever born. Saves for relievers have often been the only barometer by which a closer’s value is judged; RBI’s for hitters; won-lost record for starting pitchers. When possible, you should always use multiple methods of evaluation and be mindful of all their limitations, no matter which stats you’re using.
Moving on to Caple’s direct criticisms…
Almost no one knows how to calculate WAR. [...] If we can’t figure a stat out on our own, then how do we verify whether it is accurate?
WAR is, as far as I know, the most complex statistic used in sports right now. It is very arduous to learn everything that goes into creating it, much less trying to reproduce the results. However, it is not impossible to learn and its complexity is neither a feature nor a detriment. The utility of a statistic should be judged on its ability to describe what happened or its ability to help predict future results. It doesn’t matter if your statistic has one simple arithmetic function or is a concoction of mathematical trickery as long as it does its job well.
WAR does its job well (but not perfectly!). Sort any WAR list from greatest to least, or least to greatest, and it will almost always line up with what you observed with your eyes or judged with more traditional statistics. 2012’s WAR top-ten included Mike Trout, Buster Posey, Ryan Braun, Robinson Cano, David Wright, Chase Headley, Andrew McCutchen, Miguel Cabrera, Jason Heyward, and Adrian Beltre. All players you’d have included in a top-ten list most likely. Maybe you disagree with the specific ordering, and that’s fine. As Tango says, “Everyone has their own WAR.” If you think FanGraphs weights defense improperly, you can make your own adjustments. FanGraphs’ and Baseball Reference’s versions of WAR are not sacrosanct.
Disagreements with WAR arise when detractors cite a questionable result (“Miguel Cabrera third in WAR? Nonsense!”) but aren’t transparent and consistent about the methods behind their own evaluations. WAR is useful because you know what’s going into it and the components are applied uniformly and without bias.
Actually, we know it isn’t always accurate because depending on your source — FanGraphs or Baseball-reference.com — you can get wildly different WAR scores.
I would argue this is a feature of WAR. In fact, we should have even more versions of the stat because we will further learn what does work and what doesn’t work. FanGraphs uses FIP in its calculation of pitcher WAR, while Baseball Reference does not, which is why a pitcher like Ricky Nolasco — someone who has under-performed his defense-independent statistics — has 18.1 career fWAR but only 7.8 rWAR. Should we be adjusting pitchers’ results due to park factor, quality of defense, and batted ball luck? The debate isn’t settled, but we are aware of the disparity because we have divergent methods of calculating a similar statistic.
Think of fWAR and rWAR as scientific experiments. When you get different results, you continue to tweak and control for different variables. You don’t pack up your stuff and never experiment again.
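For readers who have not seen it, FIP is built only from home runs, walks, hit batters, and strikeouts, which is why it can diverge from a pitcher’s actual runs allowed. Below is a rough Python sketch of the commonly cited formula. The stat line is made up for illustration, and the league constant (set to 3.10 here) is an assumption; it changes each season so that league FIP matches league ERA.

```python
def fip(hr, bb, hbp, k, ip, league_constant=3.10):
    """Fielding Independent Pitching from defense-independent events.

    ip is innings pitched as a true decimal (200.333, not the
    box-score style 200.1). league_constant is an assumed placeholder;
    the real value is recalculated every season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + league_constant

# Hypothetical season line, not any real pitcher's numbers.
example = fip(hr=20, bb=50, hbp=5, k=150, ip=200.0)
print(f"FIP: {example:.2f}")
```

A pitcher whose ERA sits well above a number like this year after year, as in the Nolasco example, will look better in a FIP-based WAR than in one built on runs allowed.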
If a player’s batting average varied from .245 to .307 from ranking to ranking, would you trust either statistic?
If a statistic is descriptive, as batting average is, there is nothing to distrust. What you must be wary of is instead the interpretation of the data. Ryan Howard hit .313 in 2006 but finished at .219 last season. If those were our only data points, and we were trying to predict his future performance, we would heavily regress him back to the league average. However, since we have much more information at our fingertips, we know that Howard’s batting average was affected by his Achilles injury, his age, the heavy use of the infield shift against him, and his increasing futility against left-handed pitching. That goes to Caple’s above point, that we should not use any single statistic in isolation.
When you see wildly divergent WARs for players, that should be a telltale sign to investigate further, rather than throwing your hands up in defeat.
The fielding metrics used by baseball-reference (Baseball Info Solutions Defensive Runs Saved) seem to lift or lower their WAR scores much more than the fielding metrics used by FanGraphs (Ultimate Zone Rating). And that presents another issue with WAR. [...]
In other words, most baseball stats are based entirely on indisputable math calculations. WAR has an element of theory and assumption to it.
This is a very good point and it’s my biggest issue with WAR. Our methods of evaluating defense are still in their infancy and must be met with a lot of skepticism. Whenever I use WAR for hitters, I make a note of how heavily it is affected by UZR. For instance, Darwin Barney posted 2.5 fWAR in 2012. Broken into specific components, he posted negative 15 batting runs, 4.3 base running runs, and 13 fielding runs. Because of his playing time and position, he was also credited with 19.6 replacement level runs and 2.2 positional runs. I don’t know too much about Barney: I’m not a Cubs fan and don’t get to see too many Cubs games during the season, so I don’t have any prior knowledge about his fielding capabilities. UZR graded him positively but much less favorably the year before in a similar amount of playing time, crediting him with 6.1 fielding runs. Upon seeing that, I would be skeptical about how accurately UZR captures Barney’s defense.
On the other hand, when I look at Chase Utley’s page, I see that he has been an above average (elite, actually) defender throughout his career. He was credited with 5.3 fielding runs last season, a decline from 2007-10, when he ranged from 10-20 fielding runs. Given my prior knowledge, having watched Utley very closely throughout his career and knowing that he has had knee problems, I am comfortable accepting UZR’s evaluation of his defense, while still being skeptical of UZR overall. It would be great to have the specificity of offensive statistics with our defensive statistics, but we just aren’t there yet. However, that is no reason to toss WAR out to the curb, nor is there another method out there proven to be more accurate at evaluating defense.
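To make the Barney breakdown above concrete, here is a minimal Python sketch that adds up the run components quoted from his 2012 FanGraphs line and backs out the runs-per-win rate they imply. The component values come from the text; the variable and function names are mine, and the implied conversion rate is just arithmetic on those numbers, not FanGraphs’ published constant.

```python
# Run components for Darwin Barney's 2012 season, as quoted above.
components = {
    "batting": -15.0,
    "base_running": 4.3,
    "fielding": 13.0,
    "replacement": 19.6,
    "positional": 2.2,
}
reported_fwar = 2.5

total_runs = sum(components.values())
# Assumption: WAR is total runs divided by a single runs-per-win rate.
implied_runs_per_win = total_runs / reported_fwar

def war_with_fielding(fielding_runs, runs_per_win=implied_runs_per_win):
    """Recompute WAR after swapping in a different fielding estimate."""
    runs = total_runs - components["fielding"] + fielding_runs
    return runs / runs_per_win

print(f"Total runs above replacement: {total_runs:.1f}")
print(f"Implied runs per win: {implied_runs_per_win:.2f}")
print(f"WAR with 6.1 fielding runs: {war_with_fielding(6.1):.2f}")
```

Swapping in the lower 6.1 fielding-run grade pulls the figure from 2.5 down to roughly 1.8, which is why it helps to note how much of a player’s WAR rests on the fielding component.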
The same approach should apply to WAR. We need to look at many stats to assess players, and one of them should be WAR. But it shouldn’t be the only stat we look at or cite.
If you take away one thing from Caple’s article it should be this, but it doesn’t just apply to WAR — it applies to all statistics. Don’t just look at RBI; look at RBI opportunities, where the runners are on the bases, and the base running ability of those runners. Don’t just look at xFIP; look at BABIP, park factors, the quality of defense, among many other factors.
Not much to say by way of introduction this week. Only, I suppose, that we ought to listen to more “Waltzing Matilda.” That’s a propos of absolutely nothing, but that doesn’t make it any less true.
@mattjedruch: “you ever thought what it’d be like if baseball was more like soccer in that managers decided on who to sign instead of GM’s? or at least have more of a say as to who is signed?”
You mean, like Kirk Gibson being unable to handle intelligent, talented players? Because the Diamondbacks were about as eager to get rid of Trevor Bauer and Justin Upton as the state of Arizona is to get rid of its Hispanic population.
But in truth, the manager in soccer often has more in common with the GM of a baseball team than he does the field manager. Yes, he dictates tactics and sets the starting lineup, but in a game with only one break in play, almost no structure and only three substitutions, it’s not like there’s a ton of in-game coaching to be done. There’s absolutely some, but not nearly as much as in baseball, let alone a sport like football or basketball where the coach is constantly calling plays and changing personnel.
I think you’d see more managers with an intellectual background rather than a playing background. Right now, the primary qualification for being a baseball field manager (which is a job whose primary functions are administrative and tactical, not athletic) is having been paid to play baseball in the past. I think that would change under the soccer manager model.
Which is not to say that some of the biggest names in soccer management today weren’t superstar players–Pep Guardiola, Carlo Ancelotti, Frank Rijkaard, Jurgen Klinsmann.
But Arsene Wenger, first with AS Monaco in France and for the past 15 years with Arsenal in England, ran circles around his competitors by buying undervalued players, developing them, and selling them at the peak of their value. If this approach sounds familiar, it might not surprise you to learn that Billy Beane views Wenger as an inspirational figure. Wenger is hardly any more a former pro soccer player than I am–he’s an economist by training, which explains how he was able to arbitrage the pants off the English Premier League for almost a decade before everyone else caught up.
His Arsenal won its last title in 2004, and the next two years finished behind a Chelsea team led by Jose Mourinho, who studied since his youth not to be a player, but a coach, and has turned into the best one in the world.
So you’d probably still see a lot of former players as managers, but it would turn into a hybrid role that combines both off-field economic savvy and on-field leadership. I think the uniformed coaches would take a bigger role, just because the manager would have so much responsibility he’d have to delegate. Which is good, because apart from Don Cooper, I’m not sure I can name another MLB coach who has a big positive impact on his team.
@SoMuchForPathos: “Who should I realistically start dreaming on as the Phillies’ next GM?”
I’ll do it if you want. I’d probably advance the thinking of the Phillies’ front office by 20 years by bringing the team’s quantitative analysis department…well, creating one, for starters. But once I’ve cleaned house, the front office will be up to speed on the top publicly accepted thinking based on the best publicly available data. Which would put the Phillies only about 10 to 15 years behind their competitors.
The arrogant anti-intellectualism of the current regime is baffling and will have predictable results. The Phillies, under Ruben Amaro, are like Czarist Russia, a vast empire of tremendous power and blessed with incredible resources that’s so in love with the fashions and practices of 50 years prior that that power and those resources were, and will be, squandered in a massive blaze and at the cost of unimaginable human suffering. And the next people to take over will be a marginal improvement intellectually but little better practically. And we’ll spend the next 10 years in undignified retreat, watching our crops and oil fields burn because our leaders are too obsessed with superstitions and customs that the rest of the world laughed out of the culture as obsolete before we were even born. The sins of the father, and so on….
But seriously, if I were GM, I’d get to work immediately.
Ben Cherington: “Hello.”
Me: “Hey, Ben, it’s Mike Baumann.”
Cherington: “Hey, Mike. Congrats on the new gig.”
Me: “Thanks. Say, I see you’ve got a lot of question marks about starting pitching.”
Cherington: “I think we’re okay, but I’m listening. I hear you’re pretty high on Jackie Bradley. Would he buy me some pitching?”
Me: “Maybe. How much would you want for him?”
Cherington: “How about I put together a package for Cole Hamels.”
Me: “Not Hamels, but I’d do Cliff Lee.”
Cherington: “Hmmm. I can do something. How about Bradley and…Jerry Sands?”
Me: “Draw it up. Send it over.”
And over the course of hours, I’d make about half a dozen insanely lopsided trades to put the 2010 South Carolina Gamecocks back together, ending with me flipping Jonathan Papelbon to Miami for whatever’s left of Sam Dyson and eating a ton of salary. I’d love the job, but I’d be a disaster.
@sports_j: “has Crash Bag/ Crashburn Alley discussed the possibility/probability of trading Doc if/when Phils are 15 games out in July?”
Not yet, but we can if you like.
Halladay’s vesting option will likely not vest for 2014, making it…whatever’s less than a vest…a snood? So if the Phillies are 15 games out in July, wouldn’t it make sense for a team that doesn’t look like it’s going to contend anytime soon to trade an aging, but at the very least competent free agent-to-be to a team with a chance to win it all? It’d make oodles of sense, particularly if he’s on track to throw 225 innings to activate the option, giving him another year’s worth of value to be traded.
But I really don’t think that’s going to happen. First of all, man, that’s a big white flag to wave, even if Halladay’s only the third-best pitcher on the team anymore. For a team 18 months removed from winning 102 games, with no real pressing financial constraints, trading a name of Halladay’s magnitude would be quite bold, even if such a team were blessed with the foresight to realize that The Great Satan, Delmon Young, is not the kind of player who puts you over the top.
So it’s an interesting possibility–and by “interesting,” I mean “soul-crushing”–but I cannot imagine it actually happening.
@MichaelJBlock: “How many more galactically stupid decisions does RAJ have to make before he is relieved of his duties as GM?”
Dozens. General managers don’t get fired for being feckless morons. General managers get fired for losing ballgames, which is amusing as hell, because the lag time between the stupid decisions and the actual losing of ballgames is usually several years, as we’re now noticing. The Phillies lost 2012 long before 2012 actually came to pass. If you look at any team that’s contended for a long period of time, you’ll see gradual but continuous roster turnover. You develop a player, you get some value out of him, you win some games, you trade him for a younger player before he hits free agency, and so on. For the Braves’ run of consecutive division titles from 1991 to 2005, the only constants were Bobby Cox and John Smoltz. Terry Pendleton gets old, so they replace him with Chipper Jones. Jeff Blauser gets replaced by Rafael Furcal. Alejandro Pena turns into Mark Wohlers turns into John Rocker, and so it goes.
The Phillies drafted phenomenally well around the turn of the century and got a bunch of really fantastic players all about the same age, and they hit their prime together–Utley, Hamels, Rollins, Howard, Jayson Werth–and they rode that for about as long as you can ride a core group wholesale. And while there was roster turnover, it wasn’t young players being eased into key roles to eventually replace the veterans, it was young players being traded for established stars, who make more money and tend to turn into financial boondoggles with little or no warning.
Domonic Brown could have been to the Phillies as Andruw Jones was to the late-90s Braves, the second-generation dynastic star who’s brought up as a complementary player and carries the team into the next plane of spiritual existence. But instead he was left to wither on the vine while a parade of older, more expensive, not-particularly-more-productive alternatives paraded by as the Phillies’ clubhouse turned into a Lexus dealership–a room full of old people with delusionally inflated self-importance and commodities that aren’t worth half what they were purchased for.
And the Phillies still won 102 games in 2011. So as Mephistopheles comes back to visit Ruben Amaro, it’ll take another couple years for the Phillies to bottom out, then another couple years for him to come to terms with the fact that you’re not going to win a lot of games by putting band-aids on traumatic wounds. Or, in Delmon Young’s case, filling traumatic wounds with biochemical waste. Then a few beyond that for him to try a rebuild, because the Phillies’ ownership, which seems to be, like the fans, under the mistaken impression that Amaro was the architect of the greatest set of teams in franchise history, will give him a chance to fail at that. Which he will.
Listen–Ed Wade and Omar Minaya lasted forever as general managers. Ruben Amaro isn’t done with us by a damn sight.
@Major_Hog: “Sign Brian Wilson. He’s still better then either Young and his beard is sublime!”
Yeah, let’s do that. On second thought, let’s not do that. Wilson is a big name with lots of saves in his career, which means that he’s going to get more money than he should. He’s also entering his age-31 season coming off his second Tommy John surgery, which is not the end of the world, but given the depth of the Phillies’ bullpen right now, I just don’t know why the Phillies think they need another reliever, that most fungible of athletic commodities. And then they went out and signed Chad Durbin, so fuck me.
But mostly I don’t want Brian Wilson because I can’t stand him. His beard isn’t sublime. It lacks subtlety–it’s dyed darker than his hair, for one thing. Brian Wilson appeals to people who confuse weirdness with cleverness. Who think that being outspoken is the same thing as being funny or insightful. Brian Wilson is the morning zoo drive-time radio show of relief pitchers. He is, to quote Filmdrunk, “More SPROIOIOIOIOING than AH-OOO-GAH!” I will not suffer him on my baseball team.
@erhudy: “what is the name of the disorder that compels you to make horrible puns”
It’s called being really smart, really well-educated and really clever, and it brings with it judgment and ostracism from society that are far worse than the symptoms themselves. Imagine leprosy, but good.
@JustinF_LB: “Who are you rooting for to win the Super Bowl? Ravens or Niners? Ray Lewis’s team or Chris Culliver’s team?”
Culliver’s. As you might have figured out by now, I’m a huge South Carolina homer, and maintain a slavish devotion to anyone who played football or baseball there while I was in undergrad. Culliver, even before he had his Shavlik Randolph moment, was the one exception.
We looked at Chris Culliver, who was Steve Spurrier’s first five-star recruit, as potentially the kind of gamebreaking offensive presence that would allow a team that had been Percy Harvined to death for my first couple years in Columbia, to compete with Florida and Georgia. He was a wide receiver recruit who ran the 40 in the mid-4.3s out of high school, and I ached to see him play.
As a freshman, he returned kicks, and he was terrible. He’d field the ball and run it out, no matter where he caught it, and head straight for the highest density zone of the coverage unit where, invariably, he’d be tackled. He never really played wideout, instead switching to free safety, where he was even worse. I spent my senior year in the press box watching Culliver run the wrong way in coverage, never step up in run support, never make a play on the ball, and always either jump on the pile after the runner was down or tackle a receiver after he’d already gained a first down, then jump up and celebrate like he’d just nailed the triple Salchow to clinch the gold medal in the free skate. The 2008 Gamecock football team sent quite a few admirable, tough, thoughtful guys to the NFL–Kenny McKinley, Ryan Succop and Captain Munnerlyn stick out in my mind–but Culliver was the dumbest, most irresponsible, most disappointing player on a team that included Stephen Garcia. I have cursed his name long after I graduated, and I’ve cursed his name throughout his NFL career.
With that said, being a homophobe is much–I was going to say “better” but instead let’s go with “less bad”–than being an accessory to murder. And since the narrative of this Super Bowl is about Ray Lewis as much as anything else, I find Lewis’ antics to be precisely the kind of showy, fake-Gladiator bullshit that really makes me hate football sometimes.
And as much as I really could not be bothered to care about the religious beliefs and practices of 99 percent of professional athletes, I’d like to contrast Lewis’ showy piety with the showy piety of Tim Tebow. From what I’ve read, Tebow wears his religion on his sleeve because it’s part and parcel of his identity, and always has been. Lewis noisily underwent a religious awakening, and while I am really in no position to call shenanigans on his Road to Damascus moment, I would like to offer the following quote:
“And when you pray, do not be like the hypocrites, for they love to pray standing in the synagogues and on the street corners to be seen by others. Truly I tell you, they have received their reward in full. But when you pray, go into your room, close the door and pray to your Father, who is unseen. Then your Father, who sees what is done in secret, will reward you. And when you pray, do not keep on babbling like pagans, for they think they will be heard because of their many words. Do not be like them, for your Father knows what you need before you ask him.”
That’s Matthew 6:5-8, and it’s made me extremely skeptical of, and uncomfortable around, people who make their religious experience, whatever it may be, about other people and not about them and God. I have no time or love for Ray Lewis, and that cancels out whatever affinity I might have for Ed Reed, whose career is as underrated as Lewis’ is overrated, or key Ravens with local ties, like Joe Flacco and Ray Rice, or that Jim Harbaugh is about 80 percent as obnoxiously hypercompetitive as Lewis, or even that I know quite a few Ravens fans, and like all of them, and I generally like it when people I like are happy. I can’t root for Ray Lewis’ team. Can’t do it.
Yeah, so does anyone have any questions that don’t make me want to kill myself?
@tbroomell: “so arsenal just bought a player who’s first name is Nacho, so it got me thinking, top 5 baseball names ever”
Okay, I know Nacho is a food, but apparently it’s also a nickname for Ignacio in Spanish. The Spanish do a lot of things that we in America don’t do, like fascism and idleness and leaving your shirt unbuttoned, but it’s their language and they can do what they want with it. If my name were Ignatius, you could call me Chips and Guac–anything is better than that name. Besides, what language do you think the name for the food came from?
But yeah, my top 5 favorite baseball names ever. This is by no means an exhaustive list, so if I’ve missed a favorite of yours, please alert me/the other readers in the comment section and we’ll have a good marvel and a chuckle together. Here goes:
You know, I’m really not much of a musical theatre person. I was in a production of Godspell once, which was about as much fun as I’ve ever had, though Liz Roscher tells me that everyone’s been in Godspell so it’s not that big a deal. Plus I was in pit orchestra for The King and I and A Funny Thing Happened on the Way to the Forum in high school. I don’t remember a thing about the latter play, except apparently there was provocative dancing by attractive girls in skimpy outfits. Or at least so I’m told, because unlike most other musicals my high school did, the orchestra books for A Funny Thing…divided the woodwind part not by instrument, but into five generic woodwind books that contained parts for one or more of flute, clarinet or saxophone. I got stuck with one that was clarinet, bass clarinet and bari sax, because I played all three of those at the time. This not only meant that I had to tote two huge cases to and from rehearsals, but during “The House of Marcus Lycus,” which was the song with all the provocative dancing, I had my back to the stage and my eyes on the sheet music because the bass clarinet was the only instrument that played the whole song.
I never saw the dancing. This was devastating to me at the time, because not only was I a 16-year-old boy, I was the kind of 16-year-old boy who played more than one woodwind instrument and volunteered for pit orchestra.
I guess that’s kind of a long-winded way of saying that I don’t know that I’ve seen five musicals in person, though I have seen several film adaptations and perused even more soundtracks, many of them at the behest of my younger brother, who is a massive theatre buff. I will say that I absolutely despised Rent.
Anyway, my top 5 favorite musicals, though I’m not really expert enough to comment intelligently on quality:
1776. Because it’s a musical for people who don’t really like musicals all that much.
The Producers. Because I love Mel Brooks and to my knowledge there hasn’t been a stage adaptation of Robin Hood: Men in Tights.
Godspell. Just because of my own emotional attachment to the show.
My Fair Lady.
West Side Story. Because all racially-motivated gang violence should involve such great choreography.
I really should get around to seeing Pirates of Penzance and Candide, though, because both have fantastic music.
@DashTreyhorn: “Favorite character and season of Friday Night Lights?”
There’s apparently been an epidemic of binge-watching this show going around the Philly Sports Twitterverse, by which I mean I did it a couple months back and I’ve noticed 3 other people doing the same since then. It’s really a great show that completely sucked me in. Its biggest flaw is being a little too saccharine and prime-time teen soap opera-ish at times, but I liked it for different reasons than I liked other critically-acclaimed dramas from recent years. For my money, Mad Men is the best TV drama I’ve ever seen, just because of the maniacal attention to detail in the writing, acting and direction. It is the pursuit of perfection, and its best episodes come damn close. There’s one scene from Season 5’s “Commissions and Fees” that is, in my opinion, the best scene of narrative fiction in any medium that I’ve ever encountered.
While Mad Men is extremely literary, The West Wing and The Wire were extremely smart. Both were as much pieces of social commentary (though about completely different components of the cultural and political spectrum) as pieces of narrative fiction.
Friday Night Lights, however, is fairly smart at times, and well-written, but I liked it because of how wholeheartedly and earnestly emotional it was. I thought my way through The Wire but I felt my way through Friday Night Lights, if that makes sense, and when it was over, I was really sad that I’d never encounter those characters again, because I liked them so much. I cried during the series finale even though it was a pretty awful episode, because I was sad the story was over.
Anyway. My favorite parts. I think the first season was my favorite, though like The West Wing, there wasn’t really one season for me that stood head and shoulders above the others, and there were things that I couldn’t stand about the seasons I liked best.
So given that I liked the first season best, it’s kind of weird that my favorite character on the show is Jason Street, and here’s why. I liked Jason Street so much because watching him develop a personality was unbelievably fun, and he’d be far and away the most likeable character on any show that didn’t have Matt Saracen (who’s the most likeable character in the history of TV, including, like, Elmo), Tami Taylor and Tim Riggins. Plus Luke Cafferty is pretty easy to root for.
Now, Tim Riggins is probably the most imaginative character, probably the best-written character, and acted well enough that I’m kind of puzzled by how bad Taylor Kitsch was in everything else I’ve seen him in. He’s everyone’s favorite character, just like Omar is everyone’s favorite character in The Wire. But while Omar was great fun, I never considered him to be as interesting and deep a character as Stringer, and I kind of feel the same way about Riggins and Jason Street. In a show full of likeable characters, Street gets lost in the shuffle a little bit, and I can’t really figure out why.
Also, and this doesn’t get said enough, this show falls apart without Landry Clarke and, to a lesser extent, Tinker in the later seasons.
@kgeich67: “If I sent you a Delmon Young t-shirt jersey what are some things you would do with it? Be creative.”
Probably wear it out, get rip-roaring drunk and punch a Jewish person in the face. Because that seems to be a favorite pastime of The Great Satan, Delmon Young.
I’d wear it, to be honest. Maybe just around the house, and I’d pretend that it somehow made me dark and dangerous, like a fake mustache in the movies. I’d put on the Delmon Young shirsey, a blazer and cowboy boots, wear aviators and smoke outrageously large cigars while drinking scotch or vodka–and I really don’t much care for either, but I get the impression that that’s what bad guys drink in the movies. I’d adopt an emotionally abusive attitude toward women, and women would flock to me, because my aloofness would remind them of their fathers. Essentially, I’d become the hero of Who Will Survive, and What Will Be Left of Them by Murder by Death.
Okay, someone needs to send me such a shirsey, because that sounds like a riotous good time. DM me and I’ll give you my address and shirt size.
That’s all for this week, and I still didn’t get to that Jonathan Singleton question I’ve had in the hopper for 3 weeks now. Well, you’ll have to check back next Friday morning for that and more. Until then, enjoy the big handegg game on Sunday. Eat something with hot sauce on it for me.