SIERA Through 17 Games

Last year, Matt Swartz and Eric Seidman came up with a great pitching metric called SIERA, or Skill Interactive Earned Run Average. It is a lot like FIP and xFIP, but it specifically accounts for a pitcher’s batted ball skills, as well as his strikeout and walk rates. Last year, Roy Halladay led all of Major League Baseball in SIERA at 2.93, although that was significantly higher than his 2.44 ERA. Cliff Lee finished fourth; Cole Hamels 11th; and Roy Oswalt 14th. After the Phillies signed Lee in the off-season, I wrote, “The Phillies have one-third of baseball’s top-12 pitchers from 2010” (at the time of the writing, Oswalt was 12th; there may have been slight tweaks in the Baseball Prospectus database that altered the rankings slightly).

The fearsome foursome could certainly make up one-third of baseball’s top-12 (or 14, if you’d prefer) in 2011 as well. It is still too early in the season to tell, though. In fact, Baseball Prospectus has not yet posted the 2011 SIERA leaderboard (here is last year’s). I, however, am curious and used my handy-dandy spreadsheet to take a quick look. Here are the inputs:

Pitcher   TBF  SO  BB  GB  ofFB  ifFB
Halladay  117  25   5  45    22     3
Lee       132  28   4  37    32     8
Oswalt     71  14   4  26    16     5
Hamels     73  18   5  22    12     1
Blanton    80  14   4  31    12     3

TBF: Total Batters Faced; SO: Strikeouts; BB: Walks; GB: Ground Balls; ofFB: Outfield Fly Balls; ifFB: Infield Fly Balls

The output, in SIERA, which is scaled to ERA:

  • Halladay: 3.18
  • Lee: 3.24
  • Oswalt: 3.70
  • Hamels: 3.13
  • Blanton: 3.65
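For readers who want to check the inputs, here is a minimal sketch that turns the raw counts from the table into per-batter rates. It does not reproduce the SIERA formula itself, whose coefficients are not given in this post. The function name `component_rates` and the "GB share" figure are illustrative assumptions; in particular, GB share is computed only over the three batted-ball types listed in the table, so it leaves out line drives and will not match a published ground ball rate.

```python
# Raw inputs copied from the table above: (TBF, SO, BB, GB, ofFB, ifFB).
inputs = {
    "Halladay": (117, 25, 5, 45, 22, 3),
    "Lee": (132, 28, 4, 37, 32, 8),
    "Oswalt": (71, 14, 4, 26, 16, 5),
    "Hamels": (73, 18, 5, 22, 12, 1),
    "Blanton": (80, 14, 4, 31, 12, 3),
}


def component_rates(tbf, so, bb, gb, of_fb, if_fb):
    """Return strikeout and walk rates per batter faced, plus ground ball share.

    Hypothetical helper: gb_share uses only the batted-ball types in the
    table (no line drives), so it is not the same as a published GB%.
    """
    classified = gb + of_fb + if_fb
    return {
        "k_pct": so / tbf,
        "bb_pct": bb / tbf,
        "gb_share": gb / classified,
    }


rates = {name: component_rates(*row) for name, row in inputs.items()}

for name, r in rates.items():
    print(
        f"{name:<9} K% {r['k_pct']:.1%}  "
        f"BB% {r['bb_pct']:.1%}  GB share {r['gb_share']:.1%}"
    )
```

With samples of 71 to 132 batters faced, a single extra strikeout or walk moves these rates by roughly a percentage point, which is worth keeping in mind when reading the rankings.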

Perhaps surprisingly, Hamels has been the best of the bunch so far, contrary to his 4.32 ERA. He is the victim of a .367 BABIP, pitching quite well otherwise — getting a lot of swings and misses, being stingy with the free passes, and inducing a bunch of grounders. My Cy Young pick for the National League, Hamels is in for some regression in the BABIP department, but it should be slightly counter-balanced by his home run rate, as the lefty has yet to allow a round-tripper.

Least surprisingly, Halladay came within a hair of first place in the Phillies’ rotation, in terms of SIERA. Not quite the swing-and-miss maven, Halladay has instead found success by rarely issuing walks and getting ground balls in bunches. He is the odds-on favorite to once again lead the Majors in SIERA. Should that happen, expect yet more hardware to appear on Halladay’s mantel in the off-season.

Lee rounds out a tightly-packed top-three. His strikeout and walk rates are better than Halladay’s, but he lags behind in SIERA due to his sub-40 percent ground ball rate, which is roughly 13 percent lower than Halladay’s. He has been a bit BABIP-unlucky, so you should expect his 3.91 ERA to drop quite quickly.

Blanton ranks fourth, perhaps surprisingly. He has actually been quite good: his 7.3 K/9, 2.1 BB/9, and 54 percent ground ball rate are excellent, especially for a #5. Blanton is the most unlucky of the Phillies’ five, sitting with a .373 BABIP and the highest HR/FB rate on the team (13 percent).

Oswalt, who has recently been bothered by back problems, is in fifth. His strikeout and walk rates are good, as is his ground ball rate, but taken together they are not quite as good as those of Hamels, Halladay, and Lee. Additionally, Oswalt has been BABIP-lucky. His .240 BABIP should eke its way towards .300 in future starts, but it would be nice if Oswalt could continue the streak of luck that started when he joined the Phillies last season.

Finally, let’s have a quick peek at SIERA for the Phillies’ eighth- and ninth-inning guys, Ryan Madson and Jose Contreras.

  • Madson: 2.26
  • Contreras: 2.53

Both have been great thus far. The Phillies are fortunate to have two extremely good arms pitching in most of the high-leverage innings.

Note: I used batted ball data from FanGraphs, which had not yet updated with information from Wednesday’s games. As such, I went into the play-by-play from yesterday’s afternoon match against the Milwaukee Brewers and interpreted the data myself. The data is subject to human error, which could be significant given the small sample sizes. If any errors are spotted, feel free to point them out in the comments.


  1. The Howling Fantods

    April 21, 2011 07:37 AM

    Agreed. Keep it up!

  2. LTG

    April 21, 2011 08:06 AM

    Could you explain why we should reason strictly in terms of luck w.r.t. BABIP? I would think that part of the explanation of Blanton’s high BABIP is that he has thrown a lot of pitches in the middle of the plate or up in the zone. Those pitches are much more likely to turn into line drives, hard-hit ground balls, and gappers. But this is not luck; it is bad pitching. Something similar could be said in the other case where a pitcher has a low BABIP. Why shouldn’t we demand a more fine-grained stat that distinguishes between bad luck and skill? (Btw, I’m tempted to disregard SIERA just because it ranks Blanton higher than Oswalt. While it might measure the general trends of what successful pitchers do, it can’t account for the difference between pitchers like Oswalt who can keep hitters from barreling a ball and pitchers like Blanton who often can’t.)

  3. Nik

    April 21, 2011 08:09 AM

    I’d be curious to see who’s been our worst pitcher. What are the SIERAs of the Herndon/Kendrick/Baez triumvirate?


    April 21, 2011 08:09 AM

    Pretty interesting that Hamels has the best SIERA so far this year, and I’d be curious to see what the numbers for the staff will look like by season’s end. Honestly I wouldn’t be surprised if Hamels, Halladay, Lee, and even Oswalt generate some buzz in the Cy Young voting. But then again, I expected Hamels to get some Cy Young votes last year…but he appeared to be largely overlooked due to his W/L record, despite the fact that all his other numbers were excellent and worthy of votes. Even if Cole has the best year of the 4 pitchers this year, he’s going to have a hard time getting serious Cy Young consideration unless he wins 15+ games. Even though Felix Hernandez won the AL Cy Young last year with a mediocre W/L, I still think there’s going to be some resistance from some old-timers who still think a W/L record actually means something major.

  5. JB Allen

    April 21, 2011 08:50 AM

    Thanks for the SIERA tutorial. Why aren’t ground balls broken down like fly balls are? Aren’t line drives a different beast than slow dribblers? Would breaking this down help address the BABIP discrepancies, or has history shown that this distinction isn’t meaningful?

  6. Mick Shmitt

    April 21, 2011 09:01 AM

    Joe Blanton has not “been quite good.” That is just silly. And while luck may contribute to his bloated BABIP, bad pitching through three games contributes as well. To what degree or percentage each contributes, nobody can tell. But to imply that Blanton has just been unlucky is not accurate. 24 hits and an ERA over 7 in three starts is not good, no matter how you spin the stats.

  7. Phylan

    April 21, 2011 09:58 AM

    This statistic disagrees with my preconceptions, therefore it must be wrong.

  8. Phylan

    April 21, 2011 10:08 AM

    Anyway, LTG, only about 12% of BABIP for pitchers is attributable to skill. Blanton’s line drive rate is right at his career average, and it happens to be within .3% of, for example, Roy Halladay’s career average. Pitchers have to throw balls in the strike zone. “He’s throwing them right down the middle” and “up in the zone” are conclusions people arrive at after the fact to explain adverse outcomes, rather than conclusions based on any actual analysis of pitch location.

    Whether or not you want to believe it, hits and earned runs don’t even begin to tell the full story on how a pitcher is doing.

  9. JB Allen

    April 21, 2011 11:02 AM

    Phylan – Point taken about luck in BABIP, but “only” 12% is attributable to skill? Over the course of a season, isn’t 12% kind of a big deal? The article you link to also notes that defense accounts for about 13% of BABIP, and changes in teams’ defense have had measurable impacts on those teams’ success.

  10. KH

    April 21, 2011 12:01 PM

    Anybody see Eric Karabell’s blog post on ESPN’s Sweet Spot trying to defend his comments that Hamels and Oswalt aren’t ace-type starting pitchers? Dovetails pretty nicely with this post about SIERA. People are still way too hung up on traditional statistics. Both Hamels and Oswalt are squarely in the top 10-20 starting pitchers in baseball, which to me is squarely ace territory.

  11. Dan

    April 21, 2011 12:29 PM

    @KH, I did see it, and attempted to make a post about how Hamels actually has better numbers than one of his listed aces (Price), but I forgot ESPN took away my posting abilities (which is laughable in and of itself).

    @COAL, I don’t think it was that people thought he wasn’t deserving of it, so much as they all agreed that Halladay earned it, and they’re on the same team. It’s understandable (although wrong) that they wouldn’t want to give too much credit to just one team.

  12. Phylan

    April 21, 2011 12:57 PM

    The problem with the “ace” stuff is that no two people have a common definition, the criteria vary massively from person to person, and people who just “feel” that a given pitcher isn’t an ace will make up more criteria to exclude them.

  13. Rob

    April 21, 2011 01:12 PM

    @JB Allen

    I also think people tend not to consider the 10-12% of BABIP that might be within a pitcher’s control.

    For instance, Cole Hamels’ 2009 BABIP was only ~10% higher than his career average, yet it’s pretty well agreed upon here that he was unlucky that year.

  14. LTG

    April 21, 2011 02:24 PM


    1) Thanks for some clarification of my questions.

    2) The snark is unnecessary, and I think you misinterpreted what I wrote. I posed questions expecting they had reasonable answers not as a counter-argument. If alternate explanations cannot be raised in order to be rejected (or even retained) then learning cannot take place.

    3) Pitch-tracking seems to be a way to verify location without reasoning backward from bad result to bad pitch. Further, Rich Dubee stated publicly that Blanton had not been locating his pitches well. This is at least evidence in favor of offering an alternate explanation.

    4) Stats are only good as a reflection of the phenomena. I agree that the statistical revolution has brought baseball statistics more in agreement with the relevant phenomena for evaluating performance. But I have specific concerns regarding a) BABIP’s ability to discern between luck and skill b) the fact that using general trends to account for particular cases always leads to ignoring outliers that could be significant because they undermine the universality of the theoretical framework. Thus, it is always relevant to ask questions about what accounts for the anomalies between theory and appearance. Otherwise, theory becomes just one more preconception.

    In sum, I’d appreciate being treated with the same respect that everyone else is due and being interpreted in a generous way. I have no agenda or bias but simply am an interested fan trying to understand the game and the Phillies better. If rational analysis and argument are valued here, then I’d suggest snark is in foul territory.

  15. Phylan

    April 21, 2011 03:09 PM

    It’s not BABIP’s job to discern between luck and skill; it is our job to do that. If you read the link I posted closely, particularly the bottom part (although admittedly it’s fairly math heavy), it’s actually a rather simple exercise to quantify the proportions with which luck, skill, park factors, and defense contribute to BABIP. If you meant to suggest that Blanton is an “outlier” and therefore his inflated BABIP is significant, I would point to his career BABIP, which sits right at .301.

    Also, what do you mean by “pitch-tracking” in this context?

    That first bit was more in response to Mick Shmitt’s unhelpful comment. Sorry if you were offended but I really wasn’t being disrespectful.

  16. Bill Baer

    April 21, 2011 03:43 PM

    @ LTG

    Could you explain why we should reason strictly in terms of luck w.r.t BABIP?

    Luck may be a bit of a misnomer. Luck is certainly part of it — a big part, I’d say — but not the only part. You mention that Blanton has been leaving pitches over the plate. Phylan was correct to say that that conclusion is something a lot of people make after the fact, especially without consulting pitch data. If a pitcher is really bad about leaving pitches over the plate, we would see that show up in his statistics: a consistently above-average BABIP in a large sample size, or an above-average HR/FB, etc. (a la Adam Eaton). That’s not the case with Blanton. We only have 17 innings of data for him, so our error bars for his stats are huge. Even if Blanton truly is leaving too many pitches over the middle of the plate, we should expect that to even out over the course of the season.

    (Btw, I’m tempted to disregard SIERA just because it ranks Blanton higher than Oswalt. While it might measure the general trends of what successful pitchers do, it can’t account for the difference between pitchers like Oswalt who can keep hitters from barreling a ball and pitchers like Blanton who often can’t.)

    If Oswalt truly had a legitimate skill to prevent hitters from “barreling” the baseball, we would see that show up in his peripherals. However, his career BABIP is only .003 lower than Blanton’s (.295 to .298) and his HR/FB rate is 0.5% lower (9.1% to 9.6%). Oswalt has been more successful than Blanton due to much better strikeout and walk numbers, and slightly better ground ball rates.

    @ JB Allen

    Why aren’t ground balls broken down like fly balls are?

    In a perfect world, the batted ball data is 100% reliable and put into even buckets, but that’s just not the case. I would certainly be interested to see if some pitchers induce weaker ground balls than others, assuming it’s a skill that shows up over several years of data.

    @ Mick Shmitt

    Joe Blanton has not “been quite good.” That is just silly. And while luck may contribute to his bloated BABIP, bad pitching through three games contributes as well.

    What Sabermetrics attempts to do is separate performance from results. Blanton’s ERA may be ugly, but it could be due to a number of factors: bad luck, poor defense, pitching in an egregious percentage of hitter-friendly ballparks, etc. Stats like xFIP and SIERA attempt to isolate the factors we know a pitcher controls, which are his strikeout, walk, and batted ball rates. They are the best predictors of future performance.

    24 hits and an ERA over 7 in three starts is not good, no matter how you spin the stats.

    Similar arguments were made about Hamels in 2009. Generally speaking, pitchers just do not have a lot of control over the rate at which they allow hits. Kyle Kendrick has a career .290 BABIP while Halladay is at .292.

    @ LTG

    Further, Rich Dubee stated publicly that Blanton had not been locating his pitches well. This is at least evidence in favor of offering an alternate explanation.

    It may very well be the case that Blanton has not been locating his pitches well. However, he will throw close to 3,000 pitches by the end of the season if he stays healthy. Blanton’s “poor control” to start this season will be washed out as the sample size increases (law of large numbers). If it doesn’t, then we do have a legitimate issue on our hands. Given our current super-small sample size, there is no reason to worry about Blanton’s terrible control yet.
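The sample-size point can be made concrete with a rough sketch. It treats every ball in play as an independent coin flip with a fixed chance of becoming a hit, which is a simplifying assumption rather than a claim about how BABIP really behaves. The "true" BABIP of .300 and the ball-in-play counts are illustrative values, not Blanton's actual figures.

```python
import math


def babip_standard_error(true_babip, balls_in_play):
    """Standard error of an observed BABIP under a simple binomial model.

    Assumes each ball in play is an independent trial with the same hit
    probability; real batted balls are not that tidy.
    """
    return math.sqrt(true_babip * (1 - true_babip) / balls_in_play)


TRUE_BABIP = 0.300  # assumed underlying rate, for illustration only

# Hypothetical ball-in-play counts: a few starts, a half season, a full season.
for n in (50, 300, 600):
    se = babip_standard_error(TRUE_BABIP, n)
    low, high = TRUE_BABIP - 2 * se, TRUE_BABIP + 2 * se
    print(f"{n:>4} balls in play: roughly {low:.3f} to {high:.3f}")
```

Under these assumptions, the plausible range after a handful of starts is wide enough to contain both a .240 and a .370 BABIP, while a full season's worth of balls in play narrows it considerably.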

  17. LTG

    April 21, 2011 04:18 PM


    I appreciate your response and apologize if I misunderstood your original post (although I maintain that snark just doesn’t belong here, even when provoked, and I don’t think Mick Shmitt’s comment was provocative either).

    To the point that it is not BABIP’s job to discern between luck and skill: right. From that we should conclude that arguments of the form BABIP is high/low compared to mean => pitcher is un/lucky are invalid without further analysis. That further analysis has to include consideration of the other factors (skill, defense, ballpark) as well as the on-field evidence provided by the pitcher’s performance. A pitcher’s BABIP could be high as a result of a series of outings like Hamels had against the Mets, or because the pitcher can’t locate his fastball as well as in the past (as seemed to be the case for Blanton until his last outing).

    The outlier comment was meant to suggest that variance in statistics like BABIP could be due to mere chance or to something else that has not yet been considered. In the interpretive process attributing error to the data is indistinguishable from misunderstanding the data unless there is a good reason to attribute error rather than offer an alternative explanation. Regression to the norm might be normal but it doesn’t always happen.

    “Pitch-tracking” refers to the apparent ability of teams and broadcasts to determine where each pitch was in the strike zone.

    Finally, regarding the “heavy math” comment (I can’t access the math anyway), how the math works (which I would understand fine) isn’t an interesting question. The interesting question is whether and how the math reflects the relevant phenomena. This depends on understanding the inductive move from a slew of data points to how the statistic captures it. That’s the question I was trying to ask above, and so even referencing the percentage attribution per BABIP-factor doesn’t quite do the trick (though I am willing to trust experts here).

  18. Phylan

    April 21, 2011 04:59 PM

    No offense to Bill, but I think he might agree that snark is decidedly not out of place on this blog.

    I don’t really understand your last bit. If we agree that about 12% of BABIP variance is attributable to skill, why is suggesting that Blanton’s .373 BABIP in the face of his .301 career BABIP is a product of bad luck invalid?

    Anyhow, as for the anecdotal location observations: as a quick exercise, I peeked at the PITCHf/x data of Halladay and Blanton. If you define “middle of the plate” vertically as the middle third of the average strike zone, and horizontally as the middle third of home plate, then Halladay has placed 27 of 452 pitches there (5.9%) and Blanton has placed 17 of 261 pitches there (6.5%).
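The two shares quoted above can be recomputed from the raw counts, and a quick check gives a sense of whether the gap is larger than chance alone would produce. The two-proportion z-test below is an added illustration, not something the comment itself ran, and it assumes each pitch is an independent trial. The helper names are hypothetical.

```python
import math


def share(count, total):
    """Fraction of pitches that landed in the middle-middle zone."""
    return count / total


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic (assumes independent pitches)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Counts quoted in the comment above: (middle-middle pitches, total pitches).
halladay = (27, 452)
blanton = (17, 261)

z = two_proportion_z(*blanton, *halladay)

print(f"Halladay share: {share(*halladay):.2%}")
print(f"Blanton share:  {share(*blanton):.2%}")
print(f"z statistic:    {z:.2f}")
```

A z statistic well inside plus or minus 1.96 would mean the difference between the two pitchers is comfortably within what random variation produces at these sample sizes.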

  19. LTG

    April 21, 2011 07:22 PM

    On the question of the BABIP inferences, I was speaking generally and will concede that Blanton has been the victim of some bad luck this year. How much bad luck remains to be seen since we would like to assume that he is still the pitcher with a .301 BABIP mean (or is it .298? I think FanGraphs had .301.) but he might not be. What if that big upswing from last year is indicative of a trend toward ineffectiveness…

    On snark, I would argue that it is always an irrational response if your aim is to enlighten your interlocutor. Of course, if your aim is to take joy in superiority then it is perfectly rational. (It could be a joke among friends, but do you know everyone who posts here well enough to treat snark as playfulness?)

    Finally, I’d be curious to know whether the not-huge differences between Oswalt’s and Blanton’s K/9, BB/9, and GB% numbers correlate to the drastic difference in their value over the course of their respective careers. Any justified opinions you could throw my way would be appreciated.
