Please Stop Calling Cliff Lee Streaky

A week after I wrote that the NL Cy Young might be a two-man race, Cliff Lee appears to have pitched himself back into contention. The night after that post went live, Lee capped off a brilliant August with 8 2/3 shutout innings against the Reds, and, on Monday, followed that up with his league-leading 6th complete game shutout of the year against the Braves. His numbers now fall right in line with those of Halladay and Kershaw, and it’s impossible to exclude him from any discussion about the NL’s best pitcher in 2011.

Lee’s dominance this season, at least to me, has seemed under-advertised by media and fans. It’s understandable, to an extent. Everyone breathlessly waits to see what Roy Halladay can do next, and for good reason. Cole Hamels is presently filling in the zeroes on his next contract with each gem of a start, and, anticipating that he’ll stick around, we want to feel out just how devastating his new repertoire can be, as if 2010 wasn’t evidence enough. Vance Worley is the new rookie surprise story, on a staff that was hardly wanting for reasons to watch. Added to all that, Lee began the season with some starts that were a strange mix of high strikeout and earned run totals, just when the expectations for the new mega-rotation were fresh and uncompromising.

All of these factors have contributed to a strange notion that I’ve seen in more than a few places: Cliff Lee is “streaky,” or “inconsistent.” I’m not making this up:

Dare to say it; Lee has been somewhat inconsistent this season, with two historic months of dominance surrounded by some fairly modest months of performance.

He is at times the best pitcher in the world, and during others he’s just another pitcher. If you look at his monthly splits this season, Lee has put in two months of ridiculous, epic and historic work.

I want to stress here that this is not meant as a dig at Philadelphia Sports Daily or Jim McCormick. He’s a very good beat writer, one of the best and one of my favorites, actually. This was just the easiest example to cite. Bill Petti, a writer at Beyond the Boxscore whose work I also enjoy, wrote about it too. They’re simply elaborating on something that a lot of other people on blogs, Twitter, radio, newspapers, and in broadcasts have said at some point or other this season. In most cases they’ve said it because of this:

It’s easy to see why someone would look at this data and conclude that Cliff has had, at the very least, a strange year. The June through August stretch is particularly schizophrenic, at least when measured by ERA. Of course, ERA never tells the whole story. His BABIP fluctuated wildly over that period, from .191 to .359 to .237. His strikeout rate was actually at its lowest during his incredible June, and was lower in his 0.45 ERA August than it was in his 4.18 ERA April. A simple results-based evaluation is insufficient; it’s much more complicated than the number of earned runs he has allowed from one month to the next. If we fade ERA out a bit, and add FIP and xFIP to the above graph, this becomes all the more obvious:

That smooths things out quite a bit, doesn’t it? For one thing, the big split between his FIP and xFIP in July indicates that home runs allowed per fly ball was the source of his outcome woes that month, and indeed that metric was severely inflated, at 18.8%. If his ERA had fallen in line with his FIP or xFIP, no one would seriously accuse him of streakiness. The first graph now looks like a superficial take, at best. We can’t say for sure that there weren’t some perfectly good reasons for his BABIP and HR/FB fluctuations month-to-month (in particular his pop-up rate spiked heartily in his low-BABIP August), but, really, isn’t that the point? When you chunk data out into such small samples, you’ll end up with the murkiest of portraits no matter what brushes you use.
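For anyone who wants the arithmetic behind those two extra lines, here is a minimal sketch of the commonly published FIP and xFIP formulas. The league constant (3.10), the league HR/FB rate (10.5%), and the stat line are all placeholder assumptions chosen for illustration; they are not Lee's actual numbers.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching. `constant` is an assumed league constant."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, bb, hbp, k, ip, lg_hr_per_fb=0.105, constant=3.10):
    """xFIP: same as FIP, but actual home runs are replaced by
    fly balls times an assumed league-average HR/FB rate."""
    expected_hr = fb * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical month: 6 HR on 32 fly balls (an inflated 18.75% HR/FB),
# 8 BB, 1 HBP, 35 K in 36 innings. Innings are decimal, not "36.1" notation.
month = dict(bb=8, hbp=1, k=35, ip=36.0)
month_fip = fip(hr=6, **month)
month_xfip = xfip(fb=32, **month)

print(f"FIP {month_fip:.2f}, xFIP {month_xfip:.2f}")
```

With an inflated HR/FB rate like the made-up one here, FIP comes out well above xFIP, which is the kind of gap described for July.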

This is especially true when the criterion for that chunking is as entirely arbitrary as calendar months. The Gregorian calendar was rolled out by the head of the Catholic Church more than 400 years ago, primarily because the previous calendar had a nasty habit of shifting the spring equinox further out of alignment with Easter each year. Baseball evolved gradually in 19th-century America from a variety of ancestral stick-and-ball games. The two have nothing to do with one another. There is no reason that Cliff Lee’s pitching ability should have anything to do with the ambitions of Pope Gregory XIII, or the orbital mechanics of the Moon. As Twitterati member @Everybody_Hits noted a while back, you can redefine the calendar months and Cliff Lee’s “consistency” problem disappears. What if each month began on the 25th instead of the 1st?

Now, instead of the wild month-to-month sine wave, Cliff has had a great four-month stretch from April 25th to August 24th, bookended by a decent March/April and two fantastic August/September starts. It’s impossible to call this an up-and-down season. Even in moving the endpoints, though, we are still submitting to the tyranny of the Moon, sticking with 30-day periods to define our months. Again, there is no reason why we should do this. It has just as much to do with baseball as migratory bird patterns and seasonal wheat harvests. So, hey, let’s break out Lee’s performance according to the rotation of Lambda Andromedae, a G-type giant binary star located approximately 84 light years from Earth. It happens to have a rotational period of 54 days.
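The re-bucketing itself is mechanical, and a rough sketch shows how little it takes to redraw the picture. The function name `era_by_window` and the game log below are hypothetical, made up for illustration; this is not Lee's real 2011 log, and it is not necessarily how the graphs in this post were built.

```python
from datetime import date


def era_by_window(starts, origin, window_days):
    """Group (date, innings, earned_runs) starts into fixed-length windows
    counted from `origin`, and return {window_index: ERA}."""
    totals = {}
    for day, innings, earned_runs in starts:
        idx = (day - origin).days // window_days
        ip, er = totals.get(idx, (0.0, 0))
        totals[idx] = (ip + innings, er + earned_runs)
    return {idx: 9 * er / ip for idx, (ip, er) in sorted(totals.items()) if ip > 0}


# Hypothetical game log (date, decimal innings, earned runs).
starts = [
    (date(2011, 4, 2), 7.0, 3),
    (date(2011, 4, 20), 6.0, 4),
    (date(2011, 5, 1), 8.0, 1),
    (date(2011, 5, 28), 9.0, 0),
    (date(2011, 6, 5), 7.0, 5),
    (date(2011, 6, 25), 9.0, 0),
]

by_30 = era_by_window(starts, date(2011, 4, 1), 30)  # lunar-ish "months"
by_54 = era_by_window(starts, date(2011, 4, 1), 54)  # one stellar rotation

for label, table in (("30-day", by_30), ("54-day", by_54)):
    print(label, {idx: round(era, 2) for idx, era in table.items()})
```

The same six starts produce three quite different "months" under the 30-day split and two much calmer stretches under the 54-day split. Changing `origin` shifts the endpoints in the same way the 25th-of-the-month version does.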

Now, if I were to claim that Cliff Lee draws his pitching abilities from the machinations of a distant star, strengthening his powers with each full rotation, I’d have just as strong a set of empirical legs to stand on as those who would look at his monthly splits and call him streaky. In analyzing baseball, we’re constantly limited by the fuzziness introduced by small sample size even when working with a full season of data (especially for pitchers). Splitting it up further only amplifies the problem. Anyone who chooses endpoints, be it a fan, writer, or broadcaster, does so with a certain agenda in mind, whether they know it or not. Even from those endpoints that seem perfectly natural on their face (monthly splits being an excellent example) there can emerge great thickets of coincidence that masquerade as narratives. If we fail to apply the utmost scrutiny to these, we may allow single-season gems like Cliff Lee’s 2011 to be muddied with baseless criticisms, and that would be a true shame.
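To get a feel for how much noise a month-sized sample can carry, here is a toy simulation of a pitcher whose true talent never changes. Everything in it is an assumption for illustration: the 2.50 "true" ERA, five seven-inning starts per month, and a crude scoring model in which each inning has three independent chances to allow a run.

```python
import random


def simulate_monthly_eras(true_era=2.50, months=6, starts_per_month=5,
                          innings_per_start=7, seed=2011):
    """Return one simulated ERA per month for a pitcher of constant talent.

    Toy model: every inning has 3 independent run chances, each scoring
    with probability true_era / 27, so expected runs per 9 innings
    equals true_era.
    """
    rng = random.Random(seed)
    p_run = true_era / 27
    eras = []
    for _ in range(months):
        innings = starts_per_month * innings_per_start
        runs = sum(rng.random() < p_run for _ in range(innings * 3))
        eras.append(9 * runs / innings)
    return eras


monthly = simulate_monthly_eras()
print([round(era, 2) for era in monthly])
```

Try a few different `seed` values and see how far the monthly figures can wander even though the simulated pitcher is, by construction, exactly the same every time out.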



  1. Santos

    September 07, 2011 07:17 AM

    I personally find it very hard to believe that Cliff Lee’s success has nothing to do with the seasonal wheat harvests. Maybe I’m just a realist.

  2. Nate

    September 07, 2011 07:55 AM

    I think this might be the best thing I’ve read about baseball ever. A truly fantastic job.

  3. Kellie

    September 07, 2011 07:59 AM

    Wonderful piece. I love the conclusion.

  4. ej

    September 07, 2011 08:21 AM

    well then why not look at the most obvious interval, starts? If you look at Lee’s runs per start, I am sure you will see the same type of “streakiness” that is displayed in the calendar month.

  5. Bill Pettti

    September 07, 2011 08:28 AM

    Good stuff, Bill.

    Jumping the gun a bit here, but I am getting ready to publish my Volatility findings for pitches (which is a little different from streakiness), hopefully some time this week.

    As far as Lee goes, yes the ERA streakiness thing is overdone. But in terms of how consistent he is, outing to outing, that is a different story.

    I used FIP to measure Volatility for starters, and from 2005-2011 Lee ranked 40th in Volatility (1.58) among those with >= 50 starts. This year, he ranks 48th out of those with >= 20 starts (1.30). Lee is more volatile compared to Halladay (1.21), who ranks 39th. Hamels ranked 12th with .92. Compared to the average (1.61), Lee is certainly above average.

    None of this is to say Lee is a bad pitcher, or unreliable. He’s got one of the best FIPs in the league, and he delivers consistently on that FIP outing-to-outing better than league average.

  6. Dave

    September 07, 2011 08:40 AM

    This is really good stuff, Ryan.

    And I also agree with Bill Petti. I’m not sure how he’s measuring “volatility” but if you look at variance (i think fangraphs did this) in ERA using each start as a data point, you’ll see cliff lee is less consistent, i.e., larger variance, compared to Halladay and Kershaw. Basically driven by higher highs and lower lows. Still though, all around awesome.

    I’m a big fan of those last 2 graphs, and it looks like Lee is getting stronger as the season progresses. It would be interesting to see a graph of his rolling average ERA over 5 starts.

  7. Mike B.

    September 07, 2011 09:00 AM

    Agree with ej and Bill P. You are (by your own admission) simply using different arbitrary time periods to prove your point. Comparing volatility by start is the best way to get at the truth. I’m no sabermetric Luddite, but most people watching Lee intuitively know what Bill’s numbers above quantify — that the quality of Lee’s starts varies more than the other Phils starters. I’d also theorize that his “good” starts vs. his “average” starts tend to be grouped together, creating the appearance of “streakiness.”

  8. Richard

    September 07, 2011 09:05 AM

    Aren’t we really only talking about three bad starts (Atlanta, Washington, San Diego)? Maybe four, if we include the Toronto game with the three-homer meltdown inning.

  9. Ryan Sommers

    September 07, 2011 09:07 AM

    Thanks Bill. I take it lower = less volatile?
    I look forward to reading your full piece on them. If people want to use analysis like that to say that Lee is more volatile from start to start than the average pitcher, that’s fine with me. I don’t particularly care how volatile Lee is — if he produces 200+ IP with an ERA below 3, it doesn’t matter to me how he got there.

    Month splits are probably the endpoints that your average fan refers to most readily, though, and I think it’s clear they portray Lee (and plenty of other players) unfairly.

  10. Jay

    September 07, 2011 09:12 AM

    “if I were to claim that Cliff Lee draws his pitching abilities from the machinations of a distant star, strengthening his powers with each full rotation…”

    Why are you giving away Cliff Lee’s secret a**hole?!?!? You would probably also tell everyone that Clark Kent was Superman. I hate snitches.

  11. Bill Pettti

    September 07, 2011 09:16 AM


    Yes, lower = less volatile. And I agree, monthly splits get you into trouble, as does just focusing on ERA.

    Hoping to have the first article in the series up before the end of this week.

  12. Greg

    September 07, 2011 09:40 AM

    Great article. One of the hardest things to get over in any kind of analysis is personal (or societal) bias based on arbitrary numbers. Another good example is the whole round number thing. If human beings didn’t have ten fingers and ten toes, there probably wouldn’t be such a value put on 20 wins, 100 wins, etc.
    I also probably would win a lot more money in casinos, instead of saying “I’ll just gamble till I’m down to the next hundred…” sigh.

  13. sdphillie

    September 07, 2011 10:44 AM


    Cliff Lee has a secret a**hole?

  14. Mike B.

    September 07, 2011 12:47 PM

    @sdphillie: LOL…well done.

  15. Dustin

    September 07, 2011 01:09 PM

    Can we see Cliff Lee’s performance in relation to the movement of the Dagobah system?

  16. Jon

    September 07, 2011 01:30 PM

    this is like the 4th article I have read of yours….I have no idea what 3/4ths of the acronyms stand for…maybe a key would help? (don’t feel like going to google every 10 seconds).

  17. Lee

    September 07, 2011 03:53 PM

    best crashburn post to date. really well done, man.

  18. Cackalacky Mike

    September 07, 2011 05:03 PM

    Here’s a fun one: Who is the last starting pitcher to finish with an FIP lower than the 2.12 that Doc currently sports?

  19. Santos

    September 07, 2011 05:16 PM

    Pedro 99?

  20. Santos

    September 07, 2011 05:17 PM

    We’ll call that a tie, Bill

  21. Burt Lavallo, friend to all

    September 07, 2011 06:49 PM

    We can keep calling Ibanez streaky, though, right?

  22. Cackalacky Mike

    September 07, 2011 06:57 PM

    Pedro ’99 with a 1.39! The next best FIP that year was Randy Johnson, with a 2.76. Wow.

  23. Cackalacky Mike

    September 07, 2011 07:17 PM

    I admit, though, that I’m a partisan for Doc for the Cy Young. Even in what seems like a ho-hum year–without the shutouts, without the perfecto–he’s got a 2.12 FIP (career best), 2.61 xFIP (career best), 7.5 SO/BB (career best), and a league-leading 7.4 fWAR (if he can pick up another 0.4 WAR, he’ll tie his career high).

    I mean, you can’t go wrong with either of them, right? But for my money, in the biggest game of the year, I want the Doctor on the mound performing surgery.

  24. Phillie697

    September 07, 2011 09:14 PM

    I can’t believe it!!! Did the Phillies just win because Charlie out-managed an opposing manager by using his bullpen more effectively than his opponent?!?!? This is what, the third game Braves threw away to the Phillies by NOT pitching Kimbrel in a tie game this year? Thank god I’m not a Braves fan.

  25. SABR

    September 07, 2011 11:30 PM

    If Kimbrel pitches when it is tied then the Braves are guaranteed to lose the game, since no one else on the staff is capable of pitching in a save situation. Only a select few can handle the pressure that comes with a save.

  26. Phillie697

    September 08, 2011 12:28 AM

    I have no idea if that was trolling or serious…

  27. hk

    September 08, 2011 02:08 AM

    I’m pretty sure it was sarcasm from SABR. And, let’s not give Charlie credit for out-managing Gonzalez. Like Fredi, Charlie goes by the book that says the right strategy is to use your closer in a tie game at home, but not on the road. If the game had been in Atlanta, Charlie would have kept Madson in the pen while Herndon, Kendrick or Lidge would have lost it. It is amazing that managers still employ this flawed strategy – one would think that Torre’s use of this strategy in Game 4 of the 2003 World Series would have led to the book being re-written – and I fear that Charlie will use it in the post-season if given the chance.

  28. Matt H

    September 08, 2011 02:25 AM

    Ok, this post made me laugh pretty hard. But the calendar thing is just a fun fact that happens to illuminate a fundamental aspect of Cliff Lee’s pitching, one that Ryan shouldn’t minimize or hide. Let’s not troll our own beat writers!

    Cliff Lee has a higher variance in outcomes as measured by ERA than other elite starters. Is this luck? Is this a feature of his pitching style? Interesting question. I’m gonna go with the latter but could be convinced to change my mind by a good post on Crashburn Alley.

    I think the relevant variables will be how consistently swing/miss rates translate to Ks, and the tradeoff of higher BABIP on ground balls which invariably means higher variance (for BABIP under .500!) versus the HRs that eventually accompany flyballs. It is likely to me that at high xFIP the variance *might* be higher for the groundball pitcher, while at low xFIP (the type of pitcher we are examining here) the variance is almost definitely higher for the flyball pitcher.

  29. Rob

    September 08, 2011 01:50 PM

    Love the last 2 graphs – classic.

  30. Dan

    September 08, 2011 08:03 PM

    Nice post. However, as a Mets fan, even besides all the reasons I already hate the Phillies (let’s not go into it), the fact that the best pitcher I get to root for on a regular basis is R.A. Dickey while you guys have legit Cy Young contenders (and a close runner-up) is just… god.

    Your 5th starter would be our best pitcher. Kill me now.

  31. Matt

    September 09, 2011 02:58 AM

    Anyone notice that as of 9/9/11, Hamels, Lee, and Halladay all have 28 starts and have given up exactly 56 earned runs? Amazing.
