Reader Email: Two-Out Runs

Reader Matty sent in this email:

Hi Bill,

I just came across your blog and was looking at all of the statistics you monitor. One of the things I’ve noticed for a long time but that I never hear anyone talk about is runs allowed after retiring the first two batters in an inning.

As a lifelong and diehard Phillies fan, I get so frustrated when a pitcher fails to get the third batter in the inning after retiring the first two. In fact, it happens so often that I’m surprised that the sports radio talk shows and TV analysts don’t ever mention it. Over the years, I have seen so many promising innings turn into a nightmare for the Phils, and it all starts with two outs and no one on base. The inability of a pitcher to get that third hitter after retiring the first two has led to more runs than I can remember. Once that third batter reaches base, the floodgates open and it’s one run after another. In fact, when a Phillies pitcher gets the first two batters out, I often joke that the other team has them right where they want them.

You might remember the 2009 World Series at-bat by Johnny Damon in the 9th inning. Lidge retired the first two hitters, and had Damon down 0-2 in the count. Lidge then gave up 3 runs. Yes, two outs and no one on base, and an 0-2 count on the third batter……and allowed 3 runs! The rest is history.

Oswalt did it earlier this season. Got the first two batters in an inning and then couldn’t get anyone out after that. Allowed 3 runs on five straight hits after having two outs and no one on base.

And that’s just two examples out of many. I’m sure it can’t be me. There must be others who notice this, but yet I never hear anyone bring it up.

So I’d love to hear your take on this, Bill. Is there a way to monitor this? A list of pitchers and their runs allowed after retiring the first two batters in an inning?

Matty

Two-out runs are painful, aren’t they? You can see the finish line just inches away and then it all starts to fall apart. As a fan, it is certainly frustrating to watch.

When we objectively analyze statistics, though, we need to separate our emotions from the conclusions we are trying to reach. Is a two-out run really worse than a run scored with zero outs or one out? It feels different, but it counts exactly the same on the scoreboard.

But Matty also suggests that preventing those two-out runs is a legitimate skill that some pitchers have and others lack. As with other concepts such as clutch hitting and lineup protection, no evidence has shown that this phenomenon exists. Furthermore, one needs to think about the quality of the data that would be involved in such a study.

Consider the portion of Roy Halladay’s splits that breaks down his performance by the number of outs in an inning. It just so happens that Halladay performs best when there are two outs. We need to ask ourselves many questions here, the first of which is, “Is it meaningful?”

Converting Halladay’s performance in the various out-states into wOBA, we come up with .314 with no outs, .293 with one out, and .290 with two outs. The National League average wOBA for the respective out-states is .321, .317, and .303. Converting Halladay’s wOBA into runs, Halladay has been 21 runs above average with no outs (3,493 PA), 67 runs above average with one out (3,213 PA), and 35 runs above average with two outs (3,086 PA). If we prorate that to 300 PA (generally how many PA a pitcher has in each situation in a given season), Halladay is 1.8, 6.3, and 3.4 runs above average, respectively.
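The arithmetic above can be sketched in a few lines of Python. This is only a sketch: the post does not say how wOBA was converted into runs, so the code assumes the conventional wOBA scale of roughly 1.15, and the function names are made up for illustration. Under that assumption it reproduces the figures quoted above.

```python
# Assumption: a wOBA scale of about 1.15, the conventional value.
# The post does not state the constant it used.
WOBA_SCALE = 1.15


def runs_above_average(pitcher_woba, league_woba, pa, scale=WOBA_SCALE):
    """Runs saved relative to a league-average pitcher over `pa` plate appearances."""
    return (league_woba - pitcher_woba) / scale * pa


def prorate(runs, pa, target_pa=300):
    """Rescale a run total to a fixed number of plate appearances."""
    return runs / pa * target_pa


# (Halladay wOBA allowed, NL average wOBA, Halladay PA), all from the text above.
splits = {
    "no outs": (0.314, 0.321, 3493),
    "one out": (0.293, 0.317, 3213),
    "two outs": (0.290, 0.303, 3086),
}

for state, (woba, league, pa) in splits.items():
    runs = runs_above_average(woba, league, pa)
    per_300 = prorate(runs, pa)
    print(f"{state}: {runs:.0f} runs above average, {per_300:.1f} per 300 PA")
```

Laid out this way, the prorated numbers depend only on the wOBA gap in each out-state, so the difference between Halladay's two-out figure and his one-out figure comes down to about a hundredth of a point of wOBA relative to the league.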

No matter which way you slice it, Halladay’s two-out performance is not meaningfully different from his performance in the other out-states. Cliff Lee has a similar split, as does Roy Oswalt. If you can think of a starting pitcher who is significantly better with two outs than with zero outs or one out, I would be glad to take a look (assuming we are dealing with an appropriately sized sample), but at least as it pertains to the Phillies, no one gains or loses effectiveness based on the number of outs in the inning.

Additionally, various forms of selection bias will be at play, all of which affect the reliability of the data:

  • Veteran pitchers tend to finish innings (meaning they are prone to more one- and two-out situations than younger pitchers)
  • Pitchers with good reputations tend to finish innings (semi-related to the above)
  • Pitchers who are currently pitching poorly are less likely to get further outs (due to their own incompetence or the manager’s refusal to let them pitch further)
  • Pitchers with better defenses will face more one- and two-out situations
  • National League pitchers will face more one- and two-out situations due to the lack of a DH and the presence of the pitcher in the lineup, as well as the increased use of sacrifice bunting
  • For relievers, some tend to start innings while others tend to come in with runners on base and, if they’re lucky, one or two outs already recorded

More sources of selection bias abound, and almost all of them need to be considered before taking the data at face value.

Finally, as the above chart illustrates, the Phillies really aren’t any worse on the whole than other teams with two outs. Their .672 OPS allowed with no outs ranks 13th out of 16 (NL average is .729); their .652 OPS allowed with one out ranks 14th out of 16 (NL average is .717); and their .679 OPS allowed with two outs ranks 9th out of 16 (NL average is .681). The standard deviation on the OPS allowed ranges from about .040 to .060. So, the most certain thing you can say about the Phillies’ performance is that their true talent level with two outs is somewhere between .640 and .710. And that’s just the lazy way to get a feel for the talent range.
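For a rough sense of scale, the gaps between the Phillies and the league can be expressed in standard-deviation units. The sketch below uses only the OPS figures quoted above. The .050 standard deviation is an assumption (simply the midpoint of the quoted .040 to .060 range), and the function name is made up for illustration.

```python
# OPS allowed by out-state, taken from the paragraph above.
phillies = {"no outs": 0.672, "one out": 0.652, "two outs": 0.679}
nl_average = {"no outs": 0.729, "one out": 0.717, "two outs": 0.681}

# Assumption: .050 is just the midpoint of the quoted .040-.060 range,
# not a figure given in the post.
ASSUMED_SD = 0.050


def sd_units(team_ops, league_ops, sd=ASSUMED_SD):
    """Gap between team and league OPS allowed, in standard-deviation units."""
    return (team_ops - league_ops) / sd


for state in phillies:
    gap = phillies[state] - nl_average[state]
    z = sd_units(phillies[state], nl_average[state])
    print(f"{state}: {gap:+.3f} OPS vs. NL average ({z:+.2f} SD)")
```

Negative values mean fewer bases allowed than the league. Under the assumed standard deviation, the two-out gap of .002 is a tiny fraction of one standard deviation, which is the sense in which the Phillies look like an ordinary team with two outs.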

All of the above really just tells us that you can’t draw any conclusions about talent based on out-state, especially for individual players in individual seasons. If players do have a legitimate skill that makes them better or worse based specifically on the number of outs in an inning, then we need to see evidence for that. Given our current data, the most we can do is fail to reject our null hypothesis, which is that there is no discernible difference between our various sets of data.