Every so often, you will be given a statistic featuring a team’s record when a specific player does something. It is most common in the NFL. For instance, when Ray Lewis sacks the quarterback, you will quickly be shown the Baltimore Ravens’ record in games where he records a sack. When the numbers are abnormal, they become their own narrative, and writers and fans alike run wild with them. Recently, the stat has been imported to baseball. The popular one floating around now shows that the Phillies are undefeated when Hunter Pence records a hit.
David S. Cohen of The Good Phight debunked a similar stat a few years ago, but I decided to take it a couple steps further and include the Phillies’ winning percentage when each player scores a run, records a hit, hits a home run, or drives in a run.
Upon looking at the various charts, you should notice a couple things. First, the players with fewer plate appearances (or fewer events recorded) sit at both extremes of each chart, on either side of the players who get regular time in the lineup. This is the effect of a small sample size: the variance is much higher. As the sample grows, winning percentages tend to cluster around the mean.
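To see the small-sample effect in isolation, here is a minimal simulation sketch. It assumes a team with a fixed .600 chance of winning every game (a made-up figure, not taken from the charts) and looks at how far the observed winning percentage wanders when it is measured over 10, 40, or 120 games. The function name and the game counts are illustrative choices.

```python
import random
import statistics

def simulated_win_pcts(true_win_prob, games, trials, rng):
    """Return `trials` observed winning percentages, each measured over `games` games."""
    pcts = []
    for _ in range(trials):
        wins = sum(rng.random() < true_win_prob for _ in range(games))
        pcts.append(wins / games)
    return pcts

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_WIN_PROB = 0.60     # assumed true winning percentage (hypothetical)

spread_by_games = {}
for games in (10, 40, 120):  # illustrative sample sizes
    pcts = simulated_win_pcts(TRUE_WIN_PROB, games, 2000, rng)
    spread_by_games[games] = statistics.pstdev(pcts)
    print(f"{games:>3} games: lowest {min(pcts):.3f}, highest {max(pcts):.3f}, "
          f"std dev {spread_by_games[games]:.3f}")
```

The team is equally good in every simulated stretch, yet the 10-game samples scatter much more widely than the 120-game samples, which is the same pattern the part-time players show.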
Second, the records reflect the importance of each event as well as sampling bias. There are a lot of high winning percentages when a player hits a home run because home runs are the most potent event in baseball, on average. Additionally, players tend to hit home runs off either bad pitchers or good pitchers who are not pitching at their normal level (example: the Phillies hit four homers off San Diego Padres pitching on July 23). As a result, a game in which a player homers is more winnable than normal to begin with. When teams face good pitchers, they tend not to hit home runs and lose more games.
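The sampling-bias half of that argument can be sketched with a toy model. In the simulation below, the player's home run has no effect at all on the outcome of the game; the quality of the opposing pitcher drives both the chance of a homer and the chance of a win. Every probability here is invented for illustration, and the `weakness` scale is hypothetical.

```python
import random

def simulate_games(games, rng):
    """Simulate games in which a homer never changes the result.

    Opposing-pitcher weakness raises both the chance of a homer and the
    chance of a win, and that shared cause is the only link between them.
    """
    wins = hr_games = hr_wins = 0
    for _ in range(games):
        weakness = rng.random()  # 0 = ace, 1 = very hittable (made-up scale)
        homered = rng.random() < 0.05 + 0.25 * weakness  # assumed homer chance
        won = rng.random() < 0.35 + 0.50 * weakness      # assumed win chance
        wins += won
        hr_games += homered
        hr_wins += homered and won
    return wins / games, hr_wins / hr_games

rng = random.Random(1)  # fixed seed so the run is repeatable
overall_pct, hr_pct = simulate_games(200_000, rng)
print(f"overall winning percentage:      {overall_pct:.3f}")
print(f"winning percentage when homering: {hr_pct:.3f}")
```

Even though the homer itself is worth nothing in this model, the record in home run games comes out noticeably better than the overall record, simply because homers show up more often in the easier games.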
What the “Team Record when Player X Does Something” stat tells you is… nothing. It may look like the Phillies’ .933 winning percentage when Raul Ibanez hits a home run is vastly superior to that of Carlos Ruiz and his .750 winning percentage, but Ibanez has only homered in 15 games. Over those 15 games, the .183 difference in winning percentage amounts to roughly three games. When you consider the hefty amount of variance in such a small sample, the difference is not meaningful in the least.
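The arithmetic behind that comparison can be checked in a few lines. The sketch below converts the .183 gap into games over Ibanez's 15 home run games, then puts a rough 95% Wilson score interval around his record. The 14-1 record is inferred from .933 over 15 games, and the Wilson interval is my choice of yardstick, not something used in the charts.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a winning percentage."""
    p = wins / games
    denom = 1 + z**2 / games
    center = (p + z**2 / (2 * games)) / denom
    half_width = z * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2)) / denom
    return center - half_width, center + half_width

ibanez_games = 15    # games in which Ibanez homered (from the text)
ibanez_pct = 0.933   # Phillies' winning percentage in those games
ruiz_pct = 0.750     # Phillies' winning percentage when Ruiz homers

ibanez_wins = round(ibanez_pct * ibanez_games)        # inferred record: 14-1
game_gap = (ibanez_pct - ruiz_pct) * ibanez_games     # the .183 gap, in games
low, high = wilson_interval(ibanez_wins, ibanez_games)

print(f"inferred record: {ibanez_wins}-{ibanez_games - ibanez_wins}")
print(f"gap in games over {ibanez_games} games: {game_gap:.2f}")
print(f"plausible range for Ibanez's true figure: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval around Ibanez's .933 is wide enough to include Ruiz's .750, so the two figures cannot really be told apart.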