Sam Harris on Statistical Concepts

Author, philosopher, and neuroscientist (what a combo!) Sam Harris appeared on the Joe Rogan Experience yesterday for a few hours of discussion. Towards the end, Harris and Rogan discussed some statistical concepts such as sampling bias, the hot hand fallacy, and probability. As this blog makes heavy use of Sabermetrics and statistical concepts in general, I felt that part of the discussion was quite enlightening.

You can watch the discussion below, or click the link for the transcript. Be warned that there is some strong language (not much), and a very brief discussion of religion, so use your discretion if you are easily offended.


Those points apply to baseball as well. There is a lot of sampling bias involved in justifying belief in clutch hitting, for example. The ninth-inning walk-off will always register more strongly in your memory than a ninth-inning ground out that sent the game into extra innings. When people justify calling a player clutch, they rattle off all of his clutch hits, but they don’t put those hits within the context of his total chances, nor do they consider other factors that may have been at play.

The hot hand fallacy gets some play in baseball, especially when players go on hitting streaks. Jimmy Rollins went 1-for-6 in the first game of what would become a 38-game hitting streak in 2005-06. If you had asked anyone during the first 13 games whether Rollins was dialed in, they most likely would have said yes. However, Rollins actually put up a lackluster .254/.318/.356 line over that span. He wasn’t any more likely to get a hit then than he was in any other situation at that time.
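To see why a string of games with a hit says little about how well someone is actually hitting, here is a rough sketch. It assumes four at-bats per game and treats every at-bat as an independent trial at a fixed batting average; both are simplifications of mine, not figures from Rollins' actual game logs, and the function names are made up for illustration.

```python
def p_hit_in_game(avg, at_bats=4):
    """Chance of at least one hit in a game, assuming `at_bats`
    independent at-bats at batting average `avg` (a simplification)."""
    return 1 - (1 - avg) ** at_bats


def p_streak(avg, games, at_bats=4):
    """Chance of getting a hit in each of `games` consecutive games
    under the same assumptions."""
    return p_hit_in_game(avg, at_bats) ** games


if __name__ == "__main__":
    print(f"Hit in a single game at .254: {p_hit_in_game(0.254):.3f}")
    print(f"Hit in 13 straight games at .254: {p_streak(0.254, 13):.4f}")
```

Under those assumptions a .254 hitter still gets at least one hit in roughly 69% of his games, so a streak can keep going while the batting line stays ordinary.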

Rare events certainly drive baseball narratives. The Red Sox slipped out of post-season contention at the end of the regular season last year, and the collapse was blamed post hoc on the team’s consumption of beer and fried chicken in the clubhouse. Other teams have thrown away their post-season hopes in worse ways than the Red Sox, and it had nothing to do with beer and fried chicken. The Red Sox went 7-20 (.259) in September, which is bad. But if you swapped their September with their June (16-9, .640), the beer-and-chicken narrative would go away. It’s certainly not impossible for a .557 team to lose 20 of 27 games in a given stretch. Some of those stretches happen early and some happen late, but it’s only when they happen late that we really pay attention and assign meaning to them.
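For a rough sense of how unusual a 7-20 stretch is for a .557 team, here is a minimal sketch. It assumes every game is an independent coin flip with a fixed .557 chance of winning and a 162-game season, which ignores opponents, injuries, and everything else; the function names are my own. The first function gives the exact binomial chance for one particular 27-game stretch, and the simulation estimates how often at least one such stretch turns up somewhere in a season.

```python
import random
from math import comb


def prob_at_most(wins, games, p):
    """Exact binomial probability of winning `wins` or fewer of `games`."""
    return sum(comb(games, k) * p**k * (1 - p) ** (games - k)
               for k in range(wins + 1))


def season_has_slump(p, rng, season=162, window=27, max_wins=7):
    """Simulate one season; True if any `window`-game stretch
    contains `max_wins` or fewer wins."""
    results = [1 if rng.random() < p else 0 for _ in range(season)]
    wins = sum(results[:window])
    if wins <= max_wins:
        return True
    # Slide the window forward one game at a time.
    for i in range(window, season):
        wins += results[i] - results[i - window]
        if wins <= max_wins:
            return True
    return False


def prob_slump_in_season(p, trials=5000, seed=0):
    """Monte Carlo estimate of the chance of at least one slump per season."""
    rng = random.Random(seed)
    return sum(season_has_slump(p, rng) for _ in range(trials)) / trials


p_single = prob_at_most(7, 27, 0.557)
p_season = prob_slump_in_season(0.557)

if __name__ == "__main__":
    print(f"One specific 27-game stretch: {p_single:.4f}")
    print(f"Somewhere in a 162-game season: {p_season:.4f}")
```

The chance for any one pre-chosen stretch is small, but a season contains many overlapping 27-game windows, so the chance of seeing such a slump somewhere is noticeably larger. Nothing in the model cares whether the slump lands in June or September.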