“The incredible record of Joe DiMaggio in the summer of 1941 is unparalleled. No one has come close—before or since—to equaling his streak of hitting safely in 56 games in a row.”
So begin Steve Strogatz and Sam Arbesman from Cornell University in their paper discussing the likelihood of DiMaggio’s record.
“People have…stated that it is the only record in baseball (or perhaps even in all of sports) that never should have happened, statistically speaking: while other records can be explained by expected outliers over the long and varied history of professional baseball (nearly 150 years), DiMaggio’s record stands alone.”
But as with so many statistical assumptions, a proper analysis can reveal counterintuitive results, say Strogatz and Arbesman. The pair have modelled the phenomenon of hitting streaks using a number of simple models and guess what…DiMaggio’s record is not as unexpected as it looks.
The models suggest that while a DiMaggio-like record is unlikely in any given year, it is not unlikely to have occurred about once within the history of baseball.
But when the pair plugged the statistical performance of a number of players into the models, DiMaggio did not come out as the player most likely to have set such a record. That honour goes to one of Ross Barnes, Willie Keeler or Hugh Duffy (there is no single most likely player). DiMaggio, it turns out, is the 47th most likely player to have reached the record in one of the models used.
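To give a feel for this kind of simulation, here is a minimal Monte Carlo sketch in Python. It is not the authors' model: it assumes a fixed 4 at-bats per game, independent at-bats and a constant batting average, and every function name and parameter is made up for illustration.

```python
import random


def game_hit_probability(batting_average, at_bats_per_game=4):
    """Chance of at least one hit in a game, assuming independent at-bats.

    The 4 at-bats per game default is an illustrative assumption.
    """
    return 1.0 - (1.0 - batting_average) ** at_bats_per_game


def longest_streak(games):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for got_hit in games:
        current = current + 1 if got_hit else 0
        best = max(best, current)
    return best


def simulate_season(p_hit, n_games, rng):
    """Simulate one season and return the longest hitting streak in it."""
    return longest_streak(rng.random() < p_hit for _ in range(n_games))


def estimate_streak_probability(p_hit, n_games, target, trials, seed=0):
    """Fraction of simulated seasons containing a streak of at least `target` games."""
    rng = random.Random(seed)
    successes = sum(
        simulate_season(p_hit, n_games, rng) >= target for _ in range(trials)
    )
    return successes / trials


if __name__ == "__main__":
    # Hypothetical inputs: a .350 hitter playing a 150-game season.
    p = game_hit_probability(0.350)
    print(estimate_streak_probability(p, n_games=150, target=56, trials=20000))
```

A single simulated season will almost never contain a 56-game streak. The point of the exercise is that a tiny per-season chance, repeated over many thousands of player-seasons, can add up to something that is no longer surprising.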
More curious is why Strogatz, widely considered to be the father of small-world network theory, has turned his attention to baseball statistics. He joins a small but select group of physicists and mathematicians with a passion for the game, including Gene Stanley and Persi Diaconis.
So what’s next? Surely the task now is to find a record that defies statistics in the sense that it is truly unlikely. Let me be the first to suggest Don Bradman’s 99.94 batting average in Test cricket.
Ref: arxiv.org/abs/0807.5082: A Monte Carlo Approach to Joe DiMaggio and Streaks in Baseball