  #1146  
Old 11-21-2021, 11:32 PM
Snowman
Travis
Tra,vis Tr,ail
 
Join Date: Jul 2021
Posts: 1,958

Quote:
Originally Posted by cardsagain74
The article did have some good points, but I agree that its whole "FIP is all that matters" conclusion is too simplistic and goes too far. And some of the points were really grasping at straws; the quotes from Maddux and Pedro were an especially poor attempt to help prove the merits of the study (of course a long scoreless innings streak will have a lot of luck...what does that have to do with that specific discussion?)

I've noticed that when it comes to sports and gambling, statisticians love to claim as many "this is completely random" findings as they possibly can. A lot of that probably has to do with being the devil's advocate about the general public's often faulty attempts to find reason in trends or insufficient statistics.

And with having such a passion to do so, it's easy for them to go too far in the other direction (and be too quick to dismiss the possible meaning in some numbers)

The underlying problem is that every statistic you read really should come with a confidence interval attached to it. But of course that's too confusing for most people, it would probably just annoy everyone, and it's impractical. The reality is that most of these statistics are actually estimates of the athlete's underlying "true" abilities. Mike Trout's "true" batting average is some unknowable number, but we can estimate it using statistics. And that's precisely what we do.

After the first game, he goes 3 for 4, so we estimate it to be 0.750. Well, that's not going to fool anyone, because nobody hits 0.750, so we wait for more data. After a month, he's still hitting 0.414 though. Hell, by the all-star break, he's still hitting 0.392. That's after nearly 100 games and 400 at-bats! Surely that's a large sample, right? Has he turned a corner? Rumors start spreading about him "putting in work in the off-season". They say he's "really focused now", etc.

But none of this fools the statistician, because we don't read his batting average as 0.392. We understand that 0.392 is just an estimate of his "true" batting average and that we can calculate a 95% confidence interval around this estimate from the standard deviation and sample size associated with it. So, instead of reading it as 0.392, we more appropriately read it as something like 0.392 +/- 0.048. In other words, we are 95% confident that his "true" batting average is somewhere in the range of 0.344 to 0.440, which is still a window nearly 100 points wide and ultimately just isn't all that helpful. Because we know this, we are hesitant to say things like "Trout is a better hitter this season than Harper since Trout is hitting 0.392 and Harper is only hitting 0.333 at the all-star break". The truth is, we just don't have enough data to make that determination. The sample sizes are simply too small, the standard deviation is too large, and thus the confidence intervals are too wide to be able to make claims "with confidence" about that statistic.
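
For anyone who wants to check the arithmetic, here is a rough sketch of how such an interval could be computed. It assumes each at-bat is an independent trial with a fixed hit probability (a simple binomial model with a normal approximation), which is my simplification, not a full model of hitting. The hit totals (157 and 133 hits in 400 at-bats) are hypothetical numbers picked to land near 0.392 and 0.333, and `batting_average_ci` is just a name I made up.

```python
import math

def batting_average_ci(hits, at_bats, z=1.96):
    """Normal-approximation 95% interval for a batting average.

    Assumes each at-bat is an independent trial with a fixed hit probability.
    """
    avg = hits / at_bats
    std_err = math.sqrt(avg * (1 - avg) / at_bats)
    half_width = z * std_err
    return avg, avg - half_width, avg + half_width

# Hypothetical all-star-break lines (not real season totals).
trout = batting_average_ci(157, 400)
harper = batting_average_ci(133, 400)

# The intervals overlap if each lower bound sits below the other's upper bound.
intervals_overlap = trout[1] <= harper[2] and harper[1] <= trout[2]

for name, (avg, low, high) in (("Trout", trout), ("Harper", harper)):
    print(f"{name}: {avg:.3f} ({low:.3f}, {high:.3f})")
print("Intervals overlap:", intervals_overlap)
```

Under these assumptions the two intervals overlap, which is why the 59-point gap alone doesn't settle who the better hitter is.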

The same is true for something like ERA from season to season. It is a highly volatile statistic. When we say something like "it has too much variance", we mean that literally. Mathematically speaking, variance is the square of the standard deviation. Some statistics have extremely large standard deviations, like ERA, batting average, OBP, etc., whereas other statistics have MUCH lower variance/standard deviations. Stats like FIP vary far less than ERA. This means we can compare two pitchers at the all-star break with much greater confidence by comparing their FIPs than we can by comparing their ERAs. It is a mathematical property of the inherent differences between those statistics.

The same is true of K/9 and BB/9. They have lower variance than ERA, and thus have much narrower confidence intervals. A statistician might be able to read Koufax's K/9 rate at the all-star break with a fairly narrow confidence interval because of this. So they might read his K/9 of 10.1 as being something like 10.1 +/- 0.4, making comparisons against other pitchers much more feasible. If two pitchers' confidence intervals do not overlap, then you can say that you are 95% confident that Koufax is a better strikeout pitcher, because his 10.1 +/- 0.4 K/9, or as an interval, (9.7, 10.5), is greater than some other pitcher whose K/9 confidence interval is (8.8, 9.6). Note the bottom of Koufax's range (9.7) exceeds the top of the other pitcher's range (9.6), so we can state with confidence that he is indeed better.

However, this is rarely possible to say with ERAs. The confidence intervals with those are just absolutely massive, even after an entire season. One pitcher's ERA of 2.60 may look quite a bit better than someone else's 3.05, but we just can't state that with confidence because their intervals might be something like 2.60 +/- 0.75 and 3.05 +/- 0.65, resulting in ranges of (1.85, 3.35) and (2.40, 3.70).
And since those intervals overlap, we cannot state with confidence that they are truly different or that one is clearly better than the other. This is why an asshole like myself says something along the lines of, "that doesn't mean shit", whereas someone more tolerant might say something like, "the standard deviations of that statistic are too large and the sample sizes are too small for us to be able to make a determination about the differences between those two data points".

One of the most fascinating aspects of baseball, which is probably a big part of why I love the game as much as I do, is that the game truly is subject to a MASSIVE amount of variance. Great hitters can hit 0.348 one season and 0.274 the next. People will come up with all sorts of explanations about what is causing the slump, whether his home life is affecting him too much, if he's injured or just experiencing a mental lapse, etc. However, the informed fan knows that this is simply within expectations, and looks to statistics like BABIP to help shed light on what the actual underlying cause is. Maybe the guy just got some lucky bounces last season and some unfavorable ones this season. Or perhaps he didn't. Perhaps his BABIPs are the same, and there really is something going on in his personal life or he really is injured. But variance/luck needs to be ruled out first, because if it's present, then you already have your answer.

This is also precisely why I stated earlier that I see no reason to believe that Randy Johnson was tanking games in Seattle in 1998 before being traded to Houston that season. At first glance, his numbers appear to tell a significantly different story (ERA of 4.33 in Seattle and 1.28 in Houston). But when you dig in closer and look at the confidence intervals associated with those deltas, and look at his FIP, K/9, and BABIP values, and the confidence intervals around those, you'll see that they all overlap. We simply don't have enough data to say that those numbers are truly different, even though they certainly appear to be, and read that way to the non-statistician.
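
The interval comparisons above (the K/9 example and the two ERAs) boil down to a simple overlap check, sketched below with the illustrative numbers from this post. The helper names `interval` and `overlaps` are made up for the example. Treat non-overlap as a conservative rule of thumb: two intervals can overlap a little and a proper test of the difference can still come out significant.

```python
def interval(estimate, half_width):
    """Turn 'estimate +/- half_width' into a (low, high) tuple."""
    return (estimate - half_width, estimate + half_width)

def overlaps(a, b):
    """True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

# K/9 example: 10.1 +/- 0.4 against another pitcher at (8.8, 9.6).
koufax_k9 = interval(10.1, 0.4)
other_k9 = (8.8, 9.6)

# ERA example: 2.60 +/- 0.75 against 3.05 +/- 0.65.
era_a = interval(2.60, 0.75)
era_b = interval(3.05, 0.65)

print("K/9 intervals overlap:", overlaps(koufax_k9, other_k9))
print("ERA intervals overlap:", overlaps(era_a, era_b))
```

The K/9 intervals are separated, so the strikeout comparison holds up; the ERA intervals overlap, so that comparison does not.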

But these things do in fact matter. This isn't just some statistician's "opinion". We can actually calculate these things mathematically. We can also calculate the probability that pitcher A will have a lower ERA than pitcher B by the end of the season based on their differences at the all-star break. And if the formula says that pitcher A is only 50% likely to finish with a lower ERA than pitcher B, based on their current ERAs and the confidence intervals associated with them, and if we run those comparisons for all pitchers in the league, we really will be "wrong" on about 50% of them at the end of the season, because these confidence intervals reflect real-world probabilities that will play out in the future. That's the beauty of the discipline of statistics. It's all based on sound theory that has been proven mathematically.
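
As a rough illustration of that kind of calculation, here is one way the probability could be approximated. This is a sketch under strong assumptions, not the formula a professional model would use: it treats each pitcher's "true" ERA as normally distributed around his current ERA, treats the two pitchers as independent, and converts each 95% half-width to a standard deviation by dividing by 1.96. The function names are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_a_lower(mean_a, half_a, mean_b, half_b, z=1.96):
    """Approximate P(pitcher A's true ERA < pitcher B's true ERA).

    Assumes independent normal uncertainty; half_a and half_b are the
    half-widths of each pitcher's 95% interval.
    """
    sd_a = half_a / z
    sd_b = half_b / z
    sd_diff = math.sqrt(sd_a ** 2 + sd_b ** 2)
    return normal_cdf((mean_b - mean_a) / sd_diff)

# ERA example from earlier in this post: 2.60 +/- 0.75 vs 3.05 +/- 0.65.
p = prob_a_lower(2.60, 0.75, 3.05, 0.65)
print(f"P(A's true ERA is lower than B's): {p:.2f}")
```

With those example intervals the probability comes out a bit over 0.8, so the pitcher with the 2.60 is probably better but far from certain. Two pitchers with identical ERAs would come out at exactly 0.5.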

Last edited by Snowman; 11-21-2021 at 11:38 PM.