How Predictive Are Statcast Metrics?
Intro
Baseball is a hard sport to predict because of all the random variation in it,
and grappling with that randomness probably accounts for about half of
sabermetrics. That got me curious about how predictive Statcast metrics really are.
For this assignment, I looked at 2017-2021 player-seasons with at least 200
plate appearances, which gives me a sample of 1548 player-seasons. A lot has
been written about how long it takes for Statcast metrics to stabilize, and it
turns out they stabilize much faster than other stats like batting average.
I’ve read that Statcast metrics stabilize within 70 batted balls (some within
40), which a player will easily reach by the time he has 200 plate
appearances. Statcast’s main metrics are the batted ball ones, such as average
and max exit velocity, since those are the ones that actually need Statcast
in order to exist, but any player’s Baseball Savant dashboard will also show
plate discipline metrics like out-of-strike-zone swing percentage (AKA chase
rate) and swings-and-misses-on-swings percentage (AKA whiff rate).
This is Aaron Judge's Statcast dashboard.
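The sample selection described above (2017-2021 seasons, at least 200 plate appearances) can be sketched in a few lines of Python. This is only an illustration: the `PlayerSeason` record, the `qualifying` helper, and the sample rows are hypothetical and are not taken from the project's actual files.

```python
from typing import NamedTuple


class PlayerSeason(NamedTuple):
    """One row per player per season (hypothetical layout)."""
    player_id: int
    season: int
    plate_appearances: int


MIN_PA = 200                 # plate appearance cutoff used in this post
SEASONS = range(2017, 2022)  # 2017 through 2021 inclusive


def qualifying(rows):
    """Keep only player-seasons inside the window that meet the PA cutoff."""
    return [
        r for r in rows
        if r.season in SEASONS and r.plate_appearances >= MIN_PA
    ]


# Made-up rows, purely to show the filter's behavior.
rows = [
    PlayerSeason(1, 2019, 650),  # kept
    PlayerSeason(2, 2019, 150),  # dropped: too few plate appearances
    PlayerSeason(3, 2016, 600),  # dropped: outside 2017-2021
    PlayerSeason(1, 2021, 200),  # kept: exactly at the cutoff
]
kept = qualifying(rows)
print(len(kept), "qualifying player-seasons")
```

The cutoff is inclusive here, matching the "at least 200 plate appearances" wording.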
How Did Statcast Metrics Change in 2021?
Did Statcast metrics change in 2021? We do know one thing: the baseball was different this year. In recent years, there had been lots of talk about the baseballs being “juiced” for more distance and home runs (though Major League Baseball tried to pretend otherwise for a while).
The new baseball didn’t seem to have much of an effect on home runs: in 2020, 3.5% of plate appearances resulted in home runs, and in 2021 that number only dropped to 3.3%. Even though the home run rate barely changed, there still could have been an effect on Statcast metrics, and I want to make sure that I’m comparing apples to apples when comparing 2021 to prior years. The plan for the new baseball, according to The Athletic, was “reducing the weight of the ball by less than one-tenth of an ounce, and also a slight decrease in the bounciness of the ball.” One comparison I heard (unfortunately I can’t remember the source) likened the old baseball to a basketball and the new, lighter, less-bouncy baseball to a balloon. A struck basketball would come off with a lower exit velocity but travel farther, while a struck balloon would come off with a higher exit velocity but travel a shorter distance. Did the distributions of the metrics reflect this?
In 2021, it looks like high max exit velocities were more frequent than in
2017-2020, which supports the prediction that the higher-drag, lighter ball
would have higher exit velocities. All in all, though, the distribution is
pretty similar. The same can be said about the average exit velocity
distribution, where the 2021 line looks identical to the 2017-2020 one, just
shifted slightly to the right. Hard hit rate is the percentage of a player’s
batted balls that are 95.0 mph or faster, so it is no surprise that similar
patterns are also present in its distribution comparison. Barrel rate is a
similar metric. Instead of looking at batted balls above a specific exit
velocity, it looks for batted balls with exit velocity and launch angle combos
that have produced at least a .500 batting average and 1.500 slugging
percentage in the Statcast era. Barrel rate is an interesting one to look at
because it deals with more than just exit velocity. Once again, its
distributions look identical, with 2021 shifted slightly to the right to
reflect a small bump in exit velocity. Because the compared distributions for
these four metrics have such similar shapes, and because the shift in 2021 is
still relatively tiny, I do think it is fair to lump 2021 stats in with
2017-2020 stats. However, it is important to keep in mind that year-to-year
changes are present. xwOBA, meanwhile, has stayed the same, which makes sense
to me because it is more of an all-encompassing stat.
R-Squared (Coefficients of Determination)
R² is a number that quantifies what share of the variability in the y variable
can be explained by the x variables. When used between just two variables, it
can be calculated by squaring the Pearson correlation coefficient between the
two. First, I looked at how strongly each metric determines its own value the
next year. For example, the first value shows how strongly a player’s average
exit velocity in the current year determines his average exit velocity the
next year.
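The calculation described above can be sketched as follows: pair each player-season with the same player's next season, then square the Pearson correlation between the two columns. This is a minimal sketch under stated assumptions. The helper names and the `(player_id, season)` dictionary layout are hypothetical, and the exit velocities are made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def r_squared(xs, ys):
    """R-squared between two variables: the squared Pearson correlation."""
    return pearson_r(xs, ys) ** 2


def pair_consecutive_seasons(values):
    """Match each (player_id, season) value with the same player's next season.

    `values` maps (player_id, season) -> metric value. Seasons without a
    following season for that player are skipped.
    """
    current, following = [], []
    for (player_id, season), value in values.items():
        next_value = values.get((player_id, season + 1))
        if next_value is not None:
            current.append(value)
            following.append(next_value)
    return current, following


# Made-up average exit velocities (mph), purely illustrative.
avg_ev = {
    (1, 2018): 90.0, (1, 2019): 91.0,
    (2, 2018): 86.0, (2, 2019): 87.5,
    (3, 2018): 88.0, (3, 2019): 88.0, (3, 2020): 89.0,
}
current, following = pair_consecutive_seasons(avg_ev)
print(len(current), "paired seasons, R^2 =", round(r_squared(current, following), 3))
```

Note that a player with three straight qualifying seasons contributes two pairs, so the number of pairs is smaller than the number of player-seasons.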
The big takeaway here is that batted ball metrics have much stronger
year-to-year R² values than standard stats like batting average and
slugging percentage, and even wOBA. Of the four batted ball stats, max exit
velocity is the strongest year-to-year, which makes sense to me because it
seems more like a true skill stat than a performance stat. The highest R²
values, however, belong to the plate discipline metrics, showing that most
players really are who they are when it comes to their plate discipline habits.
Which metrics are the strongest determinants of how well the player performed
in the current year? I use wOBA as the rate stat for overall offensive output.
There are a few takeaways here. Among the batted ball metrics, max exit
velocity is by far the lowest, which again makes sense because it is more of a
skill stat than a performance stat. Barrel rate is the highest, which feels
right to me because I view barrel rate as the most informative of the four.
Another takeaway is that strikeout rate has almost no relationship with wOBA,
yet walk rate has a notable R². Slugging percentage is much higher than
batting average, which is to be expected considering wOBA is weighted (a
double is worth more than a single, and so on) and batting average carries a
lot of random variation with it.
Lastly, we have the most important set of R² values, the one we came here for:
the relationship between each metric and a hitter’s wOBA the next season. The
biggest takeaway for me is that xwOBA has a stronger value than wOBA itself.
Clearly the inventors of xwOBA are doing a good job of estimating true talent
level. Meanwhile, plate discipline metrics and batting average are very weak.
Slugging percentage is much higher than batting average, which is to be
expected considering the randomness that comes with batting average. The most
surprising thing for me is seeing which batted ball metrics are stronger than
others. My prediction would have been that barrel rate would be the strongest
determinant of next year wOBA, since its definition seems to track most
closely with how often a batter hits a ball well, yet it is lower than both
hard hit rate and average exit velocity. Personally, I’ve always ignored
average exit velocity in favor of barrel rate because by definition it seemed
less informative, so I’m surprised to see it higher and find this very
insightful. Meanwhile, none of the batted ball stats are stronger determinants
than slugging percentage, and it’s neat to see a non-batted ball,
non-expected, matter-of-fact, old school number performing better.
Machine Learning
I was curious how accurately these stats could predict a player’s next year
wOBA using machine learning. In my algorithms, I included 12 variables as
features: plate appearances, walk rate, strikeout rate, average exit velocity,
barrel rate, max exit velocity, hard hit rate, chase rate, xwOBA, expected
batting average, age, and whiff rate. I wanted to include plate appearances
because I figured seasons with larger samples could carry more weight, and I
wanted to include age because players entering or leaving their prime could
see their next year wOBA behave differently. As a baseline, I predicted every
player in the test dataset to have the same next year wOBA as his current year
wOBA. This produced a root mean square error (RMSE) of .044. An error of .044
is significant in the context of wOBA because wOBA is measured in the
thousandths: the difference between a .300 wOBA hitter and a .344 wOBA hitter
is pretty large. Can a machine learning model predict more accurately?
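The baseline described above, predicting that next year's wOBA equals this year's wOBA, can be scored with a few lines of Python. This is a minimal sketch, not the notebook code behind the .044 figure; the `rmse` helper is written out by hand and the wOBA values are made up.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root mean square error between predictions and actual values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Made-up wOBA values, purely illustrative.
current_woba = [0.320, 0.350, 0.290, 0.400]
next_woba = [0.340, 0.330, 0.300, 0.360]

# Naive baseline: each hitter repeats his current year wOBA next year,
# so the "predictions" are simply the current year values.
baseline_rmse = rmse(current_woba, next_woba)
print(f"baseline RMSE: {baseline_rmse:.3f}")
```

Because RMSE squares each miss before averaging, one large miss (the .400 hitter falling to .360 here) pulls the score up more than several small ones.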
For this assignment, I tried four different machine learning algorithms:
decision trees, decision forests, support vector machines, and neural
networks. Each algorithm produced similar RMSE results, with the decision
forest the best at .036. While this is more accurate than the .044 baseline,
it is still disappointing not to see it any lower. It really goes to show how
hard it is to predict performance the next season.
When looking at which features were most important in the decision forest
model, it’s no surprise to see xwOBA at the top, considering we saw earlier
that it had the strongest R² value with next year wOBA.
Something interesting to look at is the actual decision tree created by the decision tree algorithm, which can be seen above. The variable names in the feature importance plot and decision tree correspond like so: X0: plate appearances, X1: walk rate, X2: strikeout rate, X3: average exit velocity, X4: barrel rate, X5: max exit velocity, X6: hard hit rate, X7: chase rate, X8: xwOBA, X9: expected batting average, X10: age, X11: whiff rate.
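The X0 through X11 codes above are easier to read once they are translated back to feature names. The sketch below shows one way to do that lookup and to rank a list of importances by name. The helper names are hypothetical, and the importance numbers are placeholders chosen only so that xwOBA ranks first, as the post reports; they are not the model's actual output.

```python
# Feature order taken from the X0..X11 mapping listed in this post.
FEATURE_NAMES = [
    "plate appearances", "walk rate", "strikeout rate",
    "average exit velocity", "barrel rate", "max exit velocity",
    "hard hit rate", "chase rate", "xwOBA",
    "expected batting average", "age", "whiff rate",
]


def label_feature(code):
    """Translate a code such as 'X8' into its feature name."""
    if not code.startswith("X"):
        raise ValueError(f"unexpected feature code: {code!r}")
    return FEATURE_NAMES[int(code[1:])]


def rank_importances(importances):
    """Pair each importance with its feature name, largest first."""
    if len(importances) != len(FEATURE_NAMES):
        raise ValueError("expected one importance per feature")
    pairs = zip(FEATURE_NAMES, importances)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Placeholder importances (NOT the model's real values).
placeholder = [0.05, 0.05, 0.05, 0.05, 0.05, 0.05,
               0.05, 0.05, 0.40, 0.10, 0.05, 0.05]
ranked = rank_importances(placeholder)
print(label_feature("X8"), "|", ranked[0][0])
```

Keeping the names in a single ordered list means the same list can label both the feature importance plot and the split nodes of the decision tree.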
To complete this project, I used data from Fangraphs, Baseball Savant, and the Chadwick Baseball Bureau. The data work and analysis were completed in RStudio, and the machine learning was completed in Python Jupyter notebooks. Special thanks to Dr. Varol Kayan, whose Python scripts taught in class were heavily borrowed from to complete this project. You can find the files I created for this project on GitHub.