Analyzing Strike Zone Data From the Statcast Database
With my Statcast database in Oracle nice and handy, let’s have some more fun and look at strike zone data. Statcast records where a pitch crosses the plate with two variables, plate_x and plate_z, which work like the x and y coordinates on a graph. Both are measured in feet, not inches.
The strike zone is as wide as home plate, which is 17 inches (1.417 feet). The middle of the plate is a plate_x of 0, with pitches to the left having a negative plate_x and pitches to the right a positive one, so the left edge of the plate is at -0.7083 and the right edge is at 0.7083. The rule book says only part of the baseball has to touch the strike zone for the pitch to be a strike, though, so the zone is effectively a little wider than 17 inches. A baseball is about three inches in diameter, so counting the full width of a ball that just clips either edge, the effective width is slightly under 23 inches.
Plate_z is much easier to interpret: it is simply how many feet above the ground the pitch is. While every batter has the same strike zone width, the top and bottom of the zone depend on the batter. Because of this, Statcast records a height for the top of the strike zone (sz_top), the midpoint between the belt and shoulders, and for the bottom (sz_bot), the bottom of the kneecap.
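As a quick check on the inches-to-feet arithmetic, here is a minimal Python sketch. It is only an illustration and not part of the Oracle setup; the function name plate_edges_ft and its pad_ft parameter are made up for this example.

```python
INCHES_PER_FOOT = 12.0
PLATE_WIDTH_IN = 17.0  # width of home plate, as stated in the text


def plate_edges_ft(pad_ft=0.0):
    """Return the (left, right) plate_x edges of the plate in feet.

    pad_ft is a hypothetical extra margin added to each side.
    """
    half_width_ft = PLATE_WIDTH_IN / INCHES_PER_FOOT / 2.0
    return -(half_width_ft + pad_ft), half_width_ft + pad_ft


left, right = plate_edges_ft()
print(round(right - left, 3))           # 1.417
print(round(left, 4), round(right, 4))  # -0.7083 0.7083
```

Passing a non-zero pad_ft widens both edges by the same amount, which is one simple way to allow for pitches that only clip the zone.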
Now let’s get into some fun querying. What were the lowest pitches that were called strikes?
The lowest called strike was thrown to Michael Brantley and
was 0.93 feet (11.16 inches) off the ground. That’s pretty low, and it is just one of
four called strikes that crossed the plate less than one foot off the ground.
Ranking called strikes by how far they were below the batter’s strike zone
bottom (sz_bot) instead gives a second list of ten. It has a few carryovers
from the previous one, but it does give a better answer to the question.
Brandon Drury wins here: he had a pitch 8.64 inches below his strike zone
bottom called a strike. Here’s a clip of it, and if you guessed that he was
upset with the call then you’d be correct. Aaron Judge appears in this list,
which isn’t surprising considering that low strikes called on him have become
a thing because of his height. In the table above, his strike zone bottom was
over two feet above the ground. Here’s a clip of his pitch from the table.
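To make the difference between the two rankings concrete, here is a small Python sketch that sorts called strikes by how far below sz_bot they crossed the plate. It is not the Oracle query behind the table. The rows are made up, and the description column with its called_strike and ball values is an assumption about how the call is stored.

```python
def inches_below_zone(plate_z, sz_bot):
    """Distance below the strike zone bottom in inches (negative if above it)."""
    return (sz_bot - plate_z) * 12.0


# Made-up rows for illustration only; heights are in feet.
pitches = [
    {"batter": "Batter A", "plate_z": 0.95, "sz_bot": 1.45, "description": "called_strike"},
    {"batter": "Batter B", "plate_z": 1.40, "sz_bot": 2.10, "description": "called_strike"},
    {"batter": "Batter C", "plate_z": 1.20, "sz_bot": 1.50, "description": "ball"},
    {"batter": "Batter D", "plate_z": 1.70, "sz_bot": 1.60, "description": "called_strike"},
]

called_strikes = [p for p in pitches if p["description"] == "called_strike"]
ranked = sorted(
    called_strikes,
    key=lambda p: inches_below_zone(p["plate_z"], p["sz_bot"]),
    reverse=True,  # furthest below the zone first
)

for p in ranked:
    print(p["batter"], round(inches_below_zone(p["plate_z"], p["sz_bot"]), 2))
```

In this made-up data, Batter A saw the lowest pitch in absolute terms, but Batter B tops the relative list because of a much higher sz_bot, which is the same effect that puts a tall hitter on the second list.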
So what about the opposite? What were the pitches closest to
down the middle that were called balls? I calculated this using the distance
formula, $\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$, between the location
of the pitch and the middle of the strike zone on that pitch (a plate_x of 0 and
the midpoint of sz_top and sz_bot).
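Here is that distance calculation as a minimal Python sketch, just to illustrate the formula outside of the database. The function name is hypothetical and the sample pitch is made up.

```python
import math


def distance_from_center_ft(plate_x, plate_z, sz_top, sz_bot):
    """Straight-line distance in feet from the pitch to the zone center.

    The center is taken as plate_x = 0 and the midpoint of sz_top and sz_bot.
    """
    center_x = 0.0
    center_z = (sz_top + sz_bot) / 2.0
    return math.hypot(plate_x - center_x, plate_z - center_z)


# Made-up pitch: 0.3 ft to one side and 0.4 ft above the zone center.
d_ft = distance_from_center_ft(plate_x=0.3, plate_z=2.9, sz_top=3.5, sz_bot=1.5)
print(round(d_ft * 12.0, 2))  # distance in inches
```

Multiplying by 12 converts the result to inches, the unit used when talking about the closest misses.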
The first two pitches really grab my attention—how could a
pitch less than three inches from down the middle get called a ball? Let’s watch
what they look like. Here is the first pitch and here is the second pitch. Yep,
both are right down the middle, but the catcher was certainly crossed up and
not expecting a curveball, causing him to frame it in a funny way.
Which batters had the highest bottom of the strike zone?
This one’s inspired by Aaron Judge.
You’ll notice most of these batters aren’t really batters;
they’re very tall pitchers taking their turn at bat. Also, when you picture a
pitcher batting, they always seem to be standing upright, just waiting for the
at-bat to get over with. Here is what the 6’7” T.J. Zeuch looks like up to bat.
Just as expected.
Lastly, let’s look at how the strike zone expands based on
the count. Foolish Baseball came out with a video recently analyzing Aaron
Judge’s strike zone (I swear I’m not copying him), and he uses a strike zone
running from -0.8 to 0.8 feet rather than -0.7083 to 0.7083 to account for
pitches that partially touch the strike zone, like I described at the beginning.
That adds about 0.1 feet to each side of the strike zone. It could have
been closer to the 0.25-foot width (3 inches) of a baseball, but I’ll use
0.1-foot additions in my calculations like he did. Similarly, I will add 0.1
feet to the bottom and top of the strike zone.
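The padded zone described above can be sketched in Python as a simple check. This is an illustration under the stated numbers (plate_x between -0.8 and 0.8, and 0.1 feet added above sz_top and below sz_bot), not the query I ran. The function and constant names are made up.

```python
ZONE_HALF_WIDTH_FT = 0.8  # padded horizontal limit taken from the text
VERTICAL_PAD_FT = 0.1     # margin added above sz_top and below sz_bot


def in_padded_zone(plate_x, plate_z, sz_top, sz_bot):
    """True if the pitch location falls inside the padded strike zone."""
    within_width = abs(plate_x) <= ZONE_HALF_WIDTH_FT
    within_height = (sz_bot - VERTICAL_PAD_FT) <= plate_z <= (sz_top + VERTICAL_PAD_FT)
    return within_width and within_height


# Made-up locations for a batter with sz_top = 3.5 and sz_bot = 1.5.
print(in_padded_zone(0.75, 2.5, 3.5, 1.5))  # inside the padded width
print(in_padded_zone(0.90, 2.5, 3.5, 1.5))  # too far outside
print(in_padded_zone(0.00, 1.3, 3.5, 1.5))  # too low even with the padding
```

Any taken pitch for which this returns False is what I would call a should-be ball.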
It’s always said that the strike zone changes on a 3-0 count
and an umpire will be more generous to give a pitcher a strike call by expanding
the zone a bit. Is this true? How do other counts behave?
We can see there’s a clear relationship between how hitter-friendly
a count is and how often a should-be ball is called a strike. The 3-0
count is by far the most generous to the pitcher. As a strike is added from 0-0
to 0-1 to 0-2, the percentage trends downward. It’s true: the strike zone that
umpires enforce depends on the count.
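For anyone who wants to try the same idea outside of a database, here is a rough Python sketch of a per-count calculation. It assumes the rate is the share of taken pitches outside the padded zone that were called strikes, which may not match my query exactly. The balls, strikes and description column names and values are assumptions, and the sample rows are made up.

```python
from collections import defaultdict

ZONE_HALF_WIDTH_FT = 0.8  # padded horizontal limit from the text
VERTICAL_PAD_FT = 0.1     # padding above sz_top and below sz_bot


def outside_padded_zone(p):
    """True if the pitch missed the padded zone, i.e. a should-be ball."""
    too_wide = abs(p["plate_x"]) > ZONE_HALF_WIDTH_FT
    too_low = p["plate_z"] < p["sz_bot"] - VERTICAL_PAD_FT
    too_high = p["plate_z"] > p["sz_top"] + VERTICAL_PAD_FT
    return too_wide or too_low or too_high


def called_strike_rate_by_count(pitches):
    """Share of taken should-be balls that were called strikes, per count."""
    totals = defaultdict(int)
    gifts = defaultdict(int)
    for p in pitches:
        if p["description"] not in ("ball", "called_strike"):
            continue  # skip swings and anything else the umpire did not call
        if not outside_padded_zone(p):
            continue  # only should-be balls are counted
        count = (p["balls"], p["strikes"])
        totals[count] += 1
        if p["description"] == "called_strike":
            gifts[count] += 1
    return {count: gifts[count] / totals[count] for count in totals}


def pitch(x, z, balls, strikes, description):
    """Build a made-up pitch row with a fixed 1.5 to 3.5 ft zone."""
    return {"plate_x": x, "plate_z": z, "sz_top": 3.5, "sz_bot": 1.5,
            "balls": balls, "strikes": strikes, "description": description}


sample = [
    pitch(1.0, 2.5, 3, 0, "called_strike"),    # outside, called a strike
    pitch(1.0, 2.5, 3, 0, "ball"),             # outside, called a ball
    pitch(0.0, 1.2, 0, 2, "ball"),             # too low, called a ball
    pitch(0.0, 1.2, 0, 2, "ball"),             # too low, called a ball
    pitch(0.0, 2.5, 0, 0, "called_strike"),    # in the zone, ignored
    pitch(1.0, 2.5, 3, 0, "swinging_strike"),  # a swing, ignored
]

rates = called_strike_rate_by_count(sample)
for count in sorted(rates):
    print(count, rates[count])
```

Counts with no should-be balls simply do not appear in the result, so there is no division by zero to guard against.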