An Analysis of Nearly 4 Million Pitches Shows Just How Many Mistakes Umpires Make
No doubt in my mind that balls and strikes will eventually be fully automated. The technology already exists, is being tested in some minor league parks and the writing is on the wall:
And this will continue as AI technology improves. All the calls that require extraordinary human perception (requiring the umps to be exceptionally gifted) will eventually be done far more efficiently by high-speed cameras, sensors and AI-based algorithms calling the shots.
Baseball is back, and fans can anticipate another season of amazing catches, overpowering pitching, tape-measure home runs - and, yes, controversial calls that lead to blow-ups between umpires and players.
Home plate umpires are at the heart of baseball; every single pitch can require a judgment call. Yet ask any fan or player, and they'll tell you that many of these calls are incorrect - errors that can affect strategy, statistics and even game outcomes.
Just how many mistakes are made?
Comprehensive umpire performance statistics are not readily known, tracked or made available. Major League Baseball doesn't seem interested in sharing the historical data.
Could it be because the numbers aren't flattering?
Luckily, every MLB pitch is tracked and made available - numbers then have to be accessed, downloaded, sorted and evaluated. This takes time and computing power. In a study with support from a team of Boston University graduate students, we closely analyzed how many balls get called strikes and vice versa. The accuracy of all home plate umpires was ranked, with age and experience taken into account.
While the human element of the game certainly adds color, our results show that it comes at a high cost: far too many mistakes.
Mining the Data
All 30 Major League Baseball stadiums are outfitted with triangulated tracking cameras that follow each baseball from the pitcher's hand until it crosses home plate. Ball location can be tracked up to 50 times during each pitch, and accuracy is said to have a margin of error of one inch. This information is used to evaluate players, but MLB doesn't share the results in a way that allows fans to easily evaluate the performance of umpires.
We analyzed nearly 4 million pitches over the course of the last 11 regular seasons. This data, which had been collected by MLB-owned Statcast and Pitch f/x, was sorted, formatted and superimposed on a standard strike zone map.
An example of balls and strikes superimposed over a strike zone from a 2010 game between the Boston Red Sox and Toronto Blue Jays. The red points were called strikes, and the green points were called balls. Credit: Pitch F/X.
Using this available technology, we measured ball and strike calls for accuracy. We then ranked the error rates for each active umpire, creating a "Bad Call Ratio." The higher the ratio, the worse the umpire.
The findings were troubling.
Botched calls and high error rates are rampant. Between 2008 and 2018, MLB home plate umpires made incorrect calls over 12 percent of the time. In the 2018 season, MLB umpires made 34,246 incorrect ball and strike calls for an average of 14 per game, or 1.6 per inning. In the 2018 season, 55 games - 2.2 percent of the total played - ended with an incorrect call.
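The per-game and per-inning figures follow from the season totals. Here is a minimal sketch of that arithmetic, assuming a full 2,430-game regular season and nine innings per game; the article states neither number, so both are assumptions.

```python
# Figures quoted in the article for the 2018 season.
incorrect_calls = 34_246
games_ending_on_bad_call = 55

# Assumptions, not from the article: a full 30-team schedule and
# regulation nine-inning games.
GAMES_PER_SEASON = 2_430
INNINGS_PER_GAME = 9

per_game = incorrect_calls / GAMES_PER_SEASON
per_inning = per_game / INNINGS_PER_GAME
share_of_games = games_ending_on_bad_call / GAMES_PER_SEASON

print(f"{per_game:.1f} missed calls per game")
print(f"{per_inning:.1f} missed calls per inning")
print(f"{share_of_games:.1%} of games ended on a missed call")
```

Under these assumptions the results come to about 14.1 per game and 1.6 per inning, matching the article. The share of games comes to roughly 2.3 percent, slightly above the quoted 2.2 percent, so the article may have truncated the figure or used a different game count.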
When batters had two strikes, the error rate for all umpires increased - incorrect calls happened 29 percent of the time, almost double the error rate when the batter had one or no strikes.
We also found that the highest error rates did not come from younger, less experienced umpires; they came from the older, veteran umpires. The average MLB umpire is 46 years old, with 13 years of experience. But the top performers between 2008 and 2018 had an average age of 33 years old and had less than three years of experience at the big league level. Like professional baseball players, professional umpires seem to peak at a certain age.
Despite years of data-driven evidence, MLB has notoriously resisted retiring poorly performing umpires and hiring better-performing ones. The league remains top heavy with aging umpires, making it difficult for fresh new talent to make an impact.
Umpires Can Still Play a Role
For all of the ways MLB has incorporated technology into the game - the radar gun, instant replay, pitch graphics, Doppler radar - the league has resisted deploying this technology to assist with calling balls and strikes.
Umpires continue to call balls and strikes like they did a century ago when Babe Ruth played.
I'm not proposing that baseball bring in robots and fire the umpires; baseball has too many one-off situations and complexities to assume a bot could replace an umpire. But MLB does have a unique opportunity to use existing technology and strengthen human-software collaboration so umpires can do a better job.
Atlanta Braves outfielder Ender Inciarte argues with home plate umpire Doug Eddings after striking out against the Arizona Diamondbacks. Credit: AP Photo / Ralph Freso.
Umpires could easily be fitted with ear pieces connecting them to a control center that conveys real-time ball and strike information. These tech-assisted umpires could then make calls correctly, quickly and effortlessly. Time-honored and much beloved behind-the-plate signs, signals and sounds would still exist. And umpires could remain the final arbiter, having override ability under certain circumstances, such as if a ball hits the ground before crossing the plate or if a system outage occurs.
Strong recruiting, hiring and retention of superior performing umpires coupled with tech aids would reduce error rates and also help dampen biased pitch calling. Strike zone subjectivity would be minimized, allowing batters and pitchers to focus more on their craft and less on guessing a specific umpire's strike zone quirks. It would also reduce conflict between teams and umpires. And imagine how much the player and fan experience would improve if more than 34,000 annual incorrect calls vanished.
Mark T. Williams is the James E. Freeman Lecturer in Management at Boston University.
I like having umpires because there are many judgment calls in baseball. But there are also quite a few areas where human senses are simply no match for existing technology (and the technology will keep improving). One obvious example is the inability of an umpire to see a play with 360 degrees visual sense or to hear the sounds of ball in glove, foot on bag, etc. as well as directed sensors designed for that task.
They should have a team working home plate, one who stands there just like today and announces the call and another in a room watching the cameras and sensors who wirelessly tells the other what the call will be. I never trusted Umps because they are sports nuts and sports nuts can never be 100% impartial even if they don't realize it themselves. With a system like this there'd be no excuse for bad pitching calls and there'd be a paper trail to keep them honest.
No need for people in another room watching, that function would be performed by the ABS technology.
The home plate umpire certainly has more to do than simply call balls and strikes. So having the umpire behind home plate still makes sense. Not only can he handle normal play calls, he can announce the balls and strikes as before with all the gestures and theatrics. The only difference is that he is not calling strikes, just announcing them.
Of course he would still make strike calls in situations where the batter makes a swing (or a failed held back swing) at a ball, calls foul tips, etc.
I have seen this in professional games as well as in many games I've played in various adult leagues, and in the high school and Babe Ruth Leagues where I have coached. I am convinced that the reason is that umpires are just itching to "ring you up", and the theatrical show that goes along with it. I'll bet the incorrect calls with 2 strikes on the batter lean numerically heavily toward a called strike that wasn't in fact a strike.
The most egregious bad call in the modern era that I recall was Jim Joyce's "safe" call at first base robbing Tigers pitcher Armando Galarraga of a perfect game on what would have been the final play of that game. Galarraga would have been the first Tiger in franchise history to throw a perfect game.
So not all mistakes are made by the plate umpire. Joyce after seeing the film in the locker room admitted that he "kicked the shit out of that call" .
Afterward, Galarraga much to his credit simply said,
Really cool moment during the next game when Galarraga actually brought the lineup card out to Joyce who was the plate umpire for that game.
Another obvious place for sensor technology. Coordinating ball touching glove with foot touching bag on first base is clearly better with technology. The camera replays alone prove this.
Indeed. I think this play, among others, is what tipped the scale in favor of expanded video review in 2014.
That'd be a complex system, as the first baseman is also putting pressure on the bag. He may come off of the bag to make a catch. Maybe he catches the ball down the line, and tags the runner. Maybe he juggles it, and doesn't have control when the runner crosses.
I think we have trod this ground before TiG. I'm a bit of a romantic when it comes to this. They may, in fact, come up with a foolproof system...I'm out. Umpires are, in my opinion, as much a fixture of the game as home plate, the mound, foul poles, etc. Part of the beauty of the game is, again to me, the different looks each individual ump may give you.
34,246 sounds astronomical. A full season of baseball includes 2,430 games. So, according to their numbers, the umps missed 14 calls per game. Did they give the pitch count? Average game is around 300 pitches. That's 729,000 pitches per year. Assuming 2018 saw 300 pitches per game, that's about 4.7% missed, meaning they got it right over 95% of the time. To me, instead of bitching, the players should be singing praises. Missing less than 5% of the calls seems pretty damn good. Even the tracker above looks egregious, but the ones around the box are 1-2" out. I assume they are showing center of the ball. Diameter of a ball is roughly 2.9". I'm not sure the ones around the box didn't catch the plate. As to the outliers...yeah, those would be out by about 7". Clearly bad calls, assuming those are not breaking balls marked at the point the catcher received the ball. Either way, the thing I glean from the chart is this. Don't leave it in the ump's hands, especially if you are a Blue Jay, playing in Boston.
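The estimate above can be reproduced in a few lines. This is only a sketch of the commenter's arithmetic; the 300 pitches per game is the comment's own assumption, and it counts every pitch thrown rather than only the pitches an umpire had to call.

```python
# Figures from the article and the comment above.
incorrect_calls = 34_246
games = 2_430

# Assumption from the comment: about 300 pitches per game, counting
# every pitch (swings and balls in play included), not only called ones.
pitches_per_game = 300

total_pitches = games * pitches_per_game
missed_share = incorrect_calls / total_pitches

print(f"{total_pitches:,} pitches, {missed_share:.2%} missed")
```

The gap between this figure and the article's "over 12 percent" would be consistent with the article dividing by called pitches only, though the article does not say which denominator it used.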
If the automated system can weed out those breaking balls or pitches biting the corners that fool the umpire, with the help of a nice frame job by the catcher, imagine the impact on how the catcher might approach his job in terms of receiving the pitch. No more need to frame the pitch, nor "getting around the ball". All part of the traditions of the game.
That's one of my complaints, too. The plate is 17" across. Those little markers, are what? 1/2 an inch? You can have 99.9% of the ball miss the plate, but under the rules, that 0.1% makes it a strike.
Not sure why this is hanging you guys up. You are looking at a graph designed to show errors and oddly interpreting it as the actual position and size of the baseball.
My guess is that you, for whatever reason, think that the accuracy of a system monitoring the trajectory of a 2.9" diameter sphere relative to a three dimensional solid (the mathematical definition of the strike zone per batter) is somehow less than that of a human being armed only with a single visual perspective (and with human eyes) and in real time (70-100mph).
Why? The system is not translating a baseball into a ½" sphere, it is recognizing the actual baseball with its actual dimensions and following its precise trajectory relative to the exact specifications of the strike zone per the given batter.
I am sure you are not alone with that romantic view of the game. And I appreciate why you would feel that way.
To me, poor strike calls at the plate have been a pet peeve. Baseball has all sorts of randomness to make the game interesting. I, personally, do not like the idea that human error by umpires is a factor. If a pitcher is skilled enough to hit the corner of the strike zone then he should get the call. If the batter is skilled enough to hold off on a bad pitch, he should not be tagged with a strike due to human error on the part of the umpire.
To me it is ridiculous that pitchers and batters need to dynamically adjust their strike zones (and their strategy) because of human error by the plate umpire.
I like the beauty of a ball hit over the wall and within the foul line posts being an indisputable home run. It is determined by physics, not by human judgment. Similarly, if a basketball drops through the hoop, there is no question as to what took place. The games should be won or lost by the play of the players (and the strategy of the manager) and not by errant judgment calls of those who are there to enforce the rules. If the rules can be enforced by physics or by nearly infallible sensor-ridden automation then I am all for it.
No that's not it. A half inch triangle does not accurately reveal the location of a 3 inch sphere. It can't. They are different things.
If the triangle is the center, then it can be an inch off the strike zone and it appears to be a ball. But if the baseball's center were only an inch out of the strike zone, it would be a strike.
No, that's not what I said. I made two points relative to this.
First, the humans may not be as inaccurate as the computer indicates they are (for the reasons I just stated).
Second, any computer task only functions as well as its programming. If human beings can disagree about the precise edges of a strike zone (and they do) then any program directing a computer to monitor the zone is entirely subjective. That makes it no more useful than just having people do the job.
Correct. Not sure I can explain this any clearer than what I already wrote. You are looking at a graph! Why on Earth would you presume that the actual working system is determining strikes and balls with a ½ inch object (more like 1½ anyway) instead of a sphere of the exact dimensions of a baseball? Ask yourself why any engineer would do that. Makes zero sense. There is no advantage to such a practice; it is pure disadvantage.
You are basically arguing that the facts of the story are false; that the analysts are lying.
(see above)
Tacos!, all one need do is describe the strike zone mathematically, secure approval, and have an MLB-wide standard. The mathematical model of the strike zone can be a function of the physical properties of the batter (for low and high) and the physical properties of the plate (for the remaining solid geometry).
That would be fully sufficient for modern algorithms to detect when a regulation sized sphere (baseball) intersects the mathematical solid (3D geometry). Picking out the physical factors of the batter would be an application of A.I. (machine learning) that is well within modern capabilities. Plug that into the equation and it is all mathematics from there.
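To make the idea concrete, here is a minimal sketch of a sphere-versus-solid test. It is not how any real ABS system is known to work. It assumes the zone is a plain rectangular box over a 17" by 17" footprint (the real plate is a pentagon), a ball radius of about 1.45", and made-up zone heights for a hypothetical batter. All function names are invented for illustration.

```python
import math

BALL_RADIUS_IN = 1.45        # assumption: about half of a ~2.9" baseball
PLATE_HALF_WIDTH_IN = 8.5    # half of the 17" plate
PLATE_DEPTH_IN = 17.0        # front edge to back tip, treated as a rectangle

def make_zone(bottom_in, top_in):
    """Return the zone as (min_corner, max_corner) in inches.

    x: left/right of the plate centre, y: depth from the front edge,
    z: height above the ground.
    """
    return ((-PLATE_HALF_WIDTH_IN, 0.0, bottom_in),
            (PLATE_HALF_WIDTH_IN, PLATE_DEPTH_IN, top_in))

def sphere_touches_box(center, box, radius=BALL_RADIUS_IN):
    """True if a sphere at `center` overlaps or touches the box."""
    lo, hi = box
    # Closest point of the box to the ball's centre, axis by axis.
    closest = tuple(min(max(c, l), h) for c, l, h in zip(center, lo, hi))
    return math.dist(center, closest) <= radius

def is_strike(trajectory, box):
    """True if the ball touches the zone at any sampled position."""
    return any(sphere_touches_box(point, box) for point in trajectory)

# Hypothetical batter: zone from 18" to 42" above the ground.
zone = make_zone(bottom_in=18.0, top_in=42.0)
print(is_strike([(0.0, 0.0, 43.0)], zone))  # centre 1" above the zone
print(is_strike([(0.0, 0.0, 44.0)], zone))  # centre 2" above the zone
```

With these numbers the first pitch still clips the top of the zone and the second does not. The batter-specific heights are the part that would have to come from measurement or a model.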
I neither expect nor think that it's your place to explain it, nor am I convinced you are qualified to do so. You didn't design the machines or the software driving them.
Because that is how the data is presented. There are no grounds for making assumptions beyond that.
Maybe because an engineer isn't a baseball umpire. I don't know. Having a background in science doesn't make one an expert in all things.
It would be nice to ask the people who designed the system. You are not those people, so you don't need to keep trying to stand in for them.
Yep. That's the problem.
One need not be the designer to note what would be a ridiculous design. Honestly Tacos!, do you actually believe that the designers of an ABS system would model the projectile as a golf ball rather than a regulation baseball? How, in any circumstance, does that seem plausible to you?
It is a picture. The article gives you the metrics. Why not go with the stated metrics (no ambiguity in what they say) instead of trying to argue that the analysts could not even interpret their own data based on your armchair interpolation of the size of icons in a graph?
I do not think you understood my question.
Uh, yes, true. Non sequitur, but true.
If you were an automotive engineer and someone claimed that they designed a pickup truck with a 624cc two cylinder engine, would you not point out the absurdity? The ABS designers know that MLB pitchers throw baseballs, not golf balls.
Replying with some kind of "no true scotsman" argument does not address my objections.
You mean I should just accept on their word that their test measures what they claim it measures? I shouldn't challenge it?
It's not a non sequitur at all. Analogy: A weather computer can't predict the weather if the computer programmer doesn't know all the factors that drive weather and how to weigh them. He might be a great programmer, and his computer might work perfectly, but that doesn't matter if he doesn't know anything about weather.
The proof of that analogy is that weather predictions are wrong all the time and prediction programs are being updated all the time. Because they are fallible.
No one with any integrity would have the brass to come out and claim that they finally created the perfect weather predicting program.
Similarly, we have no information here about the knowledge, thought, and experience that went into this experiment. Critically, just because somebody says his machine can measure balls and strikes perfectly, that doesn't mean it can actually do that.
You have no idea what they know. But I can see that their data plot shows a bunch of little objects about the size of a thumbnail and also the totally wrong shape. So even if they actually do have the knowledge you claim (without evidence) that they have, their work product does not reflect such knowledge.
No, it's not showing objects. It's showing data points. Those data points are not meant to resemble a baseball; they are merely meant to represent where the baseball was within (or not, as the case may be) the strike zone. In fact, they were purposely made to have distinct shapes and colors for each team, presumably so that it was easy to see that calls for each team were similarly inaccurate - no agenda on the ump's part.
Tacos! I think you are just arguing for the sake of argument now. It is not a 'No True Scotsman' fallacy to observe that engineers who are creating an ABS system would gauge it for a baseball sized projectile. To not accept that obvious reasoning is just plain silly.
You are basing your challenge on a sample graph. This analysis is based on ~4 million pitches. If you suspect they do not know how to interpret their own data then I suggest you research this and get some hard facts. Your entire argument is based on the size of markers in a graph. I understand why this graph might cause you to be suspicious since it does not show 3" white circles but rather information-bearing symbols, but you have leaped from suspicion to a conclusion of potential gross incompetence on the part of the analysts. I find your leap to be unjustified and your conclusion to be highly unlikely since it calls for the analysts to be complete fools or liars.
Yeah it is. I have never suggested that scientists know everything. Your comment has nothing to do with this 'debate'.
We are talking about the software engineers and the analysts knowing that a baseball is an ~3" diameter sphere. To presume this fact escaped them is ridiculous. Arguing for the sake of argument.
True, but most people would grant that the manufacturer and the analysts of this ABS technology are smart enough to understand the size of a baseball and how the machine detects its intersection with the strike zone. If you have cause to doubt this then you need to do some research and get more data. You are just throwing up your hands because they did not show little circles on a chart. Again, your objection is frivolous given its basis.
So if you think these folks tracked ~4,000,000 pitches with modern ABS technology and then failed to comprehend the size of a baseball, I recommend you get better supporting facts than what you have presented.
LOL, I have utmost confidence that these people know the difference between a baseball and a golf ball and that they knew they were testing ~4,000,000 pitches of a baseball.
I think I will bookmark this exchange as a perfect example of arguing for the sake of argument (to the point of absurdity).
I understand what the intent was, but the method is incapable of doing that.
What is the 'method' as you understand it? What do you imagine these folks did to test ~4,000,000 baseball pitches?
"Hey Joe, how big is a baseball?"
"I don't know, a couple of inches"
"Okay, well I guess I will gauge the projectile to 1½ inches"
"Sounds good to me, the smaller the better, right?"
I don't think the graph was designed to show errors; rather, it was designed to show location. I did this when I pitched in college, only I didn't have the aid of computers. We marked every pitch, noting type, location, result, etc. That's all this is, information.
I'm not hanging up on it. If I am hanging up on anything, it is the inclusion of the Pitch f/x chart in an article claiming pro umpires suck, and for the purpose of providing a visual example of why they suck. It doesn't work, for me, quite as well as the author had hoped, because I understand the strike zone is not a rigid box the entire ball must pass through. The chart depicts a rigid box, and the author is suggesting that anything bordering on or outside the box was clearly incorrectly called a strike. That is my hang-up.
I don't disagree with that. I disagree that what you said above is what is being depicted by the pitch f/x chart inserted into the article, because it is not.
Restating the no true scotsman fallacy doesn't make it go away. You keep trying to tell me what engineers "would do." That is not a fact. It's not evidence. It's just your faith in engineers.
Based on facts or on your faith in engineers? I have yet to see a fact supporting your defense of the graphic.
No, we are talking about whether or not they applied that knowledge. So far, there is no evidence that they did. In fact, the evidence indicates they did not apply that knowledge.
I don't presume it. I question it based on the evidence.
No such thing is going on from my end. I am challenging the presentation and doing so through an analysis of the specific qualities of that presentation.
You have presented no justification for that opinion beyond the no true scotsman fallacy I have already cited.
I don't know what they did or comprehended. I have expressed my qualms with the way data is presented and its resultant utility.
Again, that doesn't matter if they don't apply that knowledge in a way that is useful to the inquiry.
This line of commentary from you is a personal attack and meta - something you have a history of accusing others of doing. You want to mock me with Monty Python and accuse me of arguing just to argue but you have no justification for that. I gave you real reasons based on facts for my challenge. All you have in return is "Ask yourself why any engineer would do that" and "you are just arguing for the sake of argument"
I urge you to retract it and apologize and get back to the content of comments being made.
Exactly. 100% correct.
I will respond only to serious, respectful comments. This latest from you does not qualify. You are becoming more and more personal throughout this seed simply because a couple of people didn't blindly accept the data presentation and conclusions.
Do you think that the analysts ignored the fact that the baseball is ~3" in diameter and instead presumed a baseball is 1½" in diameter (which is what those icons seem to be based on the scale)?
I give them the benefit of the doubt. I strongly suspect they understand the size of a baseball and understand the technical concept of a strike.
We do not know how they actually counted correct balls and strikes. I suspect they picked a method that gives strong support to their conclusion knowing that if their method was flawed that their entire study would crash and burn upon scrutiny. Erring on the side of caution is prudent in something this straightforward.
That is, one can assume that the analysts were complete dolts who wasted the analysis of ~4 million pitches or one can grant the benefit of the doubt that they understood the concept of a ~3" baseball hitting the strike zone and erred on the side of caution.
That's my problem with it. If they want to fuck with something that involves judgment, automate the officiating in the NBA.
More arguing for the sake of argument. We are done.
Your comments presume the analysts do not comprehend the size of a baseball passing through a strike zone. You have nothing to back that up other than pointing to symbols on a chart and presuming the authors intended those symbols to represent the dimensions of a baseball.
It is ridiculous to presume the analysts do not understand the dimensions of a baseball and when that baseball intersects the strike zone.
If you have details on the specific method they used to deem a bad call then let's see it. The details of their specific method are not described in this article or in supporting articles. The most specific factors I have found are:
So on no information you presume the analysts in this study are incompetent or liars. Given they conducted an analysis of 4,000,000 pitches with MLB data and MLB equipment, I am inclined to not presume they are incompetent or liars until I have a reason to do so. Given this study was made for public consumption, the analysts are predisposed to err on the side of caution rather than be publicly humiliated.
Transyferous Rex objects to a fixed strike zone. And that is a valid point since the strike zone height and vertical position vary per player. We do not know how or if they accounted for that. We do not know what error tolerances they built into their analysis. If you have some information on these factors then present it. But the case you are making based on the information you have cited does not hold water.
Bottom line, unless we have information that shows these analysts incompetent, it is likely that they are competent and comprehend the size of a baseball intersecting an MLB strike zone. It is unlikely that they would engage in a 4 million pitch study and publicly provide damaging stats if their underlying method crumbles under scrutiny. And they certainly would know that people will challenge their method. Look at the noise you have made from a single graphic.
First and last warning on off-topic snark.
I think Mark T. Williams grabbed a chart from pitch f/x and is using it as an example of egregious calls, and likely one he thought was the best for that purpose. I don't know who Williams is. I wouldn't be surprised if Williams is unaware of what the rules of baseball are though, nor would I be surprised if he and his gang were ignoring pesky details such as the diameter of a baseball, when viewing a chart such as this.
The icon size is irrelevant. If the icons do not represent the center mass of the ball, then the chart is worthless. The icon size was chosen, most likely, for a simple reason...clarity. If the icons were proportionate in size to a regulation ball, then the individual pitches would be indistinguishable in areas.
So, if we are discussing ball size, any ball pitched, the center mass of which is 11.25" or closer to the centerline of the plate, crosses the plate for the purpose of a strike. If the chart shows center mass, then there are roughly 10-11 pitches outside the box that were likely correctly called strikes.
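The horizontal cutoff can be worked out from the figures quoted earlier in the thread. This is a rough sketch that takes the 17" plate width and the roughly 2.9" ball diameter as its only inputs and treats the chart markers as the ball's centre.

```python
PLATE_WIDTH_IN = 17.0    # plate width quoted earlier in the thread
BALL_DIAMETER_IN = 2.9   # approximate diameter quoted earlier

# A ball clips the plate when its centre is within half the plate
# width plus one ball radius of the centreline.
cutoff_in = PLATE_WIDTH_IN / 2 + BALL_DIAMETER_IN / 2

def clips_plate(center_offset_in):
    """True if a ball centred this far from the centreline touches the plate."""
    return abs(center_offset_in) <= cutoff_in

print(f"cutoff: {cutoff_in:.2f} in from the centreline")
print(clips_plate(9.5), clips_plate(10.5))
```

With those inputs the cutoff comes to about 9.95" rather than 11.25", so how many of the borderline markers count as correct strikes depends on which figure is used.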
LOL, damn I guess it does sound like that. I certainly object to a determination that 34K+ pitches were incorrectly called, if that is the chart system they are using, and with no explanation as to whether or not their analysis included any consideration as to the diameter of the baseball. You have to admit TiG, that that information is not provided. I'd like to assume they know, understand, and accounted for that, but when a chart is included to show how bad the calls are, and I see balls that likely caught the corner...I have to wonder.
I'm having too much fun with this.
That is just one chart out of a data set of ~4,000,000 pitches.
Well I would be quite surprised if Williams and his team did not comprehend the dimensions of a baseball in a study of strike and ball counts on MLB pitches.
Exactly! They logically represent the center of the ball. What matters now is how that ball (realized as a regulation baseball) intersects with a 3D solid strike zone.
And we are back to the specific method used to determine if the sphere of the baseball intersects the strike zone. I cannot find those details. Until we have these details we should not presume that the analysts are incompetent or liars. As I have noted, it does not make logical sense for a team of analysts to engage in a 4 million pitch study, publish the results publicly, only to have their entire study discredited because they did not recognize the baseball as an ~3 inch sphere. Just does not pass the smell test.
That said, I also suspect the strike zone is fixed in height and position based on how Statcast works. And that would be something that would need to be factored in as an error tolerance (or addressed by a secondary source of data). So I am with you on the strike zone vertical positioning but I would need some real evidence to convince me that the analysts stupidly did not use the actual size of a baseball in their analysis.
I have been stating that the information on the specific method is not provided and suggesting that one not presume incompetence or dishonesty in a study of 4 million pitches that is intended to be made publicly available and thus subjected to scrutiny.
I tend to be cynical TiG. What is the motivation here? You are correct. It doesn't make sense to ignore the information that is not provided.
Williams also alleges that:
Trouble is, the chart he incorporates in his articles is not his, it is from pitch f/x. At least tell the reader "just look at the chart provided by Pitch f/x." He didn't do that though. He claims that he and his gang superimposed the data on a map, which assertion is immediately followed by a chart prepared by... not his team. To his semi-credit, Williams credits the fact underneath the chart. To me, that's dishonest, because he knows people will not generally read the fine print.
So, where is the standard strike zone map showing the data they collected and superimposed? I have a hard time believing the whole of a thing, when the smallest of things appears to be farcical. That is part of the reason I'm inclined to believe that Williams, and crew, looked at 4 million pitches, represented on multiple charts, prepared by someone else, and called everything outside the "strike zone" a bad call.
There are red triangles (called strikes) on the left and right side of the strike zone. If they represent the center of the ball, they may well have been strikes under the rules. If so, they were correctly called and should not be represented as mistakes.
There are a few green markers (called balls) within the zone. Those are labeled as mistakes by the umpire, but that isn't certain.
The reason is that another thing we don't know is where these measurements were taken. The data points imply the measurement is made at a single point - a slice of the strike zone. Is it the front of the zone? The middle? It's possible a pitch could be outside the zone at the front but cut into the zone near the back. Does the computer call this a ball or a strike? We just don't know.
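The front-versus-back question can be illustrated with a toy comparison. This sketch assumes a straight-line path across the plate, a rectangular 17" by 17" footprint (the real plate narrows to a point at the back), and a made-up pitch. It says nothing about which rule the actual tracking system applies.

```python
PLATE_HALF_WIDTH_IN = 8.5
PLATE_DEPTH_IN = 17.0    # assumption: full width kept all the way back
BALL_RADIUS_IN = 1.45    # assumption: about half of a ~2.9" baseball

def offset_at_depth(front_offset_in, back_offset_in, depth_in):
    """Linearly interpolate the ball centre's sideways offset."""
    t = depth_in / PLATE_DEPTH_IN
    return front_offset_in + t * (back_offset_in - front_offset_in)

def clips(offset_in):
    """True if a ball centred at this offset touches the plate's width."""
    return abs(offset_in) <= PLATE_HALF_WIDTH_IN + BALL_RADIUS_IN

def call_at_front(front_offset_in, back_offset_in):
    """Judge the pitch only where it crosses the front edge."""
    return clips(offset_at_depth(front_offset_in, back_offset_in, 0.0))

def call_any_depth(front_offset_in, back_offset_in, steps=17):
    """Judge the pitch at evenly spaced depths from front to back."""
    return any(
        clips(offset_at_depth(front_offset_in, back_offset_in,
                              PLATE_DEPTH_IN * i / steps))
        for i in range(steps + 1)
    )

# Hypothetical pitch: 10.5" off centre at the front, 9.5" at the back.
print(call_at_front(10.5, 9.5))   # front-plane rule
print(call_any_depth(10.5, 9.5))  # whole-depth rule
```

For this made-up pitch the two rules disagree: the front-plane rule says ball and the whole-depth rule says strike. That is why the missing detail matters.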
You keep trying to argue this strawman. I have expressed no conclusions about what the analysts understand. I have addressed the presentation of the data and the conclusions declared based on that data. That is all.
I tend to be skeptical. So I go by the evidence. When we do not have information, I try to not presume. In this case, sans evidence to the contrary, I am going to give the researchers the benefit of the doubt since they are naturally motivated to not embarrass themselves. Mark Williams in particular has a professional reputation that he likely is not willing to just toss away by a brain-dead analysis that interprets baseballs as if they were golf balls. While it is possible that Williams et al. are clueless about baseball, I find that to be unlikely.
I suspect the researchers were prompted to do this because ball and strike calls have been deemed errant based on available technology and by the comments of announcers in real time. If they approached this as professional researchers, they would be very careful of their data and would take measures to ensure that their conclusions are well justified against the scrutiny that would follow.
From the more detailed source I cited earlier:
This description, although not specific enough for us to scrutinize their method, reads as I would expect. This is what I would expect of a team of competent researchers.
From the article, in the caption under the picture, showing the 'semi-credit':
Sorry man, this is just too nit-picky for me. I go by the facts stated in the article (and in the supporting article). The inclusion of a single chart that no doubt was meant to help visualize the concept is not the hard data. It is the chart of a single ump in a single game ... a fraction of the 4,000,000 pitches. Questioning the veracity of the researcher and his team simply because you feel the stated credit should have been more prominent in this article is not giving me any pause.
Given this is an article, we would need to see the actual research report for that level of detail. I have not found it.
The smallest of things being this one chart for illustration? I am looking instead at the big picture. It seems entirely foolish for a prominent researcher to try to float such public bullshit. While it is possible that these researchers are incompetent boobs, I have no evidence supporting such an assessment and have yet to find any research that counters theirs. Further, the continued progress towards ABS is strong evidence that the MLB sees a reason for ABS. They are not dismissing the idea, they are embracing it. What does that tell you?
And when you find the details let us know. Until then you are presuming the researchers are incompetent based on an illustration while ignoring the balance of the article (and the common sense notion that these researchers will naturally be inclined to avoid public humiliation by publishing bullshit).
I have understood your point on the symbolic markers on the chart from the beginning. Repeating the same point accomplishes nothing. You are correct that we do not know the specifics of their method. So why do you presume their method would view a baseball as if it were a golfball? Why presume such incredible incompetence? I find that to be bizarre sans any supporting evidence.
As you can probably surmise, I reject your presumption that the researchers did not analyze the projectile as a regulation baseball but rather more like a golf ball. Flat out.
But here are some questions that I think are quite legitimate to raise:
These, to me, are reasonable questions to pose.
But presuming the researchers viewed the projectile as more of a golf ball than a baseball (the subject of the research) based on an illustration is just silly.
You claim to express no conclusions about what the analysts understand, while in your next sentence you express a conclusion on what the analysts understand.
The article is speaking of the findings of the analysts. You are using the chart to question whether they properly accounted for the size of a baseball. Their understanding of a baseball and how it interacts with the strike zone is the foundation of the study.
Another strawman. I never said they did any such thing. I do wish you would respond to the actual content of comments made instead of whatever is going on in your own mind.
Good. You finally concede that my objections are valid. I have made all of those objections but for some reason you feel it necessary to misrepresent my comments saying exactly the things you just endorsed.
It gives me pause. It is not a matter of simply not prominently citing the source, it is a matter of setting it up to appear as if you created the sourced material. If a person is not forthcoming on something as innocuous as correctly identifying the source of a chart, it is my experience that they will be even more inclined to withhold accurate information on something of substance.
Maybe their description is acceptable to you, but the entire study is based on one thing, separating the correct from the incorrect calls, and they do not even attempt to explain their methodology for making such determination. They simply make the separation, and start handing out scores. I would expect competent researchers to specifically identify the methodology used for determinations made, which are the foundation of the entire study.
I clicked the linked article, which appears to be different from the link you provided above. Funny. Several readers asked similar questions on how they determined the accuracy of the calls.
Who is Williams? He is the founder of UmpScores, a performance app used to measure MLB umpire accuracy. Again, I'm cynical. What's the purpose? Here is a guy, trying to make money on claimed horrendous officiating in the major leagues, who apparently has failed, at multiple turns, to identify the methodology he is relying on to make such claims.
True, you compared their projectile (as you imagine it) to something more like a marble (½ inch):
First of all they are indeed markers on a chart and not pictures of a baseball. Second, I was being generous scaling them up to a golf ball (since that is more like the size of the actual markers in the chart). Your ½ inch projectile would require the researchers to be even more inexplicably incompetent.
Wow.
You just now raised those questions for the first time in your last comment @ 3.1.38. Look at your comment history; nowhere have you raised these points earlier and nowhere have I objected to them (since you never raised them):
Look at your comment above and show me where you raise a point other than your ½ inch object complaint (disregarding your meta commentary). You consistently complained of your perceived ½ inch markers not properly representing a 3" sphere. Your argument has been that researchers probably drew false conclusions because they interpreted what you called a ½ inch marker as if it were a 3" sphere. As if the researchers were physically looking at illustrative charts for 4,000,000 pitches.
Now, finally, @ 3.1.38 you for the very first time offer the following decent questions that have nothing whatsoever to do with a ½ inch projectile but about the concept of an actual baseball interacting with the strike zone:
Note that I did not object to the questions you raised. Instead, I acknowledged with a superset of questions that I think would be fair.
You refuse to acknowledge a point of agreement on new factors and instead choose to spin this as if I have conceded the argument you have been making about your perceived ½ inch projectile.
Just amazing.
There is too much personal meta in your comment and so - as I warned you earlier - I will not respond to it.
So because their methodology is not detailed in this article we should presume they are incompetent?
Me too, in their actual published report. Which, unfortunately, I cannot locate.
And asking about methodology is certainly legit. Presuming incompetence (or dishonesty) is another thing entirely.
Well apparently the MLB is not so cynical since they seem to be moving ABS into the majors.
LOL. Of course not. Your own comments refute your claim.
And quoting your comments with editorials is not meta nor is it personal. It is directly refuting your claim.
More meta. Personal.
I said I tend to be cynical. I did not say that I read the article assuming incompetence or dishonesty.
I'm not a fan of an electronic ump, that's no secret. Problem is, nothing is presented here, apart from conclusions. I'm not a fan of the use of a 2D chart to bolster the claim that umpires suck. The chart raises questions, it doesn't answer them. What was the pitch type? At what point in flight was the location marked: front of plate, middle of plate, point received by catcher? LHP or RHP? That info makes a difference.
As an aside, we would have had a preview, had it not been for covid. We can revisit this next year. I'm betting that we will be reading or hearing about inconsistencies from park to park.
Any new technology will have people finding fault. It is the nature of the beast.
What I predict (which is what I said upfront) is that in the future we will have MLB-wide automated ball/strike calls and they will be found to be far more accurate than human umpires and that plate umpires will no longer make the calls but rather simply make the gestures for the crowd.
The plate umpire will still deal with all the complexities at the plate such as foul tips, tagging, partial swings, time, balks, etc. but will not try to make calls on fast moving projectiles when modern technology can do so with far better accuracy.
In short, I am not at all calling for removing umps from the game (not saying you are claiming that I am). My focus is on reducing the errors in rule enforcement from pathologically affecting the game. The errors (a key interesting dynamic) in baseball should come from the managers and the players, not from the umpires.
I think it's a complex issue. (Disclosure: I umpired amateur games for 18 years)
As for the umpires themselves, the internet tells me the average age of a major league umpire is 46. At that age, a significant portion of the human race starts losing visual acuity. Maybe not the best age to be calling balls and strikes. On the other hand there is something to be said for having umpires with a lot of experience at the major league level, so I wouldn't just get rid of middle aged umpires.
Any umpire can make a mistake whether they are young or old. An older umpire might have poorer vision, but he has a lot more experience tracking tough pitches. It would be interesting to see how these accuracy experiments track with umpire age and experience.
As for the data, I just don't understand how it applies to real world umpiring. Those little triangles and squares look to be less than an inch across. A baseball is almost 3 inches across.
Which part of the ball is represented by those tiny triangles and squares? The middle? The edge? It matters a lot. If any part of that ball - including just a bit of stitching - crosses over any part of the plate (including the black part), the pitch can be a strike. The strike zone exists in three physical dimensions, and its footprint has five sides, like the plate. It is rarely represented that way in these computer scans.
There are a lot of pitches thrown that are technically strikes, but are almost impossible to hit. A slider might clip the front outside part of the plate and immediately tail away from the batter. A changeup might nick the front black and fall to the ground. A high curve might pass well over the strike zone but break down just as it gets to the pointy back edge of the plate. Any one of those is probably unhittable for most batters - even pros - and will look like a ball to fans. I've seen batters beaned by strikes.
And then there is the vertical component of the strike zone, which is somewhat vaguely defined and largely subjective. The top is a horizontal line midway between the top of the shoulders and the top of the uniform pants as determined by the batter's stance as he is prepared to swing at a pitched ball. It's a terribly problematic definition. Shoulders are in motion and often not parallel to each other. The tops of pants are easy enough, although players might wear them differently. The stance is obviously individual. Some are very static; others are quite dynamic, with the batter in motion quite a bit.
The bottom of the zone is a point just below the kneecap. It sounds a little more exact, but it can be a hard thing to pick out through pants, especially as a batter strides.
All of this is to say that even with laser computer tracking, the programming might not be up to the task. The technology is interesting but seems to represent an ideal of perfect strike-calling that may not be possible given the rules as they currently exist.
In this case, the technology is far superior to human senses and there is no computational challenge detecting when a sphere crosses the boundaries of a defined mathematical solid (the strike zone).
From the article:
This is crazy bad but understandable. Nonetheless, we have the means to eliminate this controversial part of the game and I am all in favor of doing so.
We have had technology for years which can, for example, locate and read the license plate of an arbitrary car driving through a fixed point. Accurately determining strikes from balls (automated ball-strike or ABS technology) is well within our capabilities.
Modern sensors are substantially better than the human eye. A modern high-speed camera, for example, can capture motion at 250 frames per second. A 100 mph baseball is traveling at 1,760 inches per second. To a machine, the 100 mph baseball could be fully analyzed frame-by-frame if necessary. Current ABS systems use radar rather than cameras. The popular Trackman radar technology used to track golf drives accurately follows a 1.68" golf ball traveling at 200 mph with an effective frame rate of 2,000 frames per second.
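To put rough numbers on that comparison, here is a back-of-envelope sketch. The 250 and 2,000 frames-per-second rates are the figures quoted above; the 55-foot travel distance from release to the front of the plate and the constant speed (no drag) are my own simplifying assumptions, and the function names are just for illustration.

```python
# Unit conversion: 1 mph = 5280 ft * 12 in / 3600 s = 17.6 inches per second.
MPH_TO_INCHES_PER_SEC = 5280 * 12 / 3600


def inches_per_frame(speed_mph, frames_per_sec):
    """Distance the ball travels between two consecutive frames."""
    return speed_mph * MPH_TO_INCHES_PER_SEC / frames_per_sec


def frames_in_flight(speed_mph, frames_per_sec, distance_ft=55.0):
    """Frames captured while the ball covers distance_ft.

    Assumes constant speed; a real pitch slows down in flight, so this
    slightly undercounts.
    """
    flight_time_sec = distance_ft * 12 / (speed_mph * MPH_TO_INCHES_PER_SEC)
    return flight_time_sec * frames_per_sec


if __name__ == "__main__":
    for fps in (250, 2000):
        print(fps, "fps:",
              round(inches_per_frame(100, fps), 2), "inches per frame,",
              round(frames_in_flight(100, fps)), "frames per pitch")
```

Under those assumptions a 100 mph pitch moves about 7 inches between frames at 250 frames per second and under an inch at 2,000, so the higher rate gives several looks at the ball inside the depth of the plate alone.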
Imagine the accuracy if a human being had that ability. The machine, in this sense, sees the pitch in slow motion. It has way more data than a human could ever acquire and can compute the intersection of a sphere with a 3D strike zone far faster and more accurately than the human mind. This is simply not a case where the human mind and human sensors are better than a machine.
Define the three-dimensional space (cube or variant such as a 3D plate) which represents the strike zone in terms of the plate and the physical characteristics of the batter. The technology (sensors and software) will then accurately determine if a 3" sphere coming from a known source 60' away crosses the 3D solid's boundary.
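A minimal sketch of that check, to show it is not computationally exotic. This is my own illustration and not how Trackman or any ABS system actually works. Assumptions: the zone is modeled as a five-sided prism standing on the plate, the ball radius is roughly 1.45 inches, coordinates are in inches (x across the plate, y from the front edge toward the catcher, z above the ground), and the tracking system hands us a list of ball-center positions. The top and bottom heights per batter are inputs here; measuring them is a separate problem.

```python
import math

BALL_RADIUS_IN = 1.45  # approximate; a baseball is a little under 3 inches across

# Home plate outline, counter-clockwise, front edge along y = 0.
PLATE = [(-8.5, 0.0), (8.5, 0.0), (8.5, 8.5), (0.0, 17.0), (-8.5, 8.5)]


def _dist_to_segment(p, a, b):
    """Distance from point p to the line segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp to the segment's endpoints
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def dist_to_plate(x, y):
    """Horizontal distance from (x, y) to the plate outline; 0 if over the plate."""
    edges = list(zip(PLATE, PLATE[1:] + PLATE[:1]))
    # For a counter-clockwise convex polygon, an inside point lies to the
    # left of (or on) every edge.
    inside = all((bx - ax) * (y - ay) - (by - ay) * (x - ax) >= 0
                 for (ax, ay), (bx, by) in edges)
    if inside:
        return 0.0
    return min(_dist_to_segment((x, y), a, b) for a, b in edges)


def ball_touches_zone(center, z_bottom, z_top, radius=BALL_RADIUS_IN):
    """True if a ball centered at (x, y, z) overlaps the prism at all."""
    x, y, z = center
    horizontal = dist_to_plate(x, y)
    vertical = max(z_bottom - z, 0.0, z - z_top)
    return math.hypot(horizontal, vertical) <= radius


def is_strike(trajectory, z_bottom, z_top):
    """True if any sampled ball position touches the zone."""
    return any(ball_touches_zone(c, z_bottom, z_top) for c in trajectory)
```

For example, with a hypothetical zone from 18 to 42 inches, `is_strike([(10.5, 0.0, 30.0), (9.5, 8.5, 30.0)], 18, 42)` comes out True: the ball misses at the front edge but clips the side of the plate farther back. A real system would need samples dense enough (or interpolation between them) that contact is not missed between two positions.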
Finally, I am not making a case for the current technology under experiment (i.e. most notably Trackman for ABS). I am talking about technology in general. That is, I am arguing that we have the technological means today, if we wanted to, to produce an automated system for accurately deeming strikes (and thus balls) that would be substantially superior to human abilities. One that is superior even to Trackman. The human psychological resistance, minor techy problems, etc. will be addressed. We will in the near future have an ABS system that will virtually eliminate bad strike/ball calls and leave the umpires to focus on more complex calls such as balks, foul tips, partial swings, tag outs, etc.
This, I think, is Taco's point: the rules for defining the strike zone for each batter are somewhat nebulous already. So even if there were technology that could size up the batter and then render an appropriate strike zone, the zone would need to be better defined in the programming of the machine. Differences in stances, loading and striding would all need to be figured in. I'm not sure if the current technology used by broadcasters even accounts for the dimensions of the batter in the rendering of the strike zone. Looks like the TrackMan ABS technology does though.
This is not an issue. If the human eye can determine the knee then so can image processing technology. This is an application of machine learning. For example, a system can be trained to identify the body markers by observing batters in film and live. If a human can learn the key measurement spots then so can an A.I. algorithm.
This problem is less complex than, say, extant facial recognition technology available in your smartphone.
Kudos to the pitchers who can do that. They deserve to have the strikes they've earned called as strikes.
I think being an umpire is a lot like being a moderator.
True story: a couple of years ago I went with a friend to his GF's son's Little League game. The ump made what everyone considered a bad call, and the crowd started heckling him. One of the jeers was that the ump was blind and needed glasses. The ump whips off his face protector, puts on some glasses and turns around. I kid you not, these glasses had lenses as thick as Coke bottle bottoms, and he yells back at the crowd that he was NOT blind. ALMOST, but not blind.
The stunned silence defined the whole moment, followed by howls of laughter... mine included.
How strange that you would see the similarity.
Judgement calls and attempting to ensure fair play. I can see it.
Yes the similarity is quite strong.
The reaction from 'the crowd' is the same too. The presumption that the umpire has an agenda or is incompetent.
Whether I thought an ump had an agenda when I was pitching depended on whether I had to throw it right down the middle to maybe get a strike called.
I think most people don't realize how hard umping can be. When calling balls and strikes in MLB, you've got a ball coming your way at anywhere from 85 to 100 mph, with distractions and other details demanding your attention, such as checked swings. Add some movement on the ball, and a catcher who's good at framing a pitch, and a near miss looks a lot like working the corners. And you don't have much time to make the decision, so you just make it.
I agree, it is pushing our human senses past their limits. To me it is no mystery why mistakes are made.
Nor to me, but I think a lot of people are convinced that they could do just as well, or better. To their minds, since it's such an easy job, the only reason for not being able to do it perfectly is bias.
Is it? (-:
Well, LOL, not me. I am amazed that the umps do as well as they do. Some of these plays are milliseconds either way.
Milliseconds and fractions of inches.
Well, I did not feel I needed a /s tag for something so obvious.
It's been a long day. My sarcasm detector is on the blink.
Time for a glass of wine.
I think most try to be fair. The problem is that the human senses are errant and calling strikes and balls pushes our senses to the limit ... thus even the best umpires trying to be fair do indeed make mistakes.
The problem isn't with the Umps. It's with those who manage the Umps. Until they enforce the standards nothing is going to change.
What then is your opinion of these wrong calls? Is it human error or too much leeway in defining the strike zone?
If there were a 100% perfectly defined strike zone, do you think human umpires could be precise enough to make strike calls close to 100% correct?
It's both but part of the game is the human error. It makes it unpredictable and gives opportunity to the underdog.
I do think that some of the umps who are notorious for their errors should be kicked to the curb.
I fully agree that a very big part of the game is human error. Without human error (and the possibility of human error), baseball would be boring. I just think the error element should be exclusively with the players and managers; umpire errors distort the game.