Offensive Friction: A Few Thoughts on Baseball Metrics Along with a Proposal

Baseball statistics (and related metrics) promise clarity. After all, why else would people go to the trouble of creating them?

Metrics offer great appeal; a long, messy season (or even career) gets compressed into a number. A player’s apparent value becomes comprehensible. For example, OPS gives us a quick offensive summary, wOBA improves the weighting of offensive events, and wRC+ places hitters on a clean scale, with 100 being the league average. Statcast data adds another layer, telling us not only what happened, but what probably should have happened.

Each step in this progression appears to be progress. But toward what, exactly?

The problem is not that all the available metrics are inadequate. The real issue is that they are often answering different questions.

A hitter can have strong results and weak underlying indicators. Another hitter can have excellent contact quality and disappointing production. A third hitter can look ordinary overall but deliver his best moments in the highest-leverage situations. A high BABIP, a favorable run of matchups, or a few well-timed home runs can elevate a fourth.

Which hitter is better, or at least more desirable? The answer depends on what we are trying to measure.

That is why the search for one perfect offensive statistic may be ill-advised. Baseball offense is not one thing. It is a collection of related but distinct realities: production, process, context, opposition, and sustainability.

The more interesting question may not be, “Which metric is best?” The better question may be: Where do the metrics disagree?

The first way to see this is to compare actual production with expected production. If every hitter’s season were in perfect statistical balance, the points would fall neatly along the diagonal line. They do not.

Figure 1. Data for 2026 through the end of April. R² ≈ 0.658

Figure 1 compares actual production with expected production by plotting wOBA against xwOBA for each hitter. The scatter reveals a range of divergence. Players above the line have outperformed their expected results, while those below it have produced less than their contact quality and plate appearances would suggest. Mickey Moniak, for instance, sits well above the line, indicating stronger outcomes than underlying indicators might predict. In contrast, Ketel Marte and Jake Cronenworth fall below it, suggesting that their process may be better than their results to this point. The figure does not resolve which measure is more meaningful, but it makes visible the gap between them, which serves as the starting point for a proposal I will make later in the post.
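The comparison behind Figure 1 can be sketched in a few lines. This is a minimal illustration, not the code used for the figure: the hitter names and values are hypothetical, the helper names `production_gap` and `r_squared` are invented here, and it assumes the reported R² is the squared correlation between wOBA and xwOBA.

```python
# Hypothetical (wOBA, xwOBA) pairs for illustration; not real player data.
hitters = {
    "Hitter A": (0.410, 0.340),  # results ahead of contact quality
    "Hitter B": (0.300, 0.365),  # contact quality ahead of results
    "Hitter C": (0.350, 0.348),  # close to the diagonal
    "Hitter D": (0.280, 0.290),
}

def production_gap(woba, xwoba):
    """Positive values sit above the diagonal, negative values below it."""
    return woba - xwoba

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)

gaps = {name: round(production_gap(w, x), 3) for name, (w, x) in hitters.items()}
fit = r_squared([w for w, _ in hitters.values()], [x for _, x in hitters.values()])

for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {gap:+.3f}")
print(f"R^2 = {fit:.3f}")
```

Sorting by the gap puts the overperformers at the top and the underperformers at the bottom, which is the same reading as "above the line" and "below the line" in the scatter.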

 

The limits of a single number

 

One of the best offensive metrics in wide use is wRC+ because it does something very specific. It estimates a hitter’s total offensive production, adjusts for park and league factors, and places that production on an easy-to-read scale. A 120 wRC+ means a hitter has been 20 percent better than league average. A 90 wRC+ means he has been 10 percent worse than average.

That is useful and elegant. Perhaps more importantly, it is also intentionally incomplete.

wRC+ is not trying to tell us whether a hitter’s production is sustainable. It is not trying to tell us whether he has been lucky. It is not trying to tell us whether his best events came in the most important moments. It is not trying to measure the quality of the pitchers he faced in every plate appearance.

That is not a flaw. It is a design choice.

The trouble begins when we ask wRC+ to do more than it was built to do.

The same is true of expected statistics. xwOBA can tell us something about a hitter’s contact quality and plate appearances. It can suggest whether the underlying process supports his results. But xwOBA is not the same as actual value. A lineout with a high expected value may tell us something important about skill, but it did not move the runners. It did not change the scoreboard.

The expected value and the actual value are both real, but they are real in different ways: one describes the quality of the process, the other describes what reached the scoreboard.

This is where offensive analysis becomes much more interesting.

 

Production, process, and context

 

Consider three hitters.

The first hitter has a high wRC+, a high xwOBA, strong exit velocity, a reasonable BABIP, and a stable strikeout-to-walk profile. There is not much mystery here. The production and the process agree. His production most likely matches his ability.

The second hitter has a high wRC+ but a modest xwOBA. His BABIP is unusually high. His barrel rate is ordinary. His hard-hit rate is fine but not exceptional. The results are good, but the foundation is less convincing. He may still be a good and accomplished hitter, but the numbers are not speaking with one voice.

The third hitter has a poor batting average and mediocre production, but his xwOBA is strong. He hits the ball hard. His launch angle is improving. His walk rate is stable. His BABIP is low. This is the kind of player who may be better than his surface line suggests.

Analyzing the first hitter is straightforward; the real investigation begins with the second and third hitters. They are not noteworthy because one number tells us the answer. They are interesting because several numbers are arguing with each other.

That disagreement deserves to be measured.

 

Offensive Friction

 

I am calling this idea Offensive Friction (OFx).

Offensive Friction is not meant to replace wRC+, wOBA, OPS+, xwOBA, BABIP, or Statcast indicators. It is meant to sit beside them and mediate disputes.

Its purpose would be simple: Identify hitters whose offensive indicators disagree.

A low-friction hitter is easy to interpret. His production, expected production, contact quality, plate discipline, and luck indicators all point in roughly the same direction.

A high-friction hitter is harder to interpret. His numbers contain tension. One part of the profile says breakout. Another says regression. One part says unlucky. Another says limited. One part says star. Another says mirage. That tension is the signal.

In conceptual terms:

Offensive Friction = the variance among a hitter’s standardized offensive indicators

The inputs could include:

  • wRC+
  • xwOBA
  • BABIP
  • Barrel rate
  • Hard-hit rate
  • Average exit velocity
  • Launch angle
  • Walk rate
  • Strikeout rate
  • Chase rate
  • Context value

Each metric would be converted into a standardized score. Then we would measure how widely those scores spread apart.

A hitter whose scores cluster together would have low Offensive Friction.

A hitter whose scores scatter across the map would have high Offensive Friction.
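A minimal sketch of that calculation follows, using only a few of the inputs. Everything specific here is an assumption made for illustration: the four hitters and their values are hypothetical, the function names are invented, population variance is used as the spread measure, and strikeout rate is sign-flipped so that a higher standardized score always means "better."

```python
from statistics import mean, pstdev, pvariance

# Hypothetical raw indicators for a four-hitter "league"; not real data.
league = {
    "Hitter A": {"wrc_plus": 140, "xwoba": 0.380, "k_rate": 0.16},
    "Hitter B": {"wrc_plus": 140, "xwoba": 0.300, "k_rate": 0.28},
    "Hitter C": {"wrc_plus": 80, "xwoba": 0.380, "k_rate": 0.16},
    "Hitter D": {"wrc_plus": 80, "xwoba": 0.300, "k_rate": 0.28},
}

# Assumption: indicators where lower is better are flipped before comparison.
LOWER_IS_BETTER = {"k_rate"}

def standardize(league):
    """Convert each indicator to a z-score relative to the league."""
    metrics = list(next(iter(league.values())))
    scores = {name: {} for name in league}
    for metric in metrics:
        values = [row[metric] for row in league.values()]
        mu, sigma = mean(values), pstdev(values)
        sign = -1.0 if metric in LOWER_IS_BETTER else 1.0
        for name, row in league.items():
            scores[name][metric] = sign * (row[metric] - mu) / sigma
    return scores

def offensive_friction(z_row):
    """Spread of one hitter's standardized indicators (population variance)."""
    return pvariance(list(z_row.values()))

z = standardize(league)
friction = {name: offensive_friction(row) for name, row in z.items()}

for name, value in sorted(friction.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: OFx = {value:.2f}")
```

In this toy league, Hitter A and Hitter D come out near zero because their indicators agree (one uniformly good, one uniformly poor), while Hitter B and Hitter C score high because results and process point in opposite directions. That is the intended behavior: friction measures disagreement, not quality.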

This would not tell us who is having the better season, but it would tell us who deserves a closer look.

Once the indicators are standardized, we can ask a different question: not who has the best offensive production, but whose profile contains the most tension.

Figure 2. Data for 2026 through the end of April.

Figure 2 introduces the idea of Offensive Friction in its simplest form by ranking hitters according to the degree of disagreement across their standardized offensive indicators. Rather than asking who has been most productive, the figure asks whose statistical profile is the most internally unstable. Players at the top of the chart, such as Cedric Mullins, exhibit the widest spread across metrics, with some indicators suggesting strength and others pointing in a different direction. Others near the top, including Luis Arraez and Oneil Cruz, show similar patterns of tension. By contrast, players farther down the list have profiles in which the underlying numbers cluster more tightly together, indicating a more coherent and interpretable performance. The purpose of the figure is not to evaluate quality, but to identify where the numbers themselves are in disagreement, highlighting the players who warrant closer inspection.

  

Why disagreement matters

 

This is the part that is perhaps most interesting.

Baseball analysis usually treats disagreement as a problem to be solved. One metric says this. Another metric says that. We want to know which one is right, or at least most useful.

But maybe the disagreement itself is what we should be after.

A hitter with a 150 wRC+ and a 150 xwOBA+ is excellent, but not analytically mysterious. His results and process agree.

A hitter with a 150 wRC+ and a 100 xwOBA+ is different. His season may be productive, but the underlying indicators suggest caution. Maybe he has been fortunate. Maybe he has exploited a particular defensive pattern. Maybe he has hit a few poorly struck balls at perfect times. Maybe the expected model is missing something.

Either way, the disagreement is worth studying.

The reverse is also true. A hitter with an 85 wRC+ and a 125 xwOBA+ may be a rebound candidate. His results are poor, but the contact quality suggests something better. That does not mean improvement is guaranteed. It means the surface line may not be telling the full story.

This is where Offensive Friction could be useful. It would act as an alert system.

High friction would say: Do not stop at the leaderboard. Something interesting is happening here.

 

The equilibrium idea

 

There is another way to think about this.

Baseball performance is often moving toward equilibrium.

A hitter’s batting average may run hot for a few weeks. His BABIP may drift above his career norm. His home run rate may spike. His strikeout rate may briefly collapse. Early in a season, small samples can make ordinary players look transformed and struggling players look finished.

But over time, many numbers begin to settle.

Not always. Players do change. Swing paths change. Plate discipline changes. Strength changes. Health changes. Aging changes everything.

Still, the concept of equilibrium matters.

A hitter is close to offensive equilibrium when his production matches his process. His wOBA is close to his xwOBA. His BABIP is not wildly out of line with his batted-ball profile. His strikeout and walk rates fit his established skill set. His power output is supported by his contact quality.

A hitter is out of equilibrium when those pieces do not line up.

That disequilibrium can mean several things.

It can mean luck.

It can mean injury.

It can mean a real skills change.

It can mean a player is being misread by traditional statistics.

It can mean the model is missing something.

This is why the disagreement matters. It is not just noise. It is a clue.

 

A possible classification system

 

Offensive Friction could help classify hitters into types.

Type | Profile | Interpretation
I | High production, high process, low friction | The numbers agree
II | High production, weak process, high friction | Results may be ahead of skill
III | Low production, strong process, high friction | Better than the surface line
IV | Ordinary overall profile, high leverage value | Value concentrated in key moments
V | Average production, average process, low friction | Little mystery
VI | Strong changes across some indicators, conflict across others | Real change or temporary spike

This kind of framework would be more useful than another leaderboard.

It would not simply tell us who ranks first. It would tell us what kind of interpretive problem each hitter presents.

That is important because a baseball season is not just a sorting exercise; it is a diagnostic exercise. We are not only asking who has performed well. We are asking what that performance means.

Friction tells us that the numbers disagree. The Equilibrium Gap tells us the direction of that disagreement.
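As a rough sketch of how the two ideas could work together, the snippet below assigns a hitter to Types I, II, III, or V. The thresholds are arbitrary placeholders, the function names are invented, and the Equilibrium Gap is assumed here to be standardized production minus standardized process; none of this is a settled definition. Types IV and VI would need leverage and season-over-season data, so they are left unclassified.

```python
def equilibrium_gap(production_z, process_z):
    """Assumed definition: positive means results are ahead of process."""
    return production_z - process_z

def classify(production_z, process_z, friction, strong=0.5, high_friction=0.75):
    """Map standardized production, process, and friction to a hitter type.

    `strong` and `high_friction` are hypothetical cutoffs chosen for illustration.
    """
    high_prod, low_prod = production_z >= strong, production_z <= -strong
    high_proc, low_proc = process_z >= strong, process_z <= -strong
    tense = friction >= high_friction

    if high_prod and high_proc and not tense:
        return "I"      # the numbers agree
    if high_prod and low_proc and tense:
        return "II"     # results may be ahead of skill
    if low_prod and high_proc and tense:
        return "III"    # better than the surface line
    if not (high_prod or low_prod or high_proc or low_proc) and not tense:
        return "V"      # little mystery
    return "unclassified"  # Types IV and VI need inputs not modeled here

print(classify(1.2, 1.0, 0.1), equilibrium_gap(1.2, 1.0))
print(classify(1.2, -0.8, 1.1), equilibrium_gap(1.2, -0.8))
print(classify(-0.9, 1.1, 1.0), equilibrium_gap(-0.9, 1.1))
```

Friction decides whether a hitter is worth a closer look; the sign of the gap separates the possible mirage (positive) from the possible rebound candidate (negative).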

 

Figure 3. Data for 2026 through the end of April.

Figure 3 places Offensive Friction alongside overall production, allowing us to see not just how well a hitter has performed, but how stable or interpretable that performance is. The horizontal axis measures the degree of disagreement among a player’s underlying indicators, while the vertical axis reflects his overall offensive output. The quadrant structure provides a simple framework: hitters in the upper left combine strong production with internal consistency, while those in the upper right are producing at a high level but with profiles that contain tension, making them less certain going forward. The lower right quadrant is especially interesting, as it captures players with weak results but high friction, suggesting that their underlying indicators may point to something better than the surface line. Cedric Mullins, for instance, falls into this region, pairing low production with a highly unstable profile. Meanwhile, players like Luis Arraez and Oneil Cruz occupy the high-friction, higher-production space, where strong results coexist with less agreement beneath the surface. The figure does not resolve which interpretation is correct, but it identifies where the most interesting analytical questions reside.

 

The philosophical problem

 

Every baseball metric contains a philosophy.

OPS values simplicity.

wOBA values proper event weighting.

wRC+ values context-neutral offensive production.

xwOBA values underlying process.

WPA values game situation and timing.

BABIP points us toward luck, contact profile, and defensive interaction.

None of these numbers is the whole truth. Each one chooses its own version of the truth.

That is why one-number arguments can become misleading. A player can be more valuable than he is skilled. He can be more skilled than he has been productive. He can be productive in a way that is unlikely to continue. He can be unlucky without being good. He can be lucky and still be excellent.

The categories overlap, but they are not identical.

This is why I am proposing the idea of Offensive Friction. It does not pretend to solve all of this. It begins by admitting the complexity.

The goal is not to flatten the hitter into one final answer.

The goal is to identify where the narrative bends or even breaks.

 

What this would add

 

A metric like Offensive Friction would be especially useful early in the season.

In April and May, leaderboards are unstable. A few bloop hits can inflate a batting average. A few warning-track outs can suppress a slugging percentage. One series in a favorable ballpark can distort the picture. One bad week can make a good hitter look lost.

A friction model would help distinguish stable from unsettled performance.

It could identify:

  • players whose hot starts are supported by process,
  • players whose hot starts look fragile,
  • players whose poor results hide strong underlying skill,
  • players whose surface numbers and expected numbers are beginning to diverge,
  • players whose profiles have genuinely changed.

That is more interesting than simply ranking hitters. It gives us a way to ask better questions.

The same friction score can come from very different profiles. A radar view helps show why one high-friction player may be a mirage, while another may be a hidden riser.

 

 

Figure 4. Data for 2026 through the end of April.

Figure 4 shifts the focus from outcomes to structure. Each polygon represents a hitter’s standardized offensive profile across several underlying indicators, allowing us to see not only how good a player has been overall, but also how his components align or diverge. A more balanced, compact shape suggests agreement among metrics and a profile closer to equilibrium. A jagged or uneven shape reveals tension, where certain indicators pull in different directions. Cedric Mullins, for example, displays a visibly uneven profile, with strengths in some areas offset by weaknesses in others, a hallmark of high friction. Ketel Marte shows a more coherent structure, with metrics that move together more consistently. Jake Cronenworth sits between these extremes. The purpose of the figure is not to rank hitters, but to reveal the internal shape of their performance, highlighting where the underlying indicators agree and where they do not.

 

Conclusion

 

The future of offensive analysis (and of defensive and pitching analysis as well) may not be another statistic that claims to replace the old ones. It just might be a model that explains why the old ones disagree.

That is the larger lesson. Baseball offense (and defense) is not a single reality. It is actual production, expected production, contact quality, plate discipline, timing, opposition, luck, and sustainability. Each metric captures part of that structure. None captures all of it.

So maybe the most interesting hitters aren’t always the best. Maybe they are the hitters whose numbers have not yet settled into agreement.

That is where the analysis should begin. Because sometimes the story is not found in the statistics themselves. Sometimes the story is found in the friction between them.

 

 

The Shape of the American League (so far): A Three-Dimensional Look at Team Strength

Wins and losses tell us what has happened. Composite metrics can help explain why.

Using offensive, pitching, and defensive data through May 12, 2026, I combined multiple American League team metrics into a standardized z-score framework. Each category was normalized relative to league averages, allowing offensive production, run prevention, and fielding quality to be evaluated on the same scale.

Rather than relying on a single statistic, this approach attempts to measure organizational balance. Teams receive positive scores when they perform above league average and negative scores when they fall below it. For pitching categories such as ERA and WHIP, lower values were inverted so that stronger performance always resulted in higher z-scores.
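The standardization and inversion step can be sketched as follows. This is a simplified illustration under stated assumptions: the three teams and their numbers are hypothetical, only three metrics are shown, and the composite is assumed to be a plain sum of z-scores (the actual weighting may differ).

```python
from statistics import mean, pstdev

# Hypothetical team lines; not actual 2026 data.
teams = {
    "Team A": {"ops": 0.780, "era": 3.20, "whip": 1.10},
    "Team B": {"ops": 0.720, "era": 4.00, "whip": 1.28},
    "Team C": {"ops": 0.690, "era": 4.60, "whip": 1.40},
}

# Lower is better for these, so their z-scores are inverted.
LOWER_IS_BETTER = {"era", "whip"}

def composite_scores(teams):
    """Sum each team's z-scores, flipping the sign where lower is better."""
    metrics = list(next(iter(teams.values())))
    totals = dict.fromkeys(teams, 0.0)
    for metric in metrics:
        values = [line[metric] for line in teams.values()]
        mu, sigma = mean(values), pstdev(values)
        sign = -1.0 if metric in LOWER_IS_BETTER else 1.0
        for name, line in teams.items():
            totals[name] += sign * (line[metric] - mu) / sigma
    return totals

scores = composite_scores(teams)
for name, total in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {total:+.2f}")
```

Because every metric is centered on the league mean, the composite scores sum to zero across the league, so a positive total always means above average on balance.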

The result is less a standings table and more a multidimensional map of each team’s underlying quality.

Figure 1: Composite AL Team Strength Through May 12, 2026

Yeah, the Yankees stand alone. They are out there by a large margin.

New York’s profile is unusually complete. They combine the league’s strongest offensive output with elite pitching performance, producing separation that becomes obvious once the categories are standardized. The offensive metrics are overwhelming enough on their own, but pairing them with the AL’s best ERA and WHIP creates a profile that resembles something other than just a hot start.

Perhaps most importantly, the Yankees are not merely winning through one dominant dimension. Many early-season contenders are sustained by either explosive offense or temporary pitching overperformance. New York grades strongly in both simultaneously. That is a big deal.

The Astros occupy a fascinating second tier. Houston’s offense remains extremely dangerous, leading the league in batting average while ranking near the top in slugging and run production. Yet the pitching profile is significantly weaker than expected, especially relative to prior Astros teams. Their overall placement illustrates how overwhelming offensive production can partially compensate for poor run prevention, at least over a 40-game sample.

The Athletics may be the most surprising team in the league so far, at least analytically. Their composite score benefits from a quietly balanced structure. They field exceptionally well, avoid major pitching collapse, and generate enough offense to remain consistently above average across categories. This is not a team built around dominance. It is a team built around the absence of glaring weakness.

Cleveland fits a similar pattern. The Guardians do not dominate the league in any single category, but they remain consistently competitive across all three phases of the game. Their strong fielding profile, solid strikeout numbers, and competent offense produce one of the most stable composite structures in the American League.

Seattle grades better analytically than its current record suggests. The Mariners continue to pair strong pitching with quality fielding, even though the offense remains uneven (at best). Their underlying structure implies a team that could improve substantially if the bats normalize.

Meanwhile, Tampa Bay presents one of the more interesting contradictions in the league. The Rays possess one of the best records in the AL, yet their composite z-score profile remains only modestly above average. This may indicate sequencing luck, strong leverage performance, or simply an ability to maximize close games. Interestingly, this has long been a recurring characteristic of Tampa Bay baseball.

At the bottom of the rankings sit Baltimore and Boston. Neither team displays a single catastrophic weakness. Instead, the issue is cumulative mediocrity. Once standardized, multiple slightly below-average categories compound into significantly negative total scores. The Orioles, in particular, have struggled to prevent runs while failing to separate offensively from the league middle.

This raises an important analytical point. Baseball teams are often discussed in singular terms: “great offense,” “elite rotation,” “bad defense.” But actual team quality emerges from interaction effects across systems. Strong fielding can amplify pitching. High-strikeout staffs reduce defensive volatility. Power-heavy offenses can partially absorb bullpen instability.

The z-score approach attempts to capture some of that interconnected structure. As you know, it is a favorite strategy of mine.

No model perfectly predicts future outcomes, especially in May. Small samples remain volatile. Injuries reshape rosters quickly. Regression arrives unevenly. Yet early-season standardization can still reveal organizational identity. Some teams already appear structurally coherent. Others appear fragile despite respectable records.

And at the moment, one conclusion appears difficult to avoid: The Yankees are not simply leading the American League; they are performing as a dominant team would.

 

Defensive Ecosystems Behind the Plate: How Good is Patrick Bailey?

I grew up in, and still live in, Northeast Ohio. I have long suffered from rooting for the Indians, now known as the Guardians. It hasn’t been a pleasant journey. The last time we won a World Series was 1948, and I do not see another victory on the horizon. So it goes.

I woke up early this morning after hearing we traded for Patrick Bailey, a two-time Gold Glove-winning catcher. I knew Bailey was good. I decided to see just how exceptional he is behind the plate. I used 2025 data for the following study.

Baseball analysis often reduces catchers to a handful of familiar metrics. Framing. Pop time. Arm strength. Caught stealing percentage. Blocking runs. Yet the position itself resists simple categorization. Some catchers suppress the running game through elite exchanges and quick releases. Others survive on receiving skills and pitch presentation. A few manage to combine multiple defensive strengths into unusually complete profiles.

This analysis attempts to move beyond simple rankings by examining the structure of catcher defense. Rather than asking merely who the best defensive catchers were, I wanted to explore a deeper question: Are there distinct defensive ecosystems among modern MLB catchers? Note that this post does not consider how a catcher handles a pitching staff.

To investigate this, I combined multiple publicly available defensive datasets covering:

  • blocking
  • throwing
  • framing
  • exchange time
  • pop time
  • arm strength
  • caught stealing metrics
  • related subcomponents

Every metric was standardized using z-scores to allow catchers to be compared on a common scale (something I often do). From there, the project unfolded in several stages:

  1. creation of composite defensive scores
  2. principal component analysis (PCA)
  3. hierarchical clustering
  4. dendrogram construction
  5. an “Unusualness Index” measuring statistical distance from the average catcher profile

The result was less a ranking exercise and more an exploration of what can be termed defensive geography. And yes, all this work was done because I was curious about our new catcher.

 

Building the Defensive Landscape

 

The first step was to construct an overall defensive score across three broad categories: blocking, throwing, and framing. Each category itself was built from multiple underlying z-scored metrics.

 

 

This approach allowed catchers to be evaluated across multiple dimensions simultaneously rather than through isolated statistics. But even this composite score quickly revealed an important limitation: Two catchers could arrive at nearly identical defensive totals through very different defensive pathways. That observation became the central motivation for the clustering analysis.

 

PCA and Defensive Geography

 

Principal Component Analysis compresses high-dimensional data into a smaller number of interpretable axes. I don’t know about you, but I find it very difficult to think in anything more than two dimensions.

In this dataset:

  • PC1 explained approximately 36% of total variance
  • PC2 explained roughly 19%

Together, they created a two-dimensional map of modern catchers’ defensive strategies.
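For readers who want to see the mechanics, here is a minimal PCA sketch using NumPy's singular value decomposition. The input matrix is random stand-in data (30 hypothetical catchers, 6 hypothetical z-scored metrics), so it will not reproduce the 36% and 19% figures above; the function name `pca_2d` is invented for this example.

```python
import numpy as np

def pca_2d(X):
    """Project rows of X onto the first two principal components.

    X: array with one row per catcher and one column per z-scored metric.
    Returns the 2-D coordinates and the share of variance each axis explains.
    """
    centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes; S holds the singular values.
    _, S, Vt = np.linalg.svd(centered, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    coords = centered @ Vt[:2].T
    return coords, explained[:2]

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))  # hypothetical stand-in for the real metrics

coords, ratio = pca_2d(X)
print(coords.shape)
print(ratio)
```

The `coords` array is what gets plotted as the two-dimensional map, and `ratio` is where statements like "PC1 explained approximately 36% of total variance" come from.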

The PCA visualization immediately suggested that catchers naturally separate into different defensive archetypes rather than forming a single continuous population. The discussion does get nuanced.

Some clustered around framing skill. Others around throwing and athleticism. A handful appeared unusually isolated. Most notably, Patrick Bailey emerged not only as one of the strongest overall defenders in the dataset, but also as one of the most statistically unusual.

Figure 1: PCA Map of Defensive Catcher Profiles

The clusters in the PCA plot represent groups of catchers with similar overall defensive structures rather than similar rankings. A catcher can therefore occupy the same broad defensive tier as another player while still existing within an entirely different defensive ecosystem. This result somewhat surprised me.

 

Hierarchical Clustering and Catcher Archetypes

 

To explore those ecosystems further, I applied Ward hierarchical clustering to the standardized defensive profiles.
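The idea behind Ward's method can be shown with a small, deliberately naive sketch: at each step, merge the two clusters whose union adds the least within-cluster variance. The profiles below are hypothetical z-scores, and the function `ward_clusters` is written for illustration only; in practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would be the usual choice.

```python
# Hypothetical (framing_z, throwing_z, blocking_z) profiles; not real catchers.
profiles = [
    (1.2, 1.3, 0.6),    # 0: hybrid
    (1.1, 1.2, 0.7),    # 1: hybrid
    (1.9, -0.4, 1.6),   # 2: receiving specialist
    (1.8, -0.5, 1.5),   # 3: receiving specialist
    (-0.5, 1.7, -0.3),  # 4: throwing specialist
    (-0.6, 1.8, -0.2),  # 5: throwing specialist
]

def ward_clusters(points, k):
    """Greedy agglomerative clustering with Ward's merge cost; returns k index lists."""
    clusters = [[i] for i in range(len(points))]
    dims = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members) for d in range(dims)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because j > i
    return clusters

groups = ward_clusters(profiles, k=3)
print(groups)
```

Catchers 0 and 2 have the same summed score (3.1) yet land in different groups, because the clustering responds to the shape of the profile rather than its total.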

Unlike simple rankings, hierarchical clustering groups players according to the shape of their statistical profiles. I find this to be an interesting way to look at the data. For example, Patrick Bailey and Alejandro Kirk finished with very similar overall defensive scores. Yet the clustering analysis separated them because they appear to provide defensive value through different skill combinations.

Bailey profiles as a rare hybrid:

  • elite framing
  • strong throwing traits
  • positive blocking metrics

Kirk, meanwhile, appears more specialized toward:

  • framing
  • blocking
  • receiving skill

The dendrogram reveals these structural differences visually.

Figure 2: Dendrogram of Top Defensive Catchers

Several major catcher ecosystems emerged:

Cluster 1: Elite Defensive Hybrids

These catchers combined strong framing with excellent throwing skills. Representative players:

  • Patrick Bailey
  • Austin Hedges
  • Tyler Heineman

Cluster 2: Athletic Throwing Specialists

These catchers leaned heavily into:

  • arm strength
  • exchange speed
  • suppression of the running game

Representative players:

  • J.T. Realmuto
  • Endy Rodríguez

Cluster 3: Balanced Traditional Catchers

The statistical center of gravity for the position.

Competent across categories without extreme specialization.

Representative players:

  • Gabriel Moreno
  • Christian Vázquez

Cluster 4: Offense-First or Declining Defenders

Catchers whose defensive metrics trended negatively despite offensive value or prior reputations.

Representative players:

  • Salvador Perez
  • Yainer Diaz

Cluster 5: Extreme Outliers

In this case, Agustín Ramírez emerged as a statistically isolated profile unlike any other catcher in the dataset.

 

The Unusualness Index

 

One of the more interesting outputs of the project was the creation of an “Unusualness Index.” Conceptually, it measures how far a catcher’s defensive profile lies from the league-average catcher.

Mathematically:
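One plausible way to write this, assuming the index is the Euclidean distance between a catcher's vector of z-scores and the origin (which is the league-average profile once every metric is standardized):

```latex
U_i = \sqrt{\sum_{k=1}^{K} z_{ik}^{2}}
```

Here $z_{ik}$ is catcher $i$'s z-score on metric $k$ and $K$ is the number of metrics. The symbols $U_i$ and $K$ are notation introduced for this sketch, and a covariance-adjusted distance such as the Mahalanobis distance would fit the same verbal description.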

Large values indicate:

  • rare defensive combinations
  • extreme strengths or weaknesses
  • hybrid skill profiles
  • statistical isolation

Interestingly, some of the most unusual catchers were not necessarily the best overall defenders. That distinction may be one of the most important findings in the study. Elite value and statistical uniqueness are related, but they are not identical concepts.

 

Figure 3. Most unusual catchers in 2025.

 

The Patrick Bailey Question

 

Perhaps the most fascinating result involved Patrick Bailey. Bailey ranked near the top of the defensive leaderboard while also appearing among the most unusual defensive profiles in the dataset. That combination is rare.

Most players become unusual because they possess one overwhelming specialization or weakness. Bailey appears unusual because he performs unusually well across multiple difficult defensive dimensions simultaneously.

The clustering analysis, therefore, suggests that Bailey is not merely “good.” He may represent a relatively uncommon defensive archetype altogether. The man is special.

Figure 4. Most accomplished defensive catchers in 2025.

 

Final Thoughts

 

Traditional baseball analysis often searches for single metrics capable of defining defensive quality. But catcher defense appears fundamentally multidimensional.

There is no single pathway to defensive value behind the plate.

Some catchers thrive through receiving.
Others through athleticism.
Others through balance.
A few through genuinely rare hybrid profiles.

The PCA map and dendrogram reveal something that simple rankings cannot: catcher defense is not a ladder. It is an ecosystem. And within that ecosystem, certain players appear to occupy unusually isolated terrain. I can’t wait to see Patrick Bailey in a Guardians uniform.

 

The Shape of a Decision: MLB Catcher Stances

Back in the 1970s, I had a small black-and-white TV in my bedroom. I would watch baseball games late into the night as I fell in and out of sleep. I was, of course, a Cleveland Indians fan, so those games were at the top of my list. If the Indians were off, and I switched to channel 35, and the antenna was just so, I could get Pittsburgh Pirates games. I watched a lot of their games as well.

One of my best memories of those Pirates games is watching the great Manny Sanguillén catch. He was unusual in that he would drop a knee while waiting on the pitch. I also recall him sticking the other leg way out to the side so that he could get lower in his stance. I don’t recall any other catchers dropping to a knee in that era. He was a great player, and he remains one of my all-time favorites.

As I watch games today, I am seeing all the catchers drop to a knee as the pitcher winds up. Certainly, this has to be worthy of a post, right? As it happens, I found some interesting stuff. The following two figures tell part of the story, and it is a story worth hearing.

The first figure is straightforward. The percentage of pitches received from a one-knee stance rises from 23% in 2020 to 96% in 2026. What begins as a minority behavior becomes, in short order, the default condition of the position. By the end of the period, the alternative has nearly disappeared.

The second figure complicates this.

Instead of levels, it shows change. Year-to-year percentage growth in one-knee usage spikes dramatically early, then declines just as quickly. The initial jump exceeds 100%. After that, the rate of increase falls, first sharply, then gradually, until it approaches zero.
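The relationship between the two figures is simple arithmetic. In the sketch below, only the 2020 and 2026 values (23% and 96%) come from the text; the years in between are hypothetical placeholders chosen to mimic the shape described, and `yoy_growth` is a name invented for the example.

```python
# Share of pitches received from a one-knee stance, in percent.
# Only 2020 and 2026 are from the text; 2021-2025 are hypothetical placeholders.
one_knee_pct = {2020: 23, 2021: 50, 2022: 72, 2023: 85, 2024: 91, 2025: 94, 2026: 96}

def yoy_growth(series):
    """Year-over-year percentage change, keyed by the later year."""
    years = sorted(series)
    return {
        curr: (series[curr] - series[prev]) / series[prev] * 100
        for prev, curr in zip(years, years[1:])
    }

growth = yoy_growth(one_knee_pct)
for year, pct in growth.items():
    print(f"{year}: {pct:+.1f}%")
```

With a series shaped like this, the level keeps rising every year while the growth rate falls from over 100% toward zero, which is exactly the tension between the first figure and the second.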

Taken together, these figures describe something more precise than simple adoption. They describe timing.

Interestingly, the decision appears to occur well before the endpoint. The first figure suggests a continuous rise through 2026. The second suggests that the meaningful shift happens earlier, closer to 2021–2023. After that, the system is no longer deciding. Decisions have been firmed up, and consolidation has taken place.

Perhaps most importantly, this pattern aligns with a familiar structure. Early adopters move aggressively, often extracting outsized value. My bet is that this is due to pitch framing. The rest of the system follows, not because the marginal gains remain large, but because the uncertainty has been resolved. Once that threshold is crossed, the behavior spreads regardless of diminishing returns.

This raises a natural question. If the rate of change collapses while the level continues to rise, what is driving the final stages of adoption?

The answer is likely institutional rather than individual. At some point, the technique ceases to be optional because the data says that it is the right thing to do. Consequently, it becomes embedded in instruction, in development, and (most importantly) in expectation. Young catchers entering the league are not choosing the one-knee stance. They are inheriting it.

This is where the first figure can mislead if taken alone. A rising line suggests ongoing discovery. The second figure suggests the opposite. Discovery is front-loaded. What follows is replication.

There is also a subtle implication for evaluation. If most of the informational gain occurs early, then later adopters are operating in a different environment. They are not testing a hypothesis. They are implementing a standard. Any performance differences observed in the later years must therefore be interpreted within a system that has already converged.

That convergence is the quiet endpoint of the process. By 2025 and 2026, the rate of change is minimal. Not because the idea has failed, but because it is clearly the right thing to do. The system has reached equilibrium.

And so, the two figures resolve into a single observation. The transformation of catching technique did not take six years; it took two or three.

Now comes the surprising part. I wondered which knee should be dropped. I know all the catchers are right-handed, so handedness is not a consideration. Fortunately, I found raw data on this. Take a look at the following figure. I find it fascinating.

The figure reveals a subtle but meaningful asymmetry in how the one-knee stance has been adopted. Early in the period, the right-knee-up configuration is more common, but over time, the balance shifts decisively toward the left-knee-up orientation. By 2026, the split is no longer close, with the left-knee-up approach clearly dominant. Interestingly, this suggests that the evolution of the stance is not simply about going to one knee, but about settling into a preferred directional setup. The change appears gradual rather than abrupt, implying that once the broader adoption decision was made, the league continued to refine how the stance is executed rather than whether it should be used at all.

How about that? As of now, I have no idea why the preference shifted. The only thing I know is that it certainly was data-driven. My best guess is that it has something to do with giving the umpire a certain perspective when pitches are being framed, namely the look the catcher wants the ump to have.

I am open to ideas. If you like, let me know what you think in the comments. This post certainly is fodder for a spirited discussion.

 

 

More on AL First Basemen 4 23 26

Here we are, a day later and (hopefully) a little wiser. I have more to say about yesterday’s post on those pesky first basemen. I can now tell you exactly how lucky or unlucky all of them have been so far this season.

BABIP, batting average on balls in play, is the metric most often used to gauge the role of luck in a player’s offensive output. Here is the simple equation:

BABIP = (H − HR) / (AB − K − HR + SF)

Where:

  • H = Hits
  • HR = Home Runs
  • AB = At-bats
  • K = Strikeouts
  • SF = Sacrifice flies

This equation shows what happens after contact is made and a ball is put in play. League-wide, and this has been true for a long time, players hover around .300. A BABIP of .300 is the de facto gravitational center for all players.
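For readers who want to check the arithmetic, here is a minimal Python sketch of the standard BABIP formula, (H − HR) / (AB − K − HR + SF), using the terms defined above. The stat line is hypothetical and does not belong to any player mentioned in this post.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play, BABIP is undefined")
    return (h - hr) / balls_in_play


# Hypothetical early-season line: 25 hits, 4 homers, 90 at-bats,
# 22 strikeouts, 2 sacrifice flies.
example = babip(h=25, hr=4, ab=90, k=22, sf=2)
print(f"BABIP = {example:.3f}")
```

Home runs come out of both the numerator and the denominator because they never reach a fielder, and strikeouts come out of the denominator because no ball was put in play.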

Take a look at this:

This is how we want to read the data. Guerrero (.378), Rice (.378), and Kurtz (.364) have been extremely lucky so far this season. I do not think they can maintain BABIPs that strong much longer. They will certainly regress to the league mean of ~.300.

On the other hand, Naylor (.213) and Pasquantino (.169) have been very unlucky. Those line drives are being hit right at fielders, and the hard-hit ground balls are not finding any holes. Both men should see their offensive production increase as their BABIPs work their way toward .300.

What about our man, Kyle Manzardo? At .317, he has not been unlucky at all. Not only has he not been unlucky, but he has likely benefited from slightly favorable outcomes on balls he has put in play. This means that balls are finding gaps at a slightly elevated rate, defensive positioning or variance is working in his favor, and there is no immediate signal of suppressed results due to bad luck. Once again, I was a bit surprised by this.

Manzardo’s BABIP implies that there should be downward pressure on his offensive production. Remember the graph from the last post? This does not bode well for a player whose output has been last in the league at his position. I am curious to see how this plays out.

 

AL First Basemen Thru 4 22 26

I got up early today and decided to take a look at what has been happening in the American League with all the first basemen. I was inspired after I heard that Kyle Manzardo was having a very unlucky season at the plate. It happens, and there are very good metrics out there to measure things like luck.

I want to take a glance at what is going on about 25 games into the 2026 season. The first figure shows which players have the most similar production. Notice that Manzardo’s offensive output is most closely related to that of his old teammate, Josh Naylor. Both are off to very slow starts.

The players on the right-hand side of the figure should come as no surprise. What might give you pause is the next figure. Instead of clustering players based on similar offensive numbers, I decided to analyze them only by the categories that help their teams win. In other words, I eliminated things like strikeouts and double plays that the player might have grounded into.

Our man Manzardo is all alone at the bottom of the list. I believe he is too good a player to remain there. The same is true for Naylor, and all the players with low rankings will probably increase their production as the weather warms.

If you study the chart, you will see that Ben Rice has been, by a wide margin, the most productive offensive first baseman in the American League so far this year. I was a bit surprised by this. I will keep an eye on things and report back throughout the season. Depending on the will of the Muses in control of The Boys of Summer, I might expand my horizons to every position in both leagues.

 

A Few Thoughts on MLB Batting Averages and Scoring

The folks with a serious interest in baseball have been meticulously recording the numbers the game generates since the 19th century, giving us one of the longest continuous statistical datasets in professional sports. Using MLB league totals from 1871 through 2025, I have traced the story of offense through a single, elegant metric: runs per game per team (R/G).

The chart below (based on raw data graciously provided by baseball-reference.com) visualizes the average runs scored per game per team by decade, beginning in the 1920s, an era often considered the dawn of modern baseball. I view 1920 as the beginning of the modern era, mainly due to the standardization of the balls used in the games. Before this date, the balls were haphazardly procured; there were no standards imposed, and none were implied. One game might finish with a score of 43–36, and the next might be 2–1. This was a result of the baseball (and yes, I mean a single ball) used in the game.


The figure tells an interesting story:

  • 1930s: Offensive explosion. The live-ball era fully matured, and league scoring topped 5 runs per game.

  • 1960s: The “Pitcher’s Decade.” Offense collapsed, bottoming out at about 3.4 R/G in 1968, the “Year of the Pitcher.”

  • 1990s: The power surge. League scoring rebounded to nearly 5 runs per game, driven by expansion, smaller parks, and the home-run boom. Surely, there are no other explanations, right? Cough, cough, hack, hack…

  • 2020s: The analytics paradox (but not really). Despite smarter lineups and stronger hitters, offense has fallen again, down to 4.4 R/G in recent seasons. More on this later…
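The decade view above is just an average of season-level R/G values grouped by decade. Here is a minimal sketch of that grouping in Python. The season values are placeholders I made up to show the mechanics, not the baseball-reference.com figures behind the chart, and the function name is hypothetical.

```python
from collections import defaultdict


def decade_averages(rg_by_year):
    """Average runs per game per team within each decade."""
    buckets = defaultdict(list)
    for year, rg in rg_by_year.items():
        buckets[year // 10 * 10].append(rg)  # 1968 -> 1960
    return {decade: sum(vals) / len(vals)
            for decade, vals in sorted(buckets.items())}


# Placeholder season values (NOT real league data).
sample = {1930: 5.0, 1931: 4.0, 1968: 3.0, 1969: 4.0, 2021: 4.5, 2022: 4.5}

for decade, avg in decade_averages(sample).items():
    print(f"{decade}s: {avg:.2f} R/G")
```

Averaging by decade smooths out single-season extremes, so an outlier year like 1968 will sit well below its decade’s bar on the chart.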

BATTING AVERAGES

 

While run scoring has fluctuated wildly, the league batting average has remained remarkably stable. From 1920 onward, the overall mean is .262, almost identical to the all-time mark of .260 since 1871.

The highest batting averages came during the explosive decades of the 1920s and 1930s, while today’s hitters hover around .245, the lowest sustained level since the Dead Ball Era (1900-1920).

ANALYTICS

The offensive (and defensive) landscape of MLB can’t be understood without the analytics revolution, which ushered in a seismic shift in how teams interpret performance. It is, without doubt, the most transformative movement in the history of the game.

Baseball’s analytics revolution unfolded in three waves. The first began in the late 1970s, when writer Bill James published his Baseball Abstracts and coined the term “sabermetrics,” introducing a generation of fans and front offices to the idea that baseball could be studied scientifically. The second wave arrived around 2000, when the Oakland Athletics—immortalized in Moneyball—used data-driven roster construction to compete on a small budget. Their success sparked a league-wide shift toward on-base percentage, run efficiency, and market inefficiency analysis. The third and most mind-bending stage came in 2015 with the introduction of Statcast, a tracking technology that measures exit velocity, launch angle, spin rate, and player movement in real time. Together, these eras changed baseball from a sport of intuition to one of precision, where every swing, pitch, and sprint is quantified and optimized.

The following chart overlays those analytical milestones onto league scoring trends. Note how the average runs per game increased steadily until mathematics started to play a central role in baseball strategy.


  • 🟠 2000 – Moneyball / Analytics Era: Teams begin valuing on-base skills and cost efficiency.

  • 🔴 2015 – Statcast Era: Tracking technology transforms player evaluation and biomechanics.

Interestingly, runs per game spiked during the early pre-Moneyball years (late 1990s) but declined sharply once every team adopted similar analytical models. The advantage disappeared as the playing field leveled and pitchers harnessed data to exploit hitters’ weaknesses. League-wide defense also vastly improved; the players had a much better idea of where to position themselves batter by batter and pitch by pitch.

THE APPARENT DATA PARADOX

Baseball-flavored analytics were initially designed to optimize offense, yet their full integration has arguably optimized defense and pitching instead. By 2025, batting averages and runs per game are both at their lowest sustained levels in decades—even as individual player performance is measured with unprecedented precision.

The result is a kind of equilibrium: fewer balls in play, more strikeouts and home runs, and an ongoing debate about whether efficiency has made the game better or simply duller.

And yes, there is a strong parallel between what has happened in baseball and what the 3-point shot has brought to the NBA. Just as basketball front offices realized that a 3-point shot is worth 50% more than a regular 2-point shot, baseball players were strongly advised that a home run is worth a lot more than a single or walk.

Take a moment to look over the following table. I am struck by the downward trend in batting average. It sure seems like the table is calling out for a similar study using on-base and slugging percentages. I will address this issue in a future post.

Metric                      1920–2000   2010s   2020s
Avg. Batting Avg. (BA)      .264        .254    .245
Avg. Runs per Game (R/G)    ~4.5        4.38    4.45

The 2010s and 2020s mark the first back-to-back decades of declining batting average since the 1960s. Despite this, run scoring remains relatively stable. Interesting, isn’t it? Even though there is only one batter and nine defenders, the offense-minded have concluded that home runs, even with the resultant declines in batting average and on-base percentage, are much more desirable than any other alternatives. This is a big reason why batting averages have gone down, defense and pitching have improved, and average runs per game have stayed consistent.

CONCLUSION

The numbers reveal something profound: baseball’s statistical evolution mirrors its cultural one, suggesting a fundamental constancy in its design. Each new wave of data, whether Bill James’ notebooks or Statcast’s terabytes of data, has changed how players are valued and how teams win. Yet through all of it, the sport’s core equilibrium remains intact. The league batting average, while steadily going down, still results in scoring of about 4½ runs per game—just as it did a hundred years ago. In the end, baseball adapts, but it rarely strays too far from its mathematical mean. I find that very intriguing.

The next post builds on the themes touched on in this short essay. I want to know where all the .300 hitters have gone, and I have decided to write about it, drawing on the work of Stephen Jay Gould, one of the most influential evolutionary biologists of the last century. Perhaps most importantly, he was a big baseball fan who used his considerable talents to write about the sport he loved.

 

Analyzing Max Exit Velocity (2020)

In baseball analytics, exit velocity (specifically, maximum exit velocity) is a critical metric. It measures the speed at which a ball leaves the bat, providing insights into a player’s power and potential impact. I am looking at max exit velocity data from the 2020 season. The visualization pairs a histogram, which shows how max exit velocities are distributed among players, with a smoothed density estimate that reveals the underlying shape. My first observation is amazement at how hard these balls are being hit. It is truly astonishing.

Forget batting average; this metric is more diagnostic than many others that are typically (especially historically) referenced. If you are putting a team together, you want players who hit the ball hard. And yes, the harder the better. This line of reasoning is all about a player’s ceiling; it has nothing to do with the dribbling groundballs that find a spot between defenders. Such “seeing eye” base hits are of little predictive value.

In 2020, exit velocity data’s importance escalated as teams began using it for more refined scouting and player development decisions. This season saw an exceptionally high interest in advanced metrics, partly because of the pandemic-shortened season. This led teams and analysts to seek more data-driven insights into player performance.

I used a histogram with an overlaid density curve to visualize max exit velocity data. Here’s what each part of this plot conveys:

  • Histogram: The histogram separates the exit velocity data into intervals (bins) and shows how many players achieved max exit velocities within each range. Each bar represents a specific range of velocities and provides a quick overview of where most data points (player exit velocities) lie.
  • Density Curve: The smoothed density curve overlaid on the histogram estimates the data’s distribution, offering insights into how the data might spread beyond discrete bins. This curve helps us visualize peaks and concentration points without the rigidity of bin divisions.
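To show what the two layers are doing under the hood, here is a minimal sketch in plain Python: a fixed-width histogram plus a Gaussian kernel density estimate. The velocities are hypothetical, and the bin width and bandwidth are choices I made for illustration; I am not claiming they match the settings behind the plot.

```python
import math


def histogram(values, bin_width, start):
    """Count values per bin; each bin covers [left, left + bin_width)."""
    counts = {}
    for v in values:
        idx = math.floor((v - start) / bin_width)
        left = start + idx * bin_width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


def gaussian_kde(values, bandwidth):
    """Return a function that estimates the density at x."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


# Hypothetical max exit velocities in mph (not real player data).
max_ev = [103.2, 105.1, 106.4, 107.0, 107.8, 108.3, 108.9, 109.5,
          110.2, 110.9, 111.6, 112.8, 114.1, 116.0, 117.9]

counts = histogram(max_ev, bin_width=3.0, start=102.0)
density = gaussian_kde(max_ev, bandwidth=2.0)

for left, count in counts.items():
    mid = left + 1.5
    print(f"{left:.0f}-{left + 3:.0f} mph  {'#' * count:<6} "
          f"density at {mid:.1f} = {density(mid):.3f}")
```

The histogram depends on where the bin edges fall, while the density curve depends on the bandwidth. Changing either one changes the picture, which is why it helps to look at both.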

Key Insights from the 2020 Max Exit Velocity Data

  1. Concentration Around the Mean: The density curve reveals a central concentration of exit velocities in the range of approximately 105-111 mph. This concentration suggests that most players in the 2020 season achieved max exit velocities within this range, indicating a consistent performance level among players regarding hitting power.
  2. Distribution Shape: The distribution is roughly symmetric, with a slight skew toward higher velocities. This shape is typical in sports metrics, where most players fall near the average performance level while a few outliers achieve exceptional numbers.
  3. High-End Outliers: The density curve and histogram both suggest that a few players in 2020 achieved exceptionally high max exit velocities, reaching up to 118 mph. These outliers represent some of the league’s top power hitters, whose performances exceed the average exit velocities and pose a significant offensive threat to opposing teams. And in case you were wondering, Pete Alonso of the New York Mets hit a ball at 118.4 mph to lead the league. If facing such a batter, I would point to first base and take my chances with the next guy. If first were occupied, I certainly wouldn’t put anything over the plate. I wouldn’t even see the line drive coming back at me.

Why This Visualization Matters

A histogram with a density curve provides a quantitative view of max exit velocity data. This visualization helps scouts, coaches, and analysts quickly assess the distribution of max exit velocities across players. The density curve also offers a smooth, continuous view of the data, making it easier to observe trends and concentrations without the constraints of bin width.

Closing Thoughts

This histogram with a density overlay captures a snapshot of the league’s hitting power, revealing the typical max exit velocities and highlighting exceptional outliers.

This exemplifies how data analytics can deepen our understanding of baseball. By looking beyond averages and focusing on distribution, we gain a richer perspective on the league’s players. Whether you’re a data enthusiast or a baseball fan, this analysis offers a powerful glimpse into the metrics driving modern baseball.

 

Exploring Arm Strength in MLB (2020-2024): A Positional Comparison

Introduction

When I think about baseball, arm strength is one of the first things that comes to mind—especially when comparing players across different positions. Whether it’s a third baseman making a quick throw across the diamond (Brooks Robinson, anyone?) or an outfielder firing a rocket from the warning track (Roberto Clemente was awesome), a strong and accurate arm can make all the difference. Recently, I dove into some data from Major League Baseball covering the years 2020 to 2024 to better understand how arm strength varies by position, and I’d like to share what I found.

Comparing Average Arm Strength Across Positions

I started by looking at the average arm strength for each position. Unsurprisingly, outfielders—particularly those in right field—have the strongest arms, while positions like first base require less power behind the throw.

This bar chart shows the average arm strength for each position (excluding catcher) in miles per hour. Outfielders (RF, CF, LF) clearly lead the way, with center fielders and right fielders consistently throwing the hardest. It makes sense: outfielders must make long throws back into the infield, often in critical situations where arm strength is key.

Are you surprised? I might have thought that shortstops would have edged out left fielders and maybe even center fielders. That said, it is close.

As always, box plots allow us to get a more granular view of the raw data. Here is what I found.

Notice the outliers among first basemen. Lots of them get very little on their throws. That is unsurprising; many players are positioned there for their offense, with defense being an afterthought.

As readers of this blog know, I have a special relationship with violin plots. Here is the same data in that form.

Once again, the poor arms of a select group of first basemen are highlighted. I consider that fact to be a big takeaway from this plot.

Infield vs. Outfield: A Clear Difference

Next, I wanted to break things down further and compare infielders’ arm strength versus outfielders. Unsurprisingly, outfielders, who cover more ground and make longer throws, generally have stronger arms.

The box plot below shows the distribution of arm strength between infield and outfield players. Outfielders not only have higher average arm strength, but the range of arm strength is wider, too. Some outfielders, particularly those in right field, can really get after it when a runner is rounding second.

I would like to tell you something interesting about this plot. Over 35 years ago, I was taught a trick (more properly, a heuristic) at Harvard University. If there is a space between the bodies of the box plots, then the data set is worthy of further exploration. If you look closely, you can see a thin space between the boxes, so I decided to investigate further to see if the differences in arm strength are statistically significant. We will get to that in a bit.
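The heuristic boils down to comparing the interquartile boxes: if one group’s third quartile sits below the other group’s first quartile, there is daylight between them. Here is a minimal sketch of that check. The throw speeds are hypothetical, the function names are mine, and the quartile method is an assumption, since plotting libraries differ on how they compute quartiles.

```python
from statistics import quantiles


def box(values):
    """Return (Q1, Q3), the lower and upper edges of a box plot's box."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def boxes_are_separated(a, b):
    """True if the interquartile boxes of a and b do not overlap."""
    a_q1, a_q3 = box(a)
    b_q1, b_q3 = box(b)
    return a_q3 < b_q1 or b_q3 < a_q1


# Hypothetical arm strengths in mph (not the Statcast data in the plot).
infield = [78.0, 80.5, 82.0, 83.5, 85.0, 86.0, 87.5]
outfield = [86.5, 88.0, 89.0, 90.5, 91.0, 92.5, 94.0]

print("infield box: ", box(infield))
print("outfield box:", box(outfield))
print("gap between boxes:", boxes_are_separated(infield, outfield))
```

A visible gap is only a prompt to dig deeper. It says nothing about sample size, so it is a reason to run a formal test rather than a substitute for one.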

Looking for Patterns: Correlations Between Positions

Before we get to the hard-core statistics, I wanted to explore whether there is a relationship between arm strength at different positions. For instance, do shortstops tend to have arm strength similar to that of second basemen or third basemen? To find out, I ran a correlation analysis.

This heatmap shows how arm strength at one position correlates with another. There are some interesting patterns here—positions like second base (2B) and shortstop (SS) show a strong correlation, likely because they both require quick, strong throws in the infield. The outfield positions also show high correlations with each other, which makes sense given the similar demands placed on their arms.
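Each cell in a heatmap like this is a Pearson correlation between two columns. Here is a minimal sketch of how such a matrix can be built. I am assuming each row is a paired observation (for example, one team’s average at each position), which is a guess about the data layout, and the numbers themselves are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical paired observations in mph (not the data in the heatmap).
arm = {
    "arm_2b": [80.1, 81.4, 79.8, 82.6, 83.0, 80.9],
    "arm_ss": [84.9, 86.2, 84.1, 87.5, 88.3, 85.6],
    "arm_rf": [91.0, 88.7, 90.2, 89.5, 92.1, 87.9],
}

matrix = {a: {b: pearson(arm[a], arm[b]) for b in arm} for a in arm}

for a, row in matrix.items():
    print(a, "  ".join(f"{b}={r:+.2f}" for b, r in row.items()))
```

The diagonal is always 1 and the matrix is symmetric, so only the cells above or below the diagonal carry information.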

Here are the Statistics

The results of the one-way ANOVA test (a comparison of group means based on the variance between and within groups) indicate the following:

  • F-statistic: 261.67

Since the p-value is extremely small (well below the typical, and totally arbitrary, significance threshold of 0.05), we can reject the null hypothesis that all positions share the same mean arm strength. This suggests statistically significant differences in arm strength across the different positions. In other words, the differences are unlikely to be an artifact of chance.
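For anyone curious where an F-statistic comes from, here is a minimal sketch of the one-way ANOVA calculation in plain Python. The samples are hypothetical and tiny, so the result will not resemble the 261.67 reported above, and the function name is mine.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical arm strengths in mph for three positions.
samples = [
    [76.5, 78.0, 79.2, 77.4],   # first base
    [85.1, 86.3, 84.7, 87.0],   # shortstop
    [89.8, 91.2, 90.5, 92.0],   # right field
]
f_stat, df_b, df_w = one_way_anova_f(samples)
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

A large F means the positions differ from one another far more than players at the same position differ among themselves. Turning F into a p-value requires the F distribution, which a statistics library would normally handle.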

I have never done this before in my blog, but I decided to take an even deeper dive into this data set. I view this blog as more or less an introduction to what I find interesting. I don’t want to get into the weeds; many blogs and websites do that. Today, though, is different. Early this morning, I ran my 4 miles despite not wanting to get out of bed. My hip, which needs to be replaced, barked the entire time. I guess I am in a mood… Here is what I did next.

group1  group2  meandiff  p-adj  lower   upper   reject
arm_1b  arm_2b  4.0267    0      2.74    5.30    TRUE
arm_1b  arm_3b  8.4252    0      7.1     9.75    TRUE
arm_1b  arm_cf  12.6281   0      11.3    13.92   TRUE
arm_1b  arm_lf  11.1761   0      9.93    12.42   TRUE
arm_1b  arm_rf  13.3679   0      12.1    14.63   TRUE
arm_1b  arm_ss  8.977     0      7.6     10.32   TRUE
arm_2b  arm_3b  4.3985    0      3.2     5.57    TRUE
arm_2b  arm_cf  8.6014    0      7.46    9.738   TRUE
arm_2b  arm_lf  7.1494    0      6.067   8.23    TRUE
arm_2b  arm_rf  9.3412    0      8.2     10.45   TRUE
arm_2b  arm_ss  4.9503    0      3.75    6.14    TRUE
arm_3b  arm_cf  4.2029    0      3.02    5.38    TRUE
arm_3b  arm_lf  2.7509    0      1.61    3.88    TRUE
arm_3b  arm_rf  4.9427    0      3.78    6.10    TRUE
arm_3b  arm_ss  0.5518    0.84   -0.69   1.79    FALSE
arm_cf  arm_lf  -1.4519   0.02   -2.54   -0.35   TRUE
arm_cf  arm_rf  0.7398    0.45   -0.38   1.86    FALSE
arm_cf  arm_ss  -3.6511   0      -4.86   -2.43   TRUE
arm_lf  arm_rf  2.1918    0      1.12    3.26    TRUE
arm_lf  arm_ss  -2.1991   0      -3.36   -1.037  TRUE
arm_rf  arm_ss  -4.3909   0      -5.57   -3.202  TRUE

These are the results of Tukey’s HSD (Honestly Significant Difference) test, which provides pairwise comparisons between arm strengths for different positions. Yeah, I know your eyes are glazing over, but bear with me. Here’s how to interpret the key columns:

  1. Group1 and Group2: These columns represent the two positions being compared. For example, “arm_1b” vs. “arm_2b” compares the arm strength of first basemen with second basemen.
  2. Meandiff: This column shows the difference in the average arm strength between the two groups, computed as Group2 minus Group1. A positive number means the arm strength of the second group (Group2) is higher than that of the first group (Group1).
    • For example, the mean difference between first basemen (arm_1b) and second basemen (arm_2b) is 4.03 mph, meaning first basemen tend to have lower arm strength compared to second basemen.
  3. p-adj: This is the adjusted p-value, which tests the statistical significance of the difference. If this value is below 0.05, it indicates that the difference is statistically significant.
    • For most comparisons, the p-values are extremely low (0.0), indicating strong evidence that arm strength significantly differs between these positions.
  4. Lower and Upper: These are the bounds of the confidence interval for the mean difference. They provide a range within which the actual mean difference will likely fall, with a 95% confidence level.
    • For example, the confidence interval for the difference between arm_1b and arm_2b is between 2.74 and 5.30 mph, suggesting that the actual difference lies within this range.
  5. Reject: This column tells whether the difference between the two groups is statistically significant. If it says “True,” the test rejects the null hypothesis, meaning the difference between the two positions is significant.
    • In this case, “True” appears in many rows, indicating that the arm strengths differ significantly between most pairs of positions.
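The sign of the meandiff column is the easiest part of this table to misread. Every row that pairs first base with another position is positive, and first basemen have the weakest arms, so the table’s own numbers imply the difference is taken as the second group minus the first. Here is a minimal sketch of just that piece, using hypothetical samples. It computes the pairwise mean differences only, not the adjusted p-values or confidence intervals of the full Tukey procedure.

```python
from itertools import combinations
from statistics import mean


def pairwise_meandiff(samples):
    """Return (group1, group2, mean(group2) - mean(group1)) for each pair."""
    rows = []
    for g1, g2 in combinations(sorted(samples), 2):
        rows.append((g1, g2, mean(samples[g2]) - mean(samples[g1])))
    return rows


# Hypothetical arm strengths in mph (not the data behind the table).
samples = {
    "arm_1b": [78.0, 80.0, 82.0],
    "arm_2b": [83.0, 84.0, 85.0],
    "arm_rf": [92.0, 93.0, 94.0],
}

for g1, g2, diff in pairwise_meandiff(samples):
    print(f"{g1}  {g2}  {diff:+.2f}")
```

Under this convention, a positive value in a row means the position listed second throws harder on average than the position listed first.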

Key Insights

  • Significant differences: Almost all pairwise comparisons show statistically significant differences. For example:
    • Outfielders (CF, RF, LF) generally have higher arm strength compared to infielders (1B, 2B, 3B, SS).
    • Third basemen (arm_3b) also tend to have higher arm strength than first basemen (arm_1b), as shown by an 8.43 mph difference.
  • Largest differences: The biggest differences are between infield positions like first base and outfield positions like right field (arm_rf), where the arm strength difference can be over 13 mph.

Even though my hip is killing me, I feel very good about the results of this study.

Wrapping Up

So, what did I learn from all this? First, outfielders—especially those in right and center field—are in a league of their own regarding arm strength. Conversely, infielders don’t need the same power, but positions like third base and shortstop still require strong arms for those quick, long throws.

Running the ANOVA and Tukey’s test confirmed that these differences in arm strength are not random results due to the vagaries of sampling. Understanding these variations can be crucial for teams looking to optimize their defensive lineups or scout new talent.

Examining the data and seeing how arm strength varies across MLB positions was fascinating. I hope you enjoyed it. I am going to grab a beer and contemplate the disappointment of my team, the Cleveland Guardians, disastrously ending another year. Meh, what else is new?

Even More Catcher Info: 2023 Blocking Data

Catcher defense, especially the ability to block pitches, can often go unnoticed but significantly impact the game. Preventing wild pitches and passed balls can save crucial runs and give pitchers confidence to throw in the dirt when necessary. In 2023, several catchers distinguished themselves as exceptional blockers. Let’s take a look at some of the data.

This analysis uses metrics like “blocks above average,” passed balls/wild pitches (PBWP), and more to examine the best catchers at blocking pitches during the season. Below, I break down the data to highlight the elite performers.

1. Top 10 Catchers by Blocks Above Average

“Blocks above average” is a critical statistic that tells us how much better (or worse) a catcher is compared to the league average at blocking pitches. Here’s a look at the top 10 catchers based on this metric:

As shown, Sean Murphy from the Atlanta Braves leads the way with 16 blocks above average, followed closely by Alejandro Kirk and Nick Fortes. These catchers were above average in keeping pitches in front of them, saving runs for their teams.

2. Actual vs. Expected PBWP

Next, take a look at the actual vs. expected number of passed balls and wild pitches (PBWP). The scatter plot below visualizes this comparison:

Catchers whose actual PBWP is lower than expected (below the red line) performed better than average. Catchers like Sean Murphy and J.T. Realmuto are among those outperforming expectations, while others are closer to the expected values. Note that the majority of catchers were about average.

3. Blocks Above Average Per Game

Another critical metric is the rate at which catchers accumulate blocks above average per game. This accounts for differences in playing time and offers a normalized view of performance. Here’s a look at the top 10 catchers:

The usual suspects are once again prominent. Notice that Yainer Diaz ranked number one in the league in this critical category.
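The per-game view is a simple normalization: divide each catcher’s blocks above average by his games caught, then re-rank. Here is a minimal sketch with hypothetical catchers and numbers (the function name is mine as well), showing how the leader in raw totals can slip behind someone with less playing time.

```python
def blocks_per_game(stats):
    """stats maps name -> (blocks_above_average, games_caught).

    Returns (name, rate) pairs sorted from best rate to worst.
    """
    rates = {
        name: blocks / games
        for name, (blocks, games) in stats.items()
        if games > 0  # skip anyone with no games to avoid dividing by zero
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Hypothetical totals (not the 2023 data).
sample = {
    "Catcher A": (16, 100),
    "Catcher B": (9, 45),
    "Catcher C": (-3, 80),
}

ranking = blocks_per_game(sample)
for name, rate in ranking:
    print(f"{name}: {rate:+.3f} blocks above average per game")
```

Catcher A leads in raw blocks above average, but Catcher B comes out on top once playing time is taken into account.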

4. Comprehensive Heatmap

To better understand each catcher’s performance, I’ve compiled several blocking metrics into a heatmap. This chart includes statistics such as catcher blocking runs, blocks above average, actual vs. expected PBWP, and blocks above average per game:

The heatmap above gives a comprehensive view of the top 10 catchers. The varying shades show how these catchers compare across multiple metrics, with Sean Murphy, Alejandro Kirk, and Nick Fortes again emerging as the top performers. This heatmap allows us to see the nuances in their blocking ability, with some excelling at reducing passed balls. In contrast, others are better at blocking above average on a per-game basis.

Conclusion

Nuance and subtlety are the operative words here. Asking who was the best defensive catcher in 2023 has a complex and interesting answer. What should we value in a catcher’s defense? Which metric is more important to winning than the others? Can you settle for a below-average pop time if your catcher is brilliant at framing pitches? Lots of great questions that require thoughtful answers. Stay tuned; I will continue posting my analyses. And yes, I do intend to publish some (hopefully) thoughtful conclusions.