Offensive Friction: A Few Thoughts on Baseball Metrics Along with a Proposal

Baseball statistics (and related metrics) promise clarity. After all, why else would people go to the trouble of creating them?

Metrics offer great appeal; a long, messy season (or even career) gets compressed into a number. A player’s apparent value becomes comprehensible. For example, OPS gives us a quick offensive summary, wOBA improves the weighting of offensive events, and wRC+ places hitters on a clean scale, with 100 being the league average. Statcast data adds another layer, telling us not only what happened, but what probably should have happened.

Each step in this progression looks like progress. But toward what exactly?

The problem is not that all the available metrics are inadequate. The real issue is that they are often answering different questions.

A hitter can have strong results and weak underlying indicators. Another hitter can have excellent contact quality and disappointing production. A third hitter can look ordinary overall but deliver his best moments in the highest-leverage situations. A high BABIP, a favorable run of matchups, or a few well-timed home runs can elevate a fourth.

Which hitter is better, or at least more desirable? The answer depends on what we are trying to measure.

That is why the search for one perfect offensive statistic may be ill-advised. Baseball offense is not one thing. It is a collection of related but distinct realities: production, process, context, opposition, and sustainability.

The more interesting question may not be, “Which metric is best?” The better question may be: Where do the metrics disagree?

The first way to see this is to compare actual production with expected production. If every hitter’s season were in perfect statistical balance, the points would fall neatly along the diagonal line. They do not.

Figure 1. Data for 2026 through the end of April. R² ≈ 0.658

Figure 1 compares actual production with expected production by plotting wOBA against xwOBA for each hitter. The scatter reveals a range of divergence. Players above the line have outperformed their expected results, while those below it have produced less than their contact quality and plate appearances would suggest. Mickey Moniak, for instance, sits well above the line, indicating stronger outcomes than underlying indicators might predict. In contrast, Ketel Marte and Jake Cronenworth fall below it, suggesting that their process may be better than their results to this point. The figure does not resolve which measure is more meaningful, but it makes visible the gap between them, which serves as the starting point for a proposal I will make later in the post.
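The gap that Figure 1 makes visible is easy to compute. What follows is a minimal sketch, not the code behind the figure: the hitter names and values are hypothetical, and it assumes the R² in the caption comes from a simple linear fit of wOBA on xwOBA.

```python
import math

# Hypothetical (wOBA, xwOBA) pairs for illustration only; not real 2026 data.
hitters = {
    "Hitter A": (0.410, 0.340),
    "Hitter B": (0.300, 0.365),
    "Hitter C": (0.350, 0.352),
    "Hitter D": (0.280, 0.290),
}

def production_gap(woba, xwoba):
    """Signed distance from the diagonal: positive means results beat expectation."""
    return woba - xwoba

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

gaps = {name: round(production_gap(w, x), 3) for name, (w, x) in hitters.items()}

xwoba_values = [x for _, x in hitters.values()]
woba_values = [w for w, _ in hitters.values()]
# For a one-variable linear fit, R^2 is the squared Pearson correlation.
r_squared = pearson_r(xwoba_values, woba_values) ** 2

# List hitters from most over-performing to most under-performing.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {gap:+.3f}")
print(f"R^2 = {r_squared:.3f}")
```

A positive gap corresponds to a point above the diagonal, a negative gap to a point below it.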

 

The limits of a single number

 

One of the best offensive metrics in wide use is wRC+ because it does something very specific. It estimates a hitter’s total offensive production, adjusts for park and league factors, and places that production on an easy-to-read scale. A 120 wRC+ means a hitter has been 20 percent better than league average. A 90 wRC+ means he has been 10 percent worse than average.

That is useful and elegant. Perhaps more importantly, it is also intentionally incomplete.

wRC+ is not trying to tell us whether a hitter’s production is sustainable. It is not trying to tell us whether he has been lucky. It is not trying to tell us whether his best events came in the most important moments. It is not trying to measure the quality of the pitchers he faced in every plate appearance.

That is not a flaw. It is a design choice.

The trouble begins when we ask wRC+ to do more than it was built to do.

The same is true of expected statistics. xwOBA can tell us something about a hitter’s contact quality and plate appearances. It can suggest whether the underlying process supports his results. But xwOBA is not the same as actual value. A lineout with a high expected value may tell us something important about skill, but it did not move the runners. It did not change the scoreboard.

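The expected value and the actual value are both real, but they are real in different ways: one describes the quality of the process, the other describes what actually happened on the field.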

This is where offensive analysis becomes much more interesting.

 

Production, process, and context

 

Consider three hitters.

The first hitter has a high wRC+, a high xwOBA, strong exit velocity, a reasonable BABIP, and a stable strikeout-to-walk profile. There is not much mystery here. The production and the process agree. His production most likely matches his ability.

The second hitter has a high wRC+ but a modest xwOBA. His BABIP is unusually high. His barrel rate is ordinary. His hard-hit rate is fine but not exceptional. The results are good, but the foundation is less convincing. He may still be a good and accomplished hitter, but the numbers are not speaking with one voice.

The third hitter has a poor batting average and mediocre production, but his xwOBA is strong. He hits the ball hard. His launch angle is improving. His walk rate is stable. His BABIP is low. This is the kind of player who may be better than his surface line suggests.

Analyzing the first hitter is straightforward; the real investigation begins with the second and third hitters. They are not noteworthy because one number tells us the answer. They are interesting because several numbers are arguing with each other.

That disagreement deserves to be measured.

 

Offensive Friction

 

I am calling this idea Offensive Friction (OFx).

Offensive Friction is not meant to replace wRC+, wOBA, OPS+, xwOBA, BABIP, or Statcast indicators. It is meant to sit beside them and mediate disputes.

Its purpose would be simple: Identify hitters whose offensive indicators disagree.

A low-friction hitter is easy to interpret. His production, expected production, contact quality, plate discipline, and luck indicators all point in roughly the same direction.

A high-friction hitter is harder to interpret. His numbers contain tension. One part of the profile says breakout. Another says regression. One part says unlucky. Another says limited. One part says star. Another says mirage. That tension is the signal.

In conceptual terms:

Offensive Friction = the variance among a hitter’s standardized offensive indicators

The inputs could include:

wRC+

xwOBA

BABIP

Barrel rate

Hard-hit rate

Average exit velocity

Launch angle

Walk rate

Strikeout rate

Chase rate

Context value

Each metric would be converted into a standardized score. Then we would measure how widely those scores spread apart.

A hitter whose scores cluster together would have low Offensive Friction.

A hitter whose scores scatter across the map would have high Offensive Friction.
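Here is a minimal sketch of that calculation using only a few of the inputs. The hitters and their values are hypothetical, and two choices are my own assumptions rather than settled parts of the idea: z-scores are computed against the group in the sample, and any metric where lower is better (strikeout rate here) is sign-flipped so that positive always means favorable.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical raw indicators; illustrative values, not real player data.
raw = {
    "Steady":  {"wRC+": 130, "xwOBA": 0.370, "BABIP": 0.310, "K%": 18.0},
    "Mirage":  {"wRC+": 140, "xwOBA": 0.310, "BABIP": 0.390, "K%": 27.0},
    "Unlucky": {"wRC+": 85,  "xwOBA": 0.365, "BABIP": 0.240, "K%": 19.0},
    "Average": {"wRC+": 100, "xwOBA": 0.320, "BABIP": 0.295, "K%": 22.0},
}

# Assumption: flip the sign where a lower raw value is the better outcome.
LOWER_IS_BETTER = {"K%"}

def standardize(raw_stats):
    """Convert each metric to a z-score across the hitters in the sample."""
    metrics = list(next(iter(raw_stats.values())))
    z = {name: {} for name in raw_stats}
    for metric in metrics:
        values = [raw_stats[name][metric] for name in raw_stats]
        mu, sigma = mean(values), pstdev(values)
        for name in raw_stats:
            score = (raw_stats[name][metric] - mu) / sigma
            z[name][metric] = -score if metric in LOWER_IS_BETTER else score
    return z

def offensive_friction(z_scores):
    """Population variance of one hitter's standardized indicators."""
    return pvariance(list(z_scores.values()))

z = standardize(raw)
friction = {name: offensive_friction(scores) for name, scores in z.items()}

# Highest friction first: these are the profiles that deserve a closer look.
for name, value in sorted(friction.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<8} {value:.2f}")
```

In this toy sample the two hitters whose results and process point in opposite directions come out with the most friction, while the uniformly good and uniformly ordinary hitters come out with the least.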

This would not tell us who is having the better season, but it would tell us who deserves a closer look.

Once the indicators are standardized, we can ask a different question: not who has the best offensive production, but whose profile contains the most tension.

Figure 2. Data for 2026 through the end of April.

Figure 2 introduces the idea of Offensive Friction in its simplest form by ranking hitters according to the degree of disagreement across their standardized offensive indicators. Rather than asking who has been most productive, the figure asks whose statistical profile is the most internally unstable. Players at the top of the chart, such as Cedric Mullins, exhibit the widest spread across metrics, with some indicators suggesting strength and others pointing in a different direction. Others near the top, including Luis Arraez and Oneil Cruz, show similar patterns of tension. By contrast, players further down the list have profiles in which the underlying numbers cluster more tightly together, indicating a more coherent and interpretable performance. The purpose of the figure is not to evaluate quality, but to identify where the numbers themselves are in disagreement, highlighting the players who warrant closer inspection.

  

Why disagreement matters

 

This is the part that is perhaps most interesting.

Baseball analysis usually treats disagreement as a problem to be solved. One metric says this. Another metric says that. We want to know which one is right, or at least most useful.

But maybe the disagreement itself is what we should be after.

A hitter with a 150 wRC+ and a 150 xwOBA+ is excellent, but not analytically mysterious. His results and process agree.

A hitter with a 150 wRC+ and a 100 xwOBA+ is different. His season may be productive, but the underlying indicators suggest caution. Maybe he has been fortunate. Maybe he has exploited a particular defensive pattern. Maybe he has hit a few poorly struck balls at perfect times. Maybe the expected model is missing something.

Either way, the disagreement is worth studying.

The reverse is also true. A hitter with an 85 wRC+ and a 125 xwOBA+ may be a rebound candidate. His results are poor, but the contact quality suggests something better. That does not mean improvement is guaranteed. It means the surface line may not be telling the full story.

This is where Offensive Friction could be useful. It would act as an alert system.

High friction would say: Do not stop at the leaderboard. Something interesting is happening here.

 

The equilibrium idea

 

There is another way to think about this.

Baseball performance is often moving toward equilibrium.

A hitter’s batting average may run hot for a few weeks. His BABIP may drift above his career norm. His home run rate may spike. His strikeout rate may briefly collapse. Early in a season, small samples can make ordinary players look transformed and struggling players look finished.

But over time, many numbers begin to settle.

Not always. Players do change. Swing paths change. Plate discipline changes. Strength changes. Health changes. Aging changes everything.

Still, the concept of equilibrium matters.

A hitter is close to offensive equilibrium when his production matches his process. His wOBA is close to his xwOBA. His BABIP is not wildly out of line with his batted-ball profile. His strikeout and walk rates fit his established skill set. His power output is supported by his contact quality.

A hitter is out of equilibrium when those pieces do not line up.

That disequilibrium can mean several things.

It can mean luck.

It can mean injury.

It can mean a real skills change.

It can mean a player is being misread by traditional statistics.

It can mean the model is missing something.

This is why the disagreement matters. It is not just noise. It is a clue.

 

A possible classification system

 

Offensive Friction could help classify hitters into types.

Type | Profile | Interpretation
I | High production, high process, low friction | The numbers agree
II | High production, weak process, high friction | Results may be ahead of skill
III | Low production, strong process, high friction | Better than the surface line
IV | Ordinary overall profile, high leverage value | Value concentrated in key moments
V | Average production, average process, low friction | Little mystery
VI | Strong changes across some indicators, conflict across others | Real change or temporary spike

This kind of framework would be more useful than another leaderboard.

It would not simply tell us who ranks first. It would tell us what kind of interpretive problem each hitter presents.

That matters because a baseball season is not just a sorting exercise. It is a diagnostic exercise.

We are not only asking who has performed well. We are asking what that performance means.

Friction tells us that the numbers disagree. The Equilibrium Gap tells us the direction of that disagreement.
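To make the classification concrete, here is a rough sketch of how a few of the types might be assigned. Everything in it is an assumption on my part: the cutoffs are hypothetical, the signed gap (standardized production minus standardized process) is only a stand-in for the Equilibrium Gap, and Types IV and VI are left out because they need leverage and trend data.

```python
def classify(production_z, process_z, friction, level_cut=0.5, friction_cut=1.0):
    """Return (type label, signed gap) for one hitter.

    Inputs are standardized scores plus a friction value. All thresholds
    are hypothetical and would need tuning against real data.
    """
    gap = production_z - process_z  # assumed stand-in for the Equilibrium Gap
    high_friction = friction >= friction_cut
    high_prod = production_z >= level_cut
    low_prod = production_z <= -level_cut
    strong_proc = process_z >= level_cut
    middling = abs(production_z) < level_cut and abs(process_z) < level_cut

    if high_prod and strong_proc and not high_friction:
        label = "I"    # the numbers agree
    elif high_prod and not strong_proc and high_friction:
        label = "II"   # results may be ahead of skill
    elif low_prod and strong_proc and high_friction:
        label = "III"  # better than the surface line
    elif middling and not high_friction:
        label = "V"    # little mystery
    else:
        label = "unclassified"
    return label, gap

examples = {
    "agrees":  classify(1.2, 1.1, 0.2),
    "ahead":   classify(1.3, -0.4, 1.8),
    "behind":  classify(-1.0, 0.9, 1.5),
    "neutral": classify(0.1, -0.2, 0.3),
}
for name, (label, gap) in examples.items():
    print(f"{name:<8} type {label:<13} gap {gap:+.2f}")
```

A positive gap says results are running ahead of process; a negative gap says the reverse.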

 

Figure 3. Data for 2026 through the end of April.

Figure 3 places Offensive Friction alongside overall production, allowing us to see not just how well a hitter has performed, but how stable or interpretable that performance is. The horizontal axis measures the degree of disagreement among a player’s underlying indicators, while the vertical axis reflects his overall offensive output. The quadrant structure provides a simple framework: hitters in the upper left combine strong production with internal consistency, while those in the upper right are producing at a high level but with profiles that contain tension, making them less certain going forward. The lower right quadrant is especially interesting, as it captures players with weak results but high friction, suggesting that their underlying indicators may point to something better than the surface line. Cedric Mullins, for instance, falls into this region, pairing low production with a highly unstable profile. Meanwhile, players like Luis Arraez and Oneil Cruz occupy the high-friction, higher-production space, where strong results coexist with less agreement beneath the surface. The figure does not resolve which interpretation is correct, but it identifies where the most interesting analytical questions reside.

 

The philosophical problem

 

Every baseball metric contains a philosophy.

OPS values simplicity.

wOBA values proper event weighting.

wRC+ values context-neutral offensive production.

xwOBA values underlying process.

WPA values game situation and timing.

BABIP points us toward luck, contact profile, and defensive interaction.

None of these numbers is the whole truth. Each one chooses a version of its specific truth.

That is why one-number arguments can become misleading. A player can be more valuable than he is skilled. He can be more skilled than he has been productive. He can be productive in a way that is unlikely to continue. He can be unlucky without being good. He can be lucky and still be excellent.

The categories overlap, but they are not identical.

This is why I am proposing the idea of Offensive Friction. It does not pretend to solve all of this. It begins by admitting the complexity.

The goal is not to flatten the hitter into one final answer.

The goal is to identify where the narrative bends or even breaks.

 

What this would add

 

A metric like Offensive Friction would be especially useful early in the season.

In April and May, leaderboards are unstable. A few bloop hits can inflate a batting average. A few warning-track outs can suppress a slugging percentage. One series in a favorable ballpark can distort the picture. One bad week can make a good hitter look lost.

A friction model would help distinguish stable from unsettled performance.

It could identify:

  • players whose hot starts are supported by process,
  • players whose hot starts look fragile,
  • players whose poor results hide strong underlying skill,
  • players whose surface numbers and expected numbers are beginning to diverge,
  • players whose profiles have genuinely changed.

That is more interesting than simply ranking hitters. It gives us a way to ask better questions.

The same friction score can come from very different profiles. A radar view helps show why one high-friction player may be a mirage, while another may be a hidden riser.

 

 

Figure 4. Data for 2026 through the end of April.

Figure 4 shifts the focus from outcomes to structure. Each polygon represents a hitter’s standardized offensive profile across several underlying indicators, allowing us to see not only how good a player has been overall, but also how his components align or diverge. A more balanced, compact shape suggests agreement among metrics and a profile closer to equilibrium. A jagged or uneven shape reveals tension, where certain indicators pull in different directions. Cedric Mullins, for example, displays a visibly uneven profile, with strengths in some areas offset by weaknesses in others, a hallmark of high friction. Ketel Marte shows a more coherent structure, with metrics that move together more consistently. Jake Cronenworth sits between these extremes. The purpose of the figure is not to rank hitters, but to reveal the internal shape of their performance, highlighting where the underlying indicators agree and where they do not.

 

Conclusion

 

The future of offensive analysis (and of defensive and pitching analysis as well) may not be another statistic that claims to replace the old ones. It just might be a model that explains why the old ones disagree.

That is the larger lesson. Baseball offense (and defense) is not a single reality. It is actual production, expected production, contact quality, plate discipline, timing, opposition, luck, and sustainability. Each metric captures part of that structure. None captures all of it.

So maybe the most interesting hitters aren’t always the best. Maybe they are the hitters whose numbers have not yet settled into agreement.

That is where the analysis should begin. Because sometimes the story is not found in the statistics themselves. Sometimes the story is found in the friction between them.

 

 

The Shape of the American League (so far): A Three-Dimensional Look at Team Strength

Wins and losses tell us what has happened. Composite metrics can help explain why.

Using offensive, pitching, and defensive data through May 12, 2026, I combined multiple American League team metrics into a standardized z-score framework. Each category was normalized relative to league averages, allowing offensive production, run prevention, and fielding quality to be evaluated on the same scale.

Rather than relying on a single statistic, this approach attempts to measure organizational balance. Teams receive positive scores when they perform above league average and negative scores when they fall below it. For pitching categories such as ERA and WHIP, where lower values are better, the z-scores were sign-flipped so that stronger performance always produced a higher score.
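A minimal sketch of that framework looks something like this. The team names and numbers are hypothetical, the metric list is much shorter than the one I actually used, and the equal-weight sum across metrics is an assumption made for the sake of the example.

```python
from statistics import mean, pstdev

# Hypothetical team stats for illustration; not the actual 2026 figures.
teams = {
    "Team A": {"OPS": 0.790, "ERA": 3.20, "WHIP": 1.10, "Fielding%": 0.988},
    "Team B": {"OPS": 0.760, "ERA": 4.60, "WHIP": 1.38, "Fielding%": 0.984},
    "Team C": {"OPS": 0.700, "ERA": 3.70, "WHIP": 1.20, "Fielding%": 0.989},
    "Team D": {"OPS": 0.690, "ERA": 4.40, "WHIP": 1.35, "Fielding%": 0.981},
}

# Metrics where a lower raw value means better performance.
LOWER_IS_BETTER = {"ERA", "WHIP"}

def composite_scores(team_stats):
    """Sum each team's z-scores, flipping the sign where lower is better."""
    metrics = list(next(iter(team_stats.values())))
    totals = dict.fromkeys(team_stats, 0.0)
    for metric in metrics:
        values = [team_stats[team][metric] for team in team_stats]
        mu, sigma = mean(values), pstdev(values)
        sign = -1.0 if metric in LOWER_IS_BETTER else 1.0
        for team in team_stats:
            totals[team] += sign * (team_stats[team][metric] - mu) / sigma
    return totals

totals = composite_scores(teams)
best = max(totals, key=totals.get)

for team, score in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{team}: {score:+.2f}")
```

Because every metric is centered on the league mean, the composite scores sum to zero across teams: one club's surplus is always measured against everyone else's deficit.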

The result is less a standings table and more a multidimensional map of each team’s underlying quality.

Figure 1: Composite AL Team Strength Through May 12, 2026

Yeah, the Yankees stand alone. They are out there by a large margin.

New York’s profile is unusually complete. They combine the league’s strongest offensive output with elite pitching performance, producing separation that becomes obvious once the categories are standardized. The offensive metrics are overwhelming enough on their own, but pairing them with the AL’s best ERA and WHIP creates a profile that resembles something other than just a hot start.

Perhaps most importantly, the Yankees are not merely winning through one dominant dimension. Many early-season contenders are sustained by either explosive offense or temporary pitching overperformance. New York grades strongly in both simultaneously. That is a big deal.

The Astros occupy a fascinating second tier. Houston’s offense remains extremely dangerous, leading the league in batting average while ranking near the top in slugging and run production. Yet the pitching profile is significantly weaker than expected, especially relative to prior Astros teams. Their overall placement illustrates how overwhelming offensive production can partially compensate for poor run prevention, at least over a 40-game sample.

The Athletics may be the most analytically surprising team in the league so far. Their composite score benefits from a quietly balanced structure. They field exceptionally well, avoid major pitching collapse, and generate enough offense to remain consistently above average across categories. This is not a team built around dominance. It is a team built around the absence of glaring weakness.

Cleveland fits a similar pattern. The Guardians do not dominate the league in any single category, but they remain consistently competitive across all three phases of the game. Their strong fielding profile, solid strikeout numbers, and competent offense produce one of the most stable composite structures in the American League.

Seattle grades better analytically than its current record suggests. The Mariners continue to pair strong pitching with quality fielding, even though the offense remains uneven (at best). Their underlying structure implies a team that could improve substantially if the bats normalize.

Meanwhile, Tampa Bay presents one of the more interesting contradictions in the league. The Rays possess one of the best records in the AL, yet their composite z-score profile remains only modestly above average. This may indicate sequencing luck, strong leverage performance, or simply an ability to maximize close games. Interestingly, this has long been a recurring characteristic of Tampa Bay baseball.

At the bottom of the rankings sit Baltimore and Boston. Neither team displays a single catastrophic weakness. Instead, the issue is cumulative mediocrity. Once standardized, multiple slightly below-average categories compound into significantly negative total scores. The Orioles, in particular, have struggled to prevent runs while failing to separate offensively from the league middle.

This raises an important analytical point. Baseball teams are often discussed in singular terms: “great offense,” “elite rotation,” “bad defense.” But actual team quality emerges from interaction effects across systems. Strong fielding can amplify pitching. High-strikeout staffs reduce defensive volatility. Power-heavy offenses can partially absorb bullpen instability.

The z-score approach attempts to capture some of that interconnected structure. As you know, it is a favorite strategy of mine.

No model perfectly predicts future outcomes, especially in May. Small samples remain volatile. Injuries reshape rosters quickly. Regression arrives unevenly. Yet early-season standardization can still reveal organizational identity. Some teams already appear structurally coherent. Others appear fragile despite respectable records.

And at the moment, one conclusion appears difficult to avoid: The Yankees are not simply leading the American League; they are performing as a dominant team would.

 

The Geography of Attention

The internet, for good reason, creates the illusion of placelessness. We often speak about online spaces as though geography has somehow dissolved into pure abstraction (perhaps because it has). A post is published, indexed, shared, and consumed in a domain that appears detached from physical space (because it is). Yet audiences still necessarily cluster geographically. Attention still has borders and centers.

At the same time, even relatively small blogs (and unread blogs, like mine) can develop surprisingly international footprints. A post written in Ohio may quietly reach readers in Germany, Singapore, South Africa, Peru, or Hong Kong. The resulting distribution is happily uneven but not random.

Recently, I examined the geographic distribution of my own readership data. The results revealed a familiar but fascinating structure: a dominant national core followed by a remarkably long international tail. Nearly seventy percent of my readers come from the United States, but the remaining audience is dispersed across dozens of countries spanning six continents. Astonishing. I never would have believed I could have such reach.

Figure 1. Geographic distribution of readership by country.

Figure 1 presents the raw audience distribution. Unsurprisingly, the United States dominates the dataset with over 5,000 readers. The next tier includes Germany, China, the United Kingdom, Australia, and Canada. Beyond that lies a progressively descending series of countries contributing smaller but still meaningful readership totals.

At first glance, the international component appears minor. This impression, however, is partly a problem of scale. Extremely large values visually compress smaller values. Once one category becomes overwhelmingly dominant, the rest begin to resemble statistical background noise even when they contain important information.

This is a common problem in data analysis. Large systems frequently obscure their own internal structure. There are ways to deal with this.

To better examine the distribution, it is useful to transform the data logarithmically.

Figure 2. Geographic readership distribution displayed on a logarithmic scale.

The logarithmic transformation substantially changes the interpretation. Germany, China, the United Kingdom, Australia, and Canada now emerge as a distinct secondary layer rather than simply disappearing beneath the gravitational pull of the United States. The international audience becomes easier to conceptualize as a real structure rather than a residual category.
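The effect of the transformation is easy to see with a few numbers. In this small sketch the reader counts are hypothetical placeholders chosen only to have roughly the right shape, and base 10 is my assumption about the log scale.

```python
import math

# Hypothetical reader counts by country; only the rough shape matters here.
readers = {
    "United States": 5000,
    "Germany": 500,
    "Canada": 220,
    "Peru": 10,
    "Iceland": 1,
}

log_readers = {country: math.log10(count) for country, count in readers.items()}

for country, count in readers.items():
    print(f"{country:<15}{count:>6}{log_readers[country]:>7.2f}")

# On the raw scale the largest value is thousands of times the smallest;
# on the log10 scale the same two values sit only a few units apart.
linear_ratio = readers["United States"] / readers["Iceland"]
log_spread = log_readers["United States"] - log_readers["Iceland"]
print(f"linear ratio: {linear_ratio:.0f}, log10 spread: {log_spread:.2f}")
```

Each step of one unit on the log scale is a tenfold change in readers, which is why the second tier of countries stops vanishing next to the United States.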

Interestingly, this type of distribution appears repeatedly throughout complex systems. City populations, word frequencies, citation networks, website traffic, and social media engagement patterns often exhibit similar heavy-tailed behavior. A small number of nodes dominate the system while a long sequence of progressively smaller contributors extends outward indefinitely.

The resulting geometry is neither symmetrical nor random. It reflects the cumulative effects of language, search algorithms, network diffusion, cultural proximity, and simple historical contingency.

English-language content naturally concentrates within the United States. Yet once ideas begin moving internationally, the pathways become less predictable. Germany appearing second in the distribution may reflect academic interest, search indexing patterns, algorithmic recommendation behavior, or merely the accidental accumulation of links over time. The same is true for Singapore, South Africa, or Peru. Online diffusion contains both structure and randomness simultaneously.

The cumulative distribution makes this even clearer.

Figure 3. Cumulative readership concentration by ranked country.

Figure 3 demonstrates how quickly the audience accumulates. The United States alone accounts for nearly seventy percent of total readership. After that initial jump, however, accumulation slows dramatically. The remaining percentages require dozens of smaller national audiences contributing incrementally to the overall total.

This is the hallmark of a long-tail distribution.
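The cumulative curve itself takes only a few lines to build. The counts below are hypothetical, chosen so the largest country holds a little under seventy percent of the total; they are not my actual readership figures.

```python
from itertools import accumulate

# Hypothetical per-country reader counts (one entry per country).
counts = [5000, 500, 420, 340, 250, 220, 150, 120, 90, 60, 40, 25, 10, 5, 1, 1, 1]

ranked = sorted(counts, reverse=True)
total = sum(ranked)

# Share of all readers accounted for by the top 1, top 2, ... countries.
cumulative_share = [running / total for running in accumulate(ranked)]

def countries_needed(threshold):
    """Smallest number of top-ranked countries whose share reaches the threshold."""
    for rank, share in enumerate(cumulative_share, start=1):
        if share >= threshold:
            return rank
    return len(cumulative_share)

for threshold in (0.50, 0.90, 0.95, 0.99):
    print(f"{threshold:.0%} of readers: top {countries_needed(threshold)} countries")
```

The first country clears the halfway mark on its own, while each additional slice of the audience after that requires more and more countries. That slowing is the long tail.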

The phenomenon is philosophically relevant because it reveals the coexistence of centralization and dispersion within modern information systems. Attention is highly concentrated, yet ideas still scatter globally in surprisingly diffuse ways. A relatively small intellectual project can nonetheless establish faint statistical traces across an enormous geographic landscape.

Perhaps most important, the smallest numbers matter the most conceptually. One reader in Tanzania. One in Iceland. One in Yemen.

Individually insignificant from a statistical perspective, collectively they reveal something larger about the architecture of the modern internet. Ideas no longer move outward in neat geographic circles. They scatter unevenly, unpredictably, and sometimes almost at random.

The geography of attention is neither flat nor fully centralized. It is as fascinating as it is nonlinear.

Defensive Ecosystems Behind the Plate: How Good is Patrick Bailey?

I grew up in, and still live in, Northeast Ohio. I have long suffered from rooting for the Indians, now known as the Guardians. It hasn’t been a pleasant journey. The last time we won a World Series was 1948, and I do not see another victory on the horizon. So it goes.

I woke up early this morning after hearing we traded for Patrick Bailey, a two-time Gold Glove-winning catcher. I knew Bailey was good. I decided to see just how exceptional he is behind the plate. I used 2025 data for the following study.

Baseball analysis often reduces catchers to a handful of familiar metrics. Framing. Pop time. Arm strength. Caught stealing percentage. Blocking runs. Yet the position itself resists simple categorization. Some catchers suppress the running game through elite exchanges and quick releases. Others survive on receiving skills and pitch presentations. A few manage to combine multiple defensive strengths into unusually complete profiles.

This analysis attempts to move beyond simple rankings by examining the structure of catcher defense. Rather than asking merely who the best defensive catchers were, I wanted to explore a deeper question: Are there distinct defensive ecosystems among modern MLB catchers? Note that this post does not consider how a catcher handles a pitching staff.

To investigate this, I combined multiple publicly available defensive datasets covering:

  • blocking
  • throwing
  • framing
  • exchange time
  • pop time
  • arm strength
  • caught stealing metrics
  • related subcomponents

Every metric was standardized using z-scores to allow catchers to be compared on a common scale (something I often do). From there, the project unfolded in several stages:

  1. creation of composite defensive scores
  2. principal component analysis (PCA)
  3. hierarchical clustering
  4. dendrogram construction
  5. an “Unusualness Index” measuring statistical distance from the average catcher profile

The result was less a ranking exercise and more an exploration of what can be termed defensive geography. And yes, all this work was done because I was curious about our new catcher.

 

Building the Defensive Landscape

 

The first step was to construct an overall defensive score across three broad categories: blocking, throwing, and framing. Each category itself was built from multiple underlying z-scored metrics.

 

 

This approach allowed catchers to be evaluated across multiple dimensions simultaneously rather than through isolated statistics. But even this composite score quickly revealed an important limitation: Two catchers could arrive at nearly identical defensive totals through very different defensive pathways. That observation became the central motivation for the clustering analysis.
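A toy example shows how that can happen. The two catchers and their category z-scores are hypothetical, and averaging the three categories with equal weight is an assumption made for this sketch.

```python
import math
from statistics import mean

# Hypothetical category z-scores; illustrative only.
catchers = {
    "Catcher A": {"blocking": 0.5, "throwing": 1.0, "framing": 1.5},
    "Catcher B": {"blocking": 1.4, "throwing": -0.4, "framing": 2.0},
}

def composite(profile):
    """Equal-weight average of the category z-scores (an assumed weighting)."""
    return mean(list(profile.values()))

def profile_distance(p, q):
    """Euclidean distance between two catchers' category profiles."""
    return math.sqrt(sum((p[key] - q[key]) ** 2 for key in p))

score_a = composite(catchers["Catcher A"])
score_b = composite(catchers["Catcher B"])
distance = profile_distance(catchers["Catcher A"], catchers["Catcher B"])

print(f"Composite A: {score_a:.2f}, Composite B: {score_b:.2f}")
print(f"Distance between profiles: {distance:.2f}")
```

The composites match, yet the profiles sit far apart: one catcher is solid everywhere, the other trades a below-average arm for elite receiving. A single total hides that difference.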

 

PCA and Defensive Geography

 

Principal Component Analysis compresses high-dimensional data into a smaller number of interpretable axes. I don’t know about you, but I find it very difficult to think in anything more than two dimensions.

In this dataset:

  • PC1 explained approximately 36% of total variance
  • PC2 explained roughly 19%

Together, they created a two-dimensional map of modern catchers’ defensive strategies.
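For readers who want to see the mechanics, here is a minimal PCA sketch using NumPy's singular value decomposition. The input matrix is random filler standing in for the standardized catcher metrics, so it will not reproduce the 36% and 19% figures; the shape (30 catchers by 6 metrics) is also just an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: 30 hypothetical catchers x 6 metrics (random, illustration only).
X = rng.normal(size=(30, 6))

# Standardize each column to z-scores, which also centers the data.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via SVD: rows of Vt are the principal axes, and the squared
# singular values are proportional to the variance along each axis.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
explained = S**2 / np.sum(S**2)

# Coordinates of every catcher on the first two principal components.
coords = U[:, :2] * S[:2]

print(f"PC1 explains {explained[0]:.1%} of variance")
print(f"PC2 explains {explained[1]:.1%} of variance")
```

Plotting the two columns of `coords` against each other gives the kind of two-dimensional map described here.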

The PCA visualization immediately suggested that catchers naturally separate into different defensive archetypes rather than forming a single continuous population. The discussion does get nuanced.

Some clustered around framing skill. Others around throwing and athleticism. A handful appeared unusually isolated. Most notably, Patrick Bailey emerged not only as one of the strongest overall defenders in the dataset, but also as one of the most statistically unusual.

Figure 1: PCA Map of Defensive Catcher Profiles

The clusters in the PCA plot represent groups of catchers with similar overall defensive structures rather than similar rankings. A catcher can therefore occupy the same broad defensive tier as another player while still existing within an entirely different defensive ecosystem. This result somewhat surprised me.

 

Hierarchical Clustering and Catcher Archetypes

 

To explore those ecosystems further, I applied Ward hierarchical clustering to the standardized defensive profiles.

Unlike simple rankings, hierarchical clustering groups players according to the shape of their statistical profiles. I find this to be an interesting way to look at the data. For example, Patrick Bailey and Alejandro Kirk finished with very similar overall defensive scores. Yet the clustering analysis separated them because they appear to provide defensive value through different skill combinations.

Bailey profiles as a rare hybrid:

  • elite framing
  • strong throwing traits
  • positive blocking metrics

Kirk, meanwhile, appears more specialized toward:

  • framing
  • blocking
  • receiving skill

The dendrogram reveals these structural differences visually.
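The idea behind Ward clustering can be sketched in plain Python. This is a teaching version under stated assumptions, not the code I ran: the profiles are hypothetical (blocking, throwing, framing) z-scores, and in practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` does the same job and also produces the dendrogram.

```python
from itertools import combinations

# Hypothetical z-score profiles (blocking, throwing, framing); illustrative only.
# "Hybrid 1" and "Receiver 1" have the same total but different shapes.
profiles = {
    "Hybrid 1":   (0.6, 1.2, 1.8),
    "Hybrid 2":   (0.4, 1.0, 1.6),
    "Receiver 1": (1.6, -0.2, 2.2),
    "Receiver 2": (1.4, -0.4, 2.0),
    "Thrower 1":  (-0.3, 1.9, -0.4),
    "Thrower 2":  (-0.5, 1.7, -0.6),
}

def centroid(members):
    """Mean profile of a cluster, given as a tuple of catcher names."""
    points = [profiles[name] for name in members]
    return tuple(sum(column) / len(points) for column in zip(*points))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance

def ward_clusters(names, k):
    """Greedily merge the cheapest pair of clusters until k clusters remain."""
    clusters = [(name,) for name in names]
    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=lambda pair: ward_cost(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return clusters

clusters = ward_clusters(list(profiles), k=3)
for members in clusters:
    print(", ".join(members))
```

Because the merge cost depends on the shape of the profile rather than its total, two catchers with matching overall scores can still land in different groups.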

Figure 2: Dendrogram of Top Defensive Catchers

Several major catcher ecosystems emerged:

Cluster 1: Elite Defensive Hybrids

These catchers combined strong framing with excellent throwing skills. Representative players:

  • Patrick Bailey
  • Austin Hedges
  • Tyler Heineman

Cluster 2: Athletic Throwing Specialists

These catchers leaned heavily into:

  • arm strength
  • exchange speed
  • suppression of the running game

Representative players:

  • J.T. Realmuto
  • Endy Rodríguez

Cluster 3: Balanced Traditional Catchers

The statistical center of gravity for the position.

Competent across categories without extreme specialization.

Representative players:

  • Gabriel Moreno
  • Christian Vázquez

Cluster 4: Offense-First or Declining Defenders

Catchers whose defensive metrics trended negatively despite offensive value or prior reputations.

Representative players:

  • Salvador Perez
  • Yainer Diaz

Cluster 5: Extreme Outliers

In this case, Agustín Ramírez emerged as a statistically isolated profile unlike any other catcher in the dataset.

 

The Unusualness Index

 

One of the more interesting outputs of the project was the creation of an “Unusualness Index.” Conceptually, it measures how far a catcher’s defensive profile lies from the league-average catcher.

Mathematically:

Large values indicate:

  • rare defensive combinations
  • extreme strengths or weaknesses
  • hybrid skill profiles
  • statistical isolation
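One plausible way to compute an index like this, offered as an assumption and not as the exact formula used in the study, is to standardize each metric and take the Euclidean length of each catcher's z-score vector, so that the league-average profile sits at zero. The sketch below does that with made-up numbers; the labels A through E and the three columns (framing runs, pop time, blocks above average) are hypothetical.

```python
import math
import statistics


def unusualness(profiles):
    """Distance of each profile from the sample mean, in z-score units.

    Assumption: Euclidean distance in standardized space. The original
    index may use a different distance (for example, Mahalanobis).
    """
    names = list(profiles)
    columns = list(zip(*(profiles[n] for n in names)))
    means = [statistics.mean(c) for c in columns]
    sds = [statistics.pstdev(c) for c in columns]  # population std dev per metric
    scores = {}
    for n in names:
        z = [(x - m) / s for x, m, s in zip(profiles[n], means, sds)]
        scores[n] = math.sqrt(sum(v * v for v in z))
    return scores


# Hypothetical data: (framing runs, pop time in seconds, blocks above average).
profiles = {
    "A": [10.0, 1.90, 5.0],
    "B": [2.0, 1.95, 1.0],
    "C": [0.0, 2.00, 0.0],
    "D": [-3.0, 1.98, -2.0],
    "E": [-12.0, 2.10, -9.0],
}

scores = unusualness(profiles)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(name, round(score, 2))
```

In this toy sample the most unusual profile belongs to the weakest defender, which echoes the point that uniqueness and value are not the same thing.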

Interestingly, some of the most unusual catchers were not necessarily the best overall defenders. That distinction may be one of the most important findings in the study. Elite value and statistical uniqueness are related, but they are not identical concepts.

 

Figure 3. Most unusual catchers in 2025.

 

The Patrick Bailey Question

 

Perhaps the most fascinating result involved Patrick Bailey. Bailey ranked near the top of the defensive leaderboard while also appearing among the most unusual defensive profiles in the dataset. That combination is rare.

Most players become unusual because they possess one overwhelming specialization or weakness. Bailey appears unusual because he performs unusually well across multiple difficult defensive dimensions simultaneously.

The clustering analysis, therefore, suggests that Bailey is not merely “good.” He may represent a relatively uncommon defensive archetype altogether. The man is special.

Figure 4. Most accomplished defensive catchers in 2025.

 

Final Thoughts

 

Traditional baseball analysis often searches for single metrics capable of defining defensive quality. But catcher defense appears fundamentally multidimensional.

There is no single pathway to defensive value behind the plate.

Some catchers thrive through receiving.
Others through athleticism.
Others through balance.
A few through genuinely rare hybrid profiles.

The PCA map and dendrogram reveal something that simple rankings cannot: catcher defense is not a ladder. It is an ecosystem. And within that ecosystem, certain players appear to occupy unusually isolated terrain. I can’t wait to see Patrick Bailey in a Guardians uniform.

 

The Shape of a Decision: MLB Catcher Stances

Back in the 1970s, I had a small black-and-white TV in my bedroom. I would watch baseball games late into the night as I fell in and out of sleep. I was, of course, a Cleveland Indians fan, so those games were at the top of my list. If the Indians were off, and I switched to channel 35, and the antenna was just so, I could get Pittsburgh Pirates games. I watched a lot of their games as well.

One of my best memories of those Pirates games is watching the great Manny Sanguillén catch. He was unusual in that he would drop a knee while waiting on the pitch. I also recall him sticking the other leg way out to the side so that he could get lower in his stance. I don’t recall any other catchers dropping to a knee in that era. He was a great player, and he remains one of my all-time favorites.

As I watch games today, I am seeing all the catchers drop to a knee as the pitcher winds up. Certainly, this has to be worthy of a post, right? As it happens, I found some interesting stuff. The following two figures tell part of the story, and it is a story worth hearing.

The first figure is straightforward. The percentage of pitches received from a one-knee stance rises from 23% in 2020 to 96% in 2026. What begins as a minority behavior becomes, in short order, the default condition of the position. By the end of the period, the alternative has nearly disappeared.

The second figure complicates this.

Instead of levels, it shows change. Year-to-year percentage growth in one-knee usage spikes dramatically early, then declines just as quickly. The initial jump exceeds 100%. After that, the rate of increase falls, first sharply, then gradually, until it approaches zero.
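The arithmetic behind the second figure is simple enough to sketch. In the snippet below, only the 23% (2020) and 96% (2026) endpoints come from the data described above; the in-between values are placeholders I made up so the example runs, chosen so that the first jump exceeds 100%.

```python
def yoy_growth(series):
    """Year-over-year percentage growth for a {year: value} mapping."""
    years = sorted(series)
    return {
        curr: (series[curr] - series[prev]) / series[prev] * 100
        for prev, curr in zip(years, years[1:])
    }


# One-knee usage in percent. 2020 and 2026 are from the post;
# 2021 through 2025 are hypothetical placeholders.
usage = {2020: 23, 2021: 50, 2022: 70, 2023: 83, 2024: 90, 2025: 94, 2026: 96}

growth = yoy_growth(usage)
for year, pct in growth.items():
    print(year, f"{pct:.1f}%")
```

The level keeps climbing every year, yet the growth rate collapses, because each new gain is measured against an ever larger base and the ceiling of 100% is close.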

Taken together, these figures describe something more precise than simple adoption. They describe timing.

Interestingly, the decision appears to occur well before the endpoint. The first figure suggests a continuous rise through 2026. The second suggests that the meaningful shift happens earlier, closer to 2021–2023. After that, the system is no longer deciding. Decisions have been firmed up, and consolidation has taken place.

Perhaps most importantly, this pattern aligns with a familiar structure. Early adopters move aggressively, often extracting outsized value. My bet is that this is due to pitch framing. The rest of the system follows, not because the marginal gains remain large, but because the uncertainty has been resolved. Once that threshold is crossed, the behavior spreads regardless of diminishing returns.

This raises a natural question. If the rate of change collapses while the level continues to rise, what is driving the final stages of adoption?

The answer is likely institutional rather than individual. At some point, the technique ceases to be optional because the data says that it is the right thing to do. Consequently, it becomes embedded in instruction, in development, and (most importantly) in expectation. Young catchers entering the league are not choosing the one-knee stance. They are inheriting it.

This is where the first figure can mislead if taken alone. A rising line suggests ongoing discovery. The second figure suggests the opposite. Discovery is front-loaded. What follows is replication.

There is also a subtle implication for evaluation. If most of the informational gain occurs early, then later adopters are operating in a different environment. They are not testing a hypothesis. They are implementing a standard. Any performance differences observed in the later years must therefore be interpreted within a system that has already converged.

That convergence is the quiet endpoint of the process. By 2025 and 2026, the rate of change is minimal. Not because the idea has failed, but because it is clearly the right thing to do. The system has reached equilibrium.

And so, the two figures resolve into a single observation. The transformation of catching technique did not take six years; it took two or three.

Now comes the surprising part. I wondered which knee should be dropped. I know all the catchers are right-handed, so handedness is not a consideration. Fortunately, I found raw data on this. Take a look at the following figure. I find it fascinating.

The figure reveals a subtle but meaningful asymmetry in how the one-knee stance has been adopted. Early in the period, the right-knee-up configuration is more common, but over time, the balance shifts decisively toward the left-knee-up orientation. By 2026, the split is no longer close, with the left-knee-up approach clearly dominant. Interestingly, this suggests that the evolution of the stance is not simply about going to one knee, but about settling into a preferred directional setup. The change appears gradual rather than abrupt, implying that once the broader adoption decision was made, the league continued to refine how the stance is executed rather than whether it should be used at all.

How about that? As of now, I have no idea why the preference shifted. The only thing I feel sure of is that it was data-driven. My best guess is that it has something to do with giving the umpire a certain perspective when pitches are being framed, that look being the one the catcher wants the ump to have.

I am open to ideas. If you like, let me know what you think in the comments. This post certainly is fodder for a spirited discussion.

 

 

More on AL First Basemen 4 23 26

Here we are, a day later and (hopefully) a little wiser. I have more to say about yesterday’s post on those pesky first basemen. I can now tell you exactly how lucky or unlucky all of them have been so far this season.

BABIP, batting average on balls in play, is a metric commonly used to gauge the role of luck in a player’s offensive output. Here is the simple equation:

BABIP = (H - HR) / (AB - K - HR + SF)

Where:

  • H = Hits
  • HR = Home Runs
  • AB = At-bats
  • K = Strikeouts
  • SF = Sacrifice flies

This equation shows what happens after contact is made and a ball is put in play. League-wide, and this has been true for a long time, players hover around .300. A BABIP of .300 is the de facto gravitational center for all players.
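If you want to check the arithmetic yourself, here is a minimal sketch of the standard BABIP calculation, (H - HR) / (AB - K - HR + SF). The stat line in the example is made up and does not belong to any player discussed in this post.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


# Hypothetical line: 30 hits, 5 home runs, 100 at-bats, 20 strikeouts, 1 sac fly.
example = babip(h=30, hr=5, ab=100, k=20, sf=1)
print(f"{example:.3f}")
```

Home runs come out of both the numerator and the denominator because no fielder has a chance at them, and strikeouts come out of the denominator because no ball was put in play.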

Take a look at this:

This is how we want to read the data. Guerrero (.378), Rice (.378), and Kurtz (.364) have been extremely lucky so far this season. I do not think they can maintain BABIPs that strong much longer. They will most likely regress toward the league mean of ~.300.

On the other hand, Naylor (.213) and Pasquantino (.169) have been very unlucky. Those line drives are being hit right at fielders, and the hard-hit ground balls are not finding any holes. Both men should see their offensive production increase as their BABIPs work their way toward .300.

What about our man, Kyle Manzardo? At .317, he has not been unlucky at all. If anything, he has likely benefited from slightly favorable outcomes on balls he has put in play. This means that balls are finding gaps at a slightly elevated rate, defensive positioning or variance is working in his favor, and there is no immediate signal of suppressed results due to bad luck. Once again, I was a bit surprised by this.

Manzardo’s BABIP implies that there should be downward pressure on his offensive production. Remember the graph from the last post? This does not bode well for a player whose output has been last in the league at his position. I am curious to see how this plays out.

 

AL First Basemen Thru 4 22 26

I got up early today and decided to take a look at what has been happening in the American League with all the first basemen. I was inspired after I heard that Kyle Manzardo was having a very unlucky season at the plate. It happens, and there are very good metrics out there to measure things like luck.

I want to take a glance at what is going on about 25 games into the 2026 season. The first figure shows which players have the most similar production. Notice that Manzardo’s offensive output is most closely related to that of his old teammate, Josh Naylor. Both are off to very slow starts.

The players on the right-hand side of the figure should come as no surprise. What might give you pause is the next figure. Instead of clustering players based on similar offensive numbers, I decided to analyze them only by the categories that help their teams win. In other words, I eliminated things like strikeouts and double plays that the player might have grounded into.

Our man Manzardo is all alone at the bottom of the list. I believe he is too good a player to remain there. The same is true for Naylor, and all the players with low rankings will probably increase their production as the weather warms.

If you study the chart, you will see that Ben Rice has been, by a wide margin, the most productive offensive first baseman in the American League so far this year. I was a bit surprised by this. I will keep an eye on things and report back throughout the season. Depending on the will of the Muses in control of The Boys of Summer, I might expand my horizons to every position in both leagues.

 

52!

I woke up this morning and decided I was going to do something never before seen in the history of the universe. I started to solve the Riemann Hypothesis, but then thought better of it. Turns out, it is far easier to shuffle a deck of playing cards.

How many different possible orderings are there? More than a couple. Here is the answer in scientific notation:

52! ≈ 8.0658 × 10^67

Here is the number written out:

 80,658,175,170,943,878,571,660,636,856,403,766,975,289,505,440,883,277,824,000,000,000,000

Let’s change that number to seconds. If you were to shuffle a standard deck of cards every second, starting at the Big Bang and continuing until today, you would not make a dent in the number. You would need this many years to exhaust all possible orderings:

Roughly 2.56 × 10^60

The age of the universe in years:

Roughly 1.38 × 10^10

So…yeah. Each time you properly shuffle a deck of playing cards, you are creating a sequence that most certainly has never been seen before and will never be seen again. 
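If you would like to verify the numbers, a few lines of Python will do it. This is a rough sketch that assumes one shuffle per second with no repeats, a 365.25-day year, and the commonly cited estimate of about 13.8 billion years for the age of the universe.

```python
import math

# Number of distinct orderings of a 52-card deck.
orderings = math.factorial(52)

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: Julian year
AGE_OF_UNIVERSE_YEARS = 13.8e9            # assumption: common estimate

# Years needed at one shuffle per second, never repeating an ordering.
years_needed = orderings / SECONDS_PER_YEAR

# How many universe lifetimes that is.
universe_lifetimes = years_needed / AGE_OF_UNIVERSE_YEARS

print(f"orderings:          {orderings:.4e}")
print(f"years needed:       {years_needed:.2e}")
print(f"universe lifetimes: {universe_lifetimes:.2e}")
```

The shuffling time comes out tens of orders of magnitude longer than the universe has existed, which is why a properly shuffled deck is almost surely new.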

Notice that in this instance, uniqueness emerges from permutation. There is nothing special about the sequences, no narrative that can give them meaning. It is simply about the arrangement of playing cards. You might find that astonishing.

The Delightful Louise Stonham

I was in the middle of writing another math post when I came across a “tok” on my phone (OK, a YouTube short). Mississippi State University has a young track and cross country runner named Louise Stonham who keeps randomly showing up on my feed. Today, I realized that we are kindred spirits.

OK, so well over four decades ago, I was a D1 athlete. I have a letter and a mug somewhere in my house to prove it. For the last several weeks, it has been apparent to me that Ms. Stonham is very proud of her status as a D1 athlete, as she should be. It is a big deal. Not everyone can claim such an honor.

I always smile when she shows up on my phone, dancing and bouncing around like a promising young person with her entire life in front of her. What a joy she is. Today I saw something a little different.

Her latest post shows her running on the track. I am guessing it is a 10k race. The caption reads “When the hardest part of running isn’t the running itself… It’s battling the voice in your head.”

Truer words have never been spoken. I never have trouble getting out to run. The problem always comes when I am out there and the voice, that substantial and inevitable voice, tells me to slow down or cut the workout short. It happens all the time to me now, just as it has for decades.

I usually run 5 miles a day. Today, I gave in to the voice and stopped after 4 miles. “Stop…slow down…you have been running too hard…you are old…you don’t want to get hurt…you should have taken a day off…” You get it. This happens to me almost every day.

I just want to thank Ms. Stonham for letting me know I am not alone in my battle with myself. This is the first time I have ever heard another runner talk about this issue, a problem that has plagued me throughout my life.

Well, young lady, I am rooting for you. As I sit here in mythical Iriquois County, Ohio, I only wish you fulfillment and happiness. Do your best to never lose your joy for running or your passion for life. You are an inspiration. And, good grief, whatever happens, do not stop posting!