BASEBALL – Random Thoughts from a Nonlinear Mind

Catchers & Offense & Stats… Oh My

Catcher is the hardest offensive position to evaluate cleanly.

The physical burden is different. The playing-time patterns are different. The defensive responsibilities are different. Catchers do not simply stand in the field and wait for the next ball in play. They handle pitchers, absorb foul tips, control the running game, frame the strike zone, call pitches, manage fatigue, and carry a level of defensive responsibility that no other position quite matches.

That makes an offense-only catcher study both useful and limited. Useful because it lets us isolate hitting. Limited because it does not measure the full value of the position.

So the question here is narrow: Who was the most dominant offensive catcher of his time?

Not the greatest catcher overall. Not the best defensive catcher. Not the best all-around catcher.

The best offensive catcher.

Methodology

Using the Lahman Database, I identified catcher seasons through Appearances.csv. A player-season qualified if the player had:

At least 50 games at catcher

At least 300 plate appearances

For each qualified catcher-season, I calculated six offensive measures:

OBP

SLG

HR per PA

BB per PA

Runs per PA

RBI per PA
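As a minimal sketch of how these six measures can be computed from one season's counting stats: the field names below are illustrative, and the plate-appearance definition (AB + BB + HBP + SF + SH) is an assumption, since the Lahman batting table does not carry a PA column and the post does not say which definition was used. The sample line is made up.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Counting stats for one player-season (illustrative field names)."""
    ab: int
    h: int
    doubles: int
    triples: int
    hr: int
    r: int
    rbi: int
    bb: int
    hbp: int = 0
    sf: int = 0
    sh: int = 0


def plate_appearances(b: BattingLine) -> int:
    # Assumed definition: AB + BB + HBP + SF + SH.
    return b.ab + b.bb + b.hbp + b.sf + b.sh


def six_measures(b: BattingLine) -> dict:
    pa = plate_appearances(b)
    singles = b.h - b.doubles - b.triples - b.hr
    total_bases = singles + 2 * b.doubles + 3 * b.triples + 4 * b.hr
    return {
        # Standard OBP: times on base over AB + BB + HBP + SF.
        "OBP": (b.h + b.bb + b.hbp) / (b.ab + b.bb + b.hbp + b.sf),
        "SLG": total_bases / b.ab,
        "HR/PA": b.hr / pa,
        "BB/PA": b.bb / pa,
        "R/PA": b.r / pa,
        "RBI/PA": b.rbi / pa,
    }


# Made-up season line, for illustration only.
line = BattingLine(ab=500, h=150, doubles=30, triples=2, hr=25,
                   r=80, rbi=90, bb=60, hbp=5, sf=5)
m = six_measures(line)
print(plate_appearances(line), round(m["OBP"], 3), round(m["SLG"], 3))
```

Dividing home runs, walks, runs, and RBI by plate appearances puts part-time and full-time catchers on the same footing before any peer comparison is made.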

Each category was converted into a z-score within that season’s catcher peer group. The season score was the sum of those six z-scores.

Season Score =

OBP z + SLG z + HR/PA z + BB/PA z + R/PA z + RBI/PA z

Partial seasons were weighted by playing time, with full credit beginning at 600 plate appearances.
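The scoring step can be sketched as follows. Two details are assumptions rather than something the post states: the z-scores here use the population standard deviation, and the playing-time weight is a linear ramp, PA / 600, capped at 1.0. The three peers and their numbers are hypothetical.

```python
from statistics import mean, pstdev

CATEGORIES = ["OBP", "SLG", "HR/PA", "BB/PA", "R/PA", "RBI/PA"]


def playing_time_weight(pa, full_credit_pa=600):
    # Assumed form of the weighting: linear in PA, full credit at 600.
    return min(pa / full_credit_pa, 1.0)


def season_scores(peers):
    """Score every player in one season's positional peer group.

    peers: list of dicts with 'player', 'PA', and the six categories.
    Returns {player: weighted sum of six z-scores}.
    """
    stats = {}
    for cat in CATEGORIES:
        values = [p[cat] for p in peers]
        stats[cat] = (mean(values), pstdev(values))

    scores = {}
    for p in peers:
        total = 0.0
        for cat in CATEGORIES:
            mu, sigma = stats[cat]
            # A category with no spread contributes nothing.
            total += (p[cat] - mu) / sigma if sigma > 0 else 0.0
        scores[p["player"]] = total * playing_time_weight(p["PA"])
    return scores


# Hypothetical peer group for a single season.
peers = [
    {"player": "A", "PA": 600, "OBP": 0.400, "SLG": 0.550,
     "HR/PA": 0.06, "BB/PA": 0.12, "R/PA": 0.16, "RBI/PA": 0.17},
    {"player": "B", "PA": 650, "OBP": 0.320, "SLG": 0.400,
     "HR/PA": 0.03, "BB/PA": 0.08, "R/PA": 0.11, "RBI/PA": 0.12},
    {"player": "C", "PA": 620, "OBP": 0.300, "SLG": 0.370,
     "HR/PA": 0.02, "BB/PA": 0.07, "R/PA": 0.10, "RBI/PA": 0.10},
]
scores = season_scores(peers)
print({name: round(s, 2) for name, s in scores.items()})
```

Because every z-score is measured against the same season's peer mean, unweighted scores within a season sum to zero. A positive score means above the positional norm and a negative score means below it.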

The purpose is to measure distance from the catcher norm. A catcher in 1932 is compared to other catchers in 1932. A catcher in 1997 is compared to other catchers in 1997. A catcher in 2025 is compared to other catchers in 2025.

The model asks a simple question: How far above ordinary catcher offense did this player stand?

Figure 1: Career Offensive Dominance

The career ranking gives us a clear, yet not overwhelming, winner.

Mike Piazza finishes first with a career peer-adjusted offensive score of 102.9. Johnny Bench is second at 96.8. Jorge Posada is third at 78.4, followed by Mickey Cochrane, Yogi Berra, Carlton Fisk, Bill Dickey, Gabby Hartnett, Mickey Tettleton, and Ted Simmons.

The top two are not surprising. Piazza has long been treated as perhaps the greatest hitting catcher ever. Bench, because of his power and overall stature, is the natural counterargument.

What is interesting is the size of the gap. Piazza wins, but he does not run away from Bench. This is not Ruth in right field. This is not Schmidt at third base. This is a real contest.

Piazza’s advantage comes from sustained offensive separation. He was consistently an exceptional hitter for a catcher. Bench was not far behind, and his all-around historical reputation remains larger because defense is outside this model.

The first major conclusion is therefore careful: Piazza wins the offense-only career argument. Bench remains the broader catcher argument.

Figure 2: Best Seven-Season Peaks

The seven-season peak ranking strengthens Piazza’s case.

Piazza leads with a peak-seven score of 74.4. Bench follows at 67.1. Mickey Cochrane is third at 59.8, Jorge Posada fourth at 58.9, and Mickey Tettleton fifth at 55.2.

This figure matters because a career ranking can sometimes reward longevity more than dominance. Piazza does not merely win by accumulation. He also has the strongest seven-season offensive peak among catchers in the study.
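The post does not say whether the seven seasons must be consecutive, so this sketch shows both readings: the seven highest season scores from any point in a career, and the best window of seven consecutive calendar years. The career below is hypothetical.

```python
def best_n_seasons(scores_by_year, n=7):
    """Sum of the n highest season scores, regardless of year."""
    return sum(sorted(scores_by_year.values(), reverse=True)[:n])


def best_n_consecutive(scores_by_year, n=7):
    """Best total over any n consecutive calendar years.

    Years without a qualified season count as 0.0.
    """
    if not scores_by_year:
        return 0.0
    years = range(min(scores_by_year), max(scores_by_year) + 1)
    series = [scores_by_year.get(y, 0.0) for y in years]
    # Pad with zeros so a window may extend past either end of the career.
    padded = [0.0] * (n - 1) + series + [0.0] * (n - 1)
    return max(sum(padded[i:i + n]) for i in range(len(padded) - n + 1))


# Hypothetical career: season scores keyed by year (no qualified 2009).
career = {2001: 4.0, 2002: 9.5, 2003: 11.0, 2004: 8.0, 2005: 10.5,
          2006: 7.5, 2007: 9.0, 2008: 3.0, 2010: 12.0}

print(best_n_seasons(career))      # 67.5
print(best_n_consecutive(career))  # 59.5
```

The two readings can differ noticeably for a catcher whose best years are split by an injury or a position change, which is worth keeping in mind when comparing peak scores.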

Bench is again close. Cochrane and Posada both grade very well. Tettleton is an especially interesting name because his career as a catcher was shorter, but his peak offensive performance was unusually strong.

The top of the peak chart says something important about Piazza:

He was not just consistent. He was repeatedly elite.

Figure 3: Career Value Versus Peak Dominance

The scatterplot shows the catcher field clearly.

Piazza sits in the upper-right corner. Bench is close, but slightly behind on both career and peak. Cochrane and Posada form the next major group. Yogi Berra, Carlton Fisk, Bill Dickey, Gary Carter, Mickey Tettleton, Roy Campanella, Gene Tenace, Ted Simmons, and Gabby Hartnett fill out the high-value region.

This figure is useful because it visualizes the structure of the debate. Piazza and Bench separate from the field, but not by the same kind of distance we saw with Ruth among right fielders. Catcher offense is more compressed. The demands of the position make sustained offensive dominance harder to maintain.

The chart also shows why Posada deserves attention. He ranks surprisingly high in this offense-only framework. That does not mean he was a greater catcher than Berra, Fisk, Carter, or Dickey overall. It means his bat, measured against catcher peers, was more valuable than his usual historical reputation suggests.

This is one of the strengths of the method. It can recover players whose offensive value may be partly hidden by broader reputation.

Figure 4: Best Individual Offensive Seasons

The single-season leaderboard produces the most surprising result of the study.

The top season was Cal Raleigh’s in 2025, with a score of 13.5. Mickey Tettleton’s 1991 season is second at 13.2, followed by Darren Daulton in 1992, Mike Piazza in 1997, Joe Torre in 1966, Johnny Bench in 1972, Roy Campanella in 1953, Piazza in 2001, Mickey Cochrane in 1932, and Joe Mauer in 2009.

Raleigh’s 2025 season stands out because of the combination of power and catcher context. A 60-home-run catcher season is not merely impressive in raw terms. It is almost structurally disruptive. The model captures that disruption by comparing Raleigh to other catchers in the same season.

But one season does not create the career case. Piazza and Bench remain ahead historically because they repeated high-value catcher offense across many seasons. Raleigh’s result is a peak result. It deserves attention, but it is not the same as a career argument.

This figure gives the post its modern hook:

Piazza wins the career study, but Raleigh owns the single most dominant catcher season in the model.

Figure 5: Piazza Versus the Best Non-Piazza Catcher

Figure 5 compares Piazza to the best non-Piazza catcher in each season of his qualified catcher career.
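A comparison of this kind can be sketched as below, assuming season scores are stored as a year-to-player mapping. The data layout, function name, and placeholder players are all illustrative, not taken from the post.

```python
def versus_best_other(season_scores, player):
    """Compare one player to the best other player in each qualified year.

    season_scores: {year: {player_name: season_score}}
    Returns rows of (year, player_score, rival, rival_score, gap).
    """
    rows = []
    for year in sorted(season_scores):
        peers = season_scores[year]
        if player not in peers:
            continue  # the player did not qualify that season
        others = {name: s for name, s in peers.items() if name != player}
        if not others:
            continue
        rival = max(others, key=others.get)
        gap = peers[player] - others[rival]
        rows.append((year, peers[player], rival, others[rival], gap))
    return rows


# Hypothetical season scores for three placeholder players.
data = {
    1996: {"A": 9.0, "B": 6.5, "C": -2.0},
    1997: {"A": 12.0, "B": 4.0},
    1998: {"B": 5.0, "C": 1.0},
    1999: {"A": 3.0, "C": 7.0},
}
for row in versus_best_other(data, "A"):
    print(row)
```

A positive gap means the player led the position that year. A negative gap means someone else did, which is the situation in the seasons where Piazza was near the top but not first.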

The pattern is revealing. Piazza was not always the top offensive catcher in a given season, but he spent much of his prime at or near the top of the positional leaderboard. His strongest years, especially in the late 1990s and early 2000s, show repeated separation from the catcher norm.

The later years show a decline, as expected. Catcher aging is difficult. The position extracts a cost. What matters is the prime period. Piazza’s best seasons were not isolated. They were part of a sustained offensive identity.

This is where the career score becomes meaningful. It is not simply adding numbers. It is adding repeated seasons of distance from the average catcher.

Piazza kept creating that distance.

Figure 6: Balanced Offensive Greatness

The balanced score combines career value with seven-season peak value. The reported totals equal the career score plus the peak-seven score.

Piazza again finishes first, with a balanced score of 177.3. Bench is second at 163.9. Posada is third at 137.3, followed by Cochrane, Fisk, Berra, Dickey, Tettleton, Gary Carter, and Ted Simmons.
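The figures reported in Figures 1, 2, and 6 are consistent with the balanced score being a simple sum of the career score and the peak-seven score. Treating that as a working assumption, a quick arithmetic check on the top three:

```python
# (career score, peak-seven score, reported balanced score), from the text.
reported = {
    "Piazza": (102.9, 74.4, 177.3),
    "Bench": (96.8, 67.1, 163.9),
    "Posada": (78.4, 58.9, 137.3),
}


def matches_sum(career, peak7, balanced, tol=0.05):
    """True if career + peak7 equals the reported balanced score within rounding."""
    return abs((career + peak7) - balanced) <= tol


for name, (career, peak7, balanced) in reported.items():
    print(name, round(career + peak7, 1), balanced,
          matches_sum(career, peak7, balanced))
```

Under this reading a player's peak seasons are effectively counted twice, once in the career total and once in the peak total, which is why concentrated peaks such as Tettleton's move up in the balanced ranking.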

This may be the cleanest single-number summary of the offense-only catcher question. It rewards both longevity and dominance. Piazza wins both categories.

Bench remains close enough that the all-around debate is still alive. In fact, if defense were added, Bench would probably become much harder to beat. But offense alone gives Piazza the edge.

The balanced ranking also raises a useful historical point. Posada and Tettleton look better in this framework than many traditional catcher rankings might suggest. Their bats separated from the catcher baseline in ways that matter.

That is the value of positional peer adjustment. It does not simply repeat conventional memory.

Figure 7: Offensive Component Profile

The component profile shows how the top catchers built their value.

Piazza’s profile is built around slugging, home-run rate, RBI production, and strong OBP. That fits the historical picture. He was not merely a catcher who hit well. He was a middle-of-the-order hitter who happened to play catcher.

Bench’s profile is more power-and-run-production driven. His HR/PA and RBI/PA components are especially strong, while his OBP component is lower than Piazza’s. That distinction matters. Bench was a great offensive catcher, but Piazza’s profile is more complete in this offense-only model.

Posada’s profile is different. His walk rate component is outstanding, and his OBP helps drive his high ranking. Cochrane also shows strong OBP and run-scoring value. Berra and Fisk lean more toward slugging and run production. Dickey and Hartnett fit the earlier power-catching tradition.

This figure makes one of the key points of the study clear:

Catcher offense has more than one shape.

Piazza is the best overall offensive shape. Bench is the power-catcher archetype. Posada is the patience-and-OBP surprise. Cochrane is the high-OBP historical great. Tettleton is the concentrated modern peak.

Figure 8: Dendrogram of Top Offensive Catchers

The dendrogram clusters the top 15 catchers by offensive shape rather than by total score.
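To illustrate what clustering by shape rather than total can mean, here is a small agglomerative clustering sketch. The post does not state the distance measure or linkage rule behind the figure, so the choices here (unit-length scaling of each component vector, Euclidean distance, average linkage) are assumptions, and the four profiles are hypothetical.

```python
import math


def unit_shape(profile):
    """Scale a component vector to unit length so only its shape remains."""
    norm = math.sqrt(sum(v * v for v in profile))
    return [v / norm for v in profile]


def average_linkage(c1, c2, shapes):
    """Mean pairwise distance between members of two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(math.dist(shapes[i], shapes[j]) for i, j in pairs) / len(pairs)


def agglomerate(profiles):
    """Merge the closest clusters repeatedly; return the merge history."""
    shapes = {name: unit_shape(vec) for name, vec in profiles.items()}
    clusters = [(name,) for name in shapes]
    merges = []
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_linkage(clusters[x], clusters[y], shapes)
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((clusters[x], clusters[y], d))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return merges


# Hypothetical component totals: OBP, SLG, HR/PA, BB/PA, R/PA, RBI/PA.
profiles = {
    "Slugger 1": [10, 25, 30, 5, 10, 25],
    "Slugger 2": [8, 20, 24, 4, 8, 20],   # same shape, 80% of the total
    "Walker 1": [25, 8, 5, 30, 15, 6],
    "Walker 2": [20, 7, 4, 26, 12, 5],
}
for left, right, d in agglomerate(profiles):
    print(left, "+", right, round(d, 3))
```

The two sluggers merge first at a distance of essentially zero even though their totals differ by 20 percent. That is how a pair like Piazza and Bench can share a branch without having equal scores.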

Piazza and Bench cluster together, which makes sense. Both were elite power catchers, though Piazza’s OBP advantage gives him a slightly different profile. Fisk and Berra also cluster near that general power-production family.

Another branch groups players like Roy Campanella, Gary Carter, Brian McCann, Ted Simmons, Gabby Hartnett, Bill Dickey, and Lance Parrish. These players produced value through varying mixtures of power, RBI production, and longevity.

A different branch includes Mickey Cochrane, Jorge Posada, Gene Tenace, and Mickey Tettleton. That group is especially interesting because it reflects patience and on-base value. Posada, Tenace, and Tettleton all benefit from walk-heavy profiles. Cochrane brings an earlier version of high-OBP catcher offense.

The dendrogram helps explain why the catcher ranking is so interesting. Piazza wins, but he does not win because there is only one way to be an offensive catcher. He wins because his version of catcher offense combined peak, consistency, and middle-order force.

The Bench Problem

No catcher study can avoid Johnny Bench.

In this offense-only model, Bench finishes second. That is not a criticism. It is a sign of how strong Piazza’s bat was. Bench’s case as the greatest catcher ever remains powerful because this model excludes defense. It does not count game-calling, throwing, handling pitchers, or the defensive burden of the position.

If the question is:

Who was the greatest all-around catcher ever?

Bench may still be the answer.

But if the question is:

Who was the greatest offensive catcher relative to his peers?

The answer from this study is Piazza.

That distinction should be kept clear.

The Cal Raleigh Note

Cal Raleigh’s 2025 season deserves special mention.

The model ranks it as the best single offensive season by a catcher in the dataset. That does not make Raleigh the greatest offensive catcher. It does mean that his 2025 season was extraordinary within his catcher peer group.

This is exactly what the model is designed to capture. It is not simply asking who had the best traditional reputation. It is asking which seasons most disrupted the positional baseline.

By that standard, Raleigh’s 2025 season is historic.

What the Study Shows

The catcher study yields a strong yet nuanced result.

Piazza wins the career score. Piazza wins the seven-season peak score. Piazza wins the balanced score. Bench is second in all three and remains the all-around counterargument. Posada emerges as a surprisingly strong offense-only catcher. Cochrane, Berra, Fisk, Dickey, Hartnett, Tettleton, Simmons, and Carter all form a rich second tier.

The single-season leaderboard adds a modern surprise with Raleigh. It also reminds us that peak seasons and career greatness are different things.

In short:

Career offense: Piazza

Peak offense: Piazza

Best single season: Raleigh, 2025

All-around catcher counterargument: Bench

Most underrated offense-only result: Posada

Conclusion

Catcher is not a position built for easy offensive comparison. The physical toll is too great. The defensive demands are too large. The historical standards shift too much across eras.

That is why peer adjustment helps.

It asks each catcher to stand next to the catchers of his own time. Not against a century of changing run environments. Not against a modern memory of what catcher offense should be. Just against the positional baseline he actually faced.

By that standard, Mike Piazza stands at the top.

He was not merely a good-hitting catcher. He was repeatedly far above what catcher offense normally looked like. He combined power, slugging, run production, and enough on-base value to separate from his peers year after year.

Bench remains the larger all-around shadow. Raleigh owns the astonishing single-season spike. Posada deserves more attention than he usually receives.

But in an offense-only, peer-adjusted study, the answer is clear enough.

The greatest offensive catcher was Mike Piazza.

Right Field: Can You Believe It? (I Can)

Right field is not an ordinary offensive position. It has housed some of the most impressive bats in baseball history: Babe Ruth, Hank Aaron, Mel Ott, Reggie Jackson, Frank Robinson, Larry Walker, Vladimir Guerrero, Aaron Judge, Sammy Sosa, Bryce Harper, and many others.

That makes the question difficult. It is one thing to ask who was a great offensive right fielder. It is another to ask who separated most from the other right fielders of his own time.

That is the purpose of this study.

As in the second-base and third-base analyses, I am not trying to measure total player value. Defense is not included. Baserunning is not included except indirectly through runs scored. Postseason performance is not included. This is an offense-only model. The reasons for this decision will become clear in future posts as the study progresses.

The question is narrow: Who was the most dominant offensive right fielder relative to other right fielders in his own era?

Methodology

Working from the Lahman Database, I used Appearances.csv to identify true right-field seasons. That matters because the standard Fielding.csv file groups many players simply as outfielders. Appearances.csv lets us isolate right field specifically.

A player-season qualified if the player had:

At least 50 games in right field

At least 300 plate appearances
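A sketch of the qualification step: Appearances.csv records games by position in columns such as G_lf, G_cf, and G_rf, with one row per player, season, and team. Summing a traded player's rows within a season is an assumption about how the study handled them, and the rows and plate-appearance totals below are made up.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows in the Appearances.csv layout (subset of columns).
SAMPLE = """yearID,teamID,playerID,G_all,G_lf,G_cf,G_rf
2001,AAA,player01,150,0,2,148
2001,BBB,player02,70,40,0,30
2001,CCC,player02,60,35,0,25
2001,DDD,player03,120,100,15,5
"""


def right_field_games(csv_text):
    """Total G_rf per (playerID, yearID), summed across teams."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[(row["playerID"], int(row["yearID"]))] += int(row["G_rf"])
    return dict(totals)


def qualified(games, pa_by_season, min_games=50, min_pa=300):
    """Player-seasons meeting both the games and plate-appearance cutoffs."""
    return sorted(
        key for key, g in games.items()
        if g >= min_games and pa_by_season.get(key, 0) >= min_pa
    )


# Hypothetical plate-appearance totals for the same player-seasons.
pa = {("player01", 2001): 640, ("player02", 2001): 520,
      ("player03", 2001): 480}

games = right_field_games(SAMPLE)
print(qualified(games, pa))
```

Here player02 qualifies only because his 30 and 25 right-field games with two teams are added together, while player03, mostly a left fielder, is excluded despite having enough plate appearances.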

For each qualified right fielder-season, I calculated six offensive measures:

OBP

SLG

HR per PA

BB per PA

Runs per PA

RBI per PA

Each category was converted into a z-score within that season’s right-field peer group. The season score was the sum of those six z-scores. Partial seasons were weighted by playing time, with full credit beginning at 600 plate appearances.

The basic formula was:

Season Score = OBP z + SLG z + HR/PA z + BB/PA z + R/PA z + RBI/PA z

This approach compares right fielders only to other right fielders from the same season. The goal is not to compare 1920 directly to 2025. The goal is to measure how far each player stood above his own positional baseline.

Figure 1: Career Offensive Dominance

The career ranking is decisive.

Babe Ruth finishes first with a career peer-adjusted offensive score of 175.1. Hank Aaron follows at 128.6, and Mel Ott is third at 121.3. Reggie Jackson is fourth at 97.9, followed by Frank Robinson, Larry Walker, Aaron Judge, Dwight Evans, Jose Bautista, and Vladimir Guerrero.

This is the first major result: Ruth does not merely win; he dominates.

That gap matters because the method already compares him only with other right fielders of his time. Ruth is not getting credit simply because his raw numbers look large in a historical context. He is getting credit because, season after season, he was far above the offensive standard for right fielders in his own environment.

Aaron and Ott form the next historical tier. Both accumulated long careers of offensive value in right field. Reggie Jackson follows as the next major power figure. Judge is already visible, but his career total is naturally limited by the number of qualified right-field seasons currently in the data.

Figure 2: Best Seven-Season Peaks

The seven-season peak ranking makes Ruth’s dominance even clearer.

Ruth’s best seven-season score is 117.5. The next closest player is Aaron Judge at 72.4, followed by Hank Aaron at 70.6, Reggie Jackson at 67.2, Mel Ott at 67.0, and Jose Bautista at 67.0.

This is perhaps the most striking figure in the study. Ruth’s peak is not simply better. It is operating on a different scale.

Judge is the fascinating modern subplot. His seven-season peak already ranks second in the model. That does not make him the second-greatest offensive right fielder by career value, but it does show how extreme his best seasons have been. In peak terms, he is closer to the historical elite than his career total alone would suggest.

Still, Ruth is alone. He wins the career ranking and the peak ranking.

Figure 3: Career Value Versus Peak Dominance

The scatterplot shows the shape of the field. I have created thousands of these, and this one still surprised me because of the lone point sitting far from all the others.

Ruth sits in the upper-right corner, far from everyone else. Aaron and Ott are the best long-career challengers, but their peaks sit far below Ruth’s. Judge has a high peak but much less career volume. Reggie Jackson, Frank Robinson, Larry Walker, Jose Bautista, Dwight Evans, and Vladimir Guerrero occupy the next group.

This figure makes the structure of the argument visible. Some players have career value. Some players have peak value. Ruth has both.

Judge’s point is especially interesting because it separates upward. His peak is already historically large, but the career axis has not yet caught up. That creates a different kind of question from the one we had with Ruth, Aaron, and Ott. Judge is not yet a career challenger in this model. He is a peak challenger.

The model therefore gives us two stories at once:

Babe Ruth is the clear historical answer.

Aaron Judge is the most interesting player of today.

Figure 4: Best Individual Offensive Seasons

The individual-season leaderboard is a Ruth exhibit, a testament to his ability.

The top season is Ruth in 1920, with a score of 19.5. Ruth also appears in 1926, 1927, 1931, 1924, 1928, 1923, and 1932. That is remarkable. He does not merely own a great season or two. He owns the shape of the leaderboard.

Aaron Judge breaks through powerfully. His 2022 season ranks third at 17.8, and his 2025 season ranks seventh at 16.1. Those are enormous seasons in this framework. Jose Bautista’s 2011, Bryce Harper’s 2015, Juan Soto’s 2024, Gavvy Cravath’s 1915, Larry Walker’s 1997, and Sammy Sosa’s 2001 also appear.

This figure is useful because it shows that Ruth’s career advantage is not just longevity. It is repeated peak dominance. He has several of the best right-field seasons in the dataset.

Judge’s presence also matters. If the article has a modern hook, it is here. Judge is not being flattered by memory or recency. He is genuinely appearing among the greatest offensive seasons ever in right field, measured against his positional peers.

Figure 5: Ruth Versus the Best Non-Ruth Right Fielder

Figure 5 compares Ruth to the best non-Ruth right fielder in each season of his qualifying right-field career. Ruth primarily played left field in 1921 and was suspended for a good chunk of 1922.

The pattern is uneven, but the conclusion is clear. Ruth repeatedly stands above the best alternative at the position. In some seasons, the gap is enormous. In others, the field is closer. By 1934, the best non-Ruth right fielder edges him, which makes sense. Ruth was no longer at his peak.

The important point is not that Ruth won every season. He did not need to. The important point is that for a long stretch, Ruth regularly produced seasons that were far above even the best of his immediate right-field peers.

This figure also shows why a career sum is useful. Greatness is not just the highest dot on the chart. It is the repeated occupation of the upper range.

Ruth kept returning to that range.

Figure 6: Balanced Offensive Greatness

The balanced score combines career value with seven-season peak value. The reported totals equal the career score plus the peak-seven score, up to rounding.

Again, Ruth is first by a wide margin. His balanced score is 292.6. Hank Aaron follows at 199.2, Mel Ott at 188.3, Reggie Jackson at 165.2, and Aaron Judge at 144.7.

This may be the best single-number summary of the study. It rewards both accumulation and dominance. Ruth wins both.

Aaron and Ott remain the strongest traditional career challengers. Reggie Jackson’s combination of peak and career value grades well. Judge jumps ahead of several longer-career players because his peak is so strong. Frank Robinson and Larry Walker are nearly tied. Jose Bautista also benefits from a concentrated peak.

The balanced score confirms the central conclusion:

Right field has many great offensive players, but one player clearly separates from the field. That player is Ruth.

Figure 7: Offensive Component Profile

The component profile shows how the leading players accumulated their value.

Ruth’s profile is broad and overwhelming. He is excellent in OBP, slugging, home-run rate, walk rate, runs, and RBI. That is what makes him difficult to catch. He is not a one-dimensional power hitter in this model. He dominates across nearly every offensive component.

Aaron’s profile is different. He is extremely strong in slugging, home-run rate, runs, and RBI, but his walk rate is much lower than Ruth’s. Mel Ott is closer to Ruth in patience and power shape, though not at Ruth’s overall level. Reggie Jackson has a strong power signature. Larry Walker shows a more balanced profile, with strong OBP, slugging, and run production. Dwight Evans stands out as a walk-rate player more than a pure slugging outlier.

Judge’s profile is striking because all his values are already substantial despite only seven qualified seasons. He has not yet accumulated the career totals of Aaron or Ott, but his component shape is already elite.

This figure is important because it shows that the same final score can be built in different ways. Ruth’s greatness is not just power. It is power plus patience plus run production plus repeated separation.

Figure 8: Dendrogram of Top Offensive Right Fielders

The dendrogram clusters the top 15 right fielders by offensive shape rather than by final score.

Ruth clusters with Aaron, Ott, and Reggie Jackson, which makes sense. These are major power-production right fielders, though Ruth’s walk component and overall scale make him exceptional even within that group.

Another cluster includes Al Kaline, Aaron Judge, Frank Robinson, Larry Walker, Gary Sheffield, Jose Bautista, Darryl Strawberry, and Dwight Evans. That group is varied, but it reflects combinations of power, patience, and run production across different eras.

A third branch includes Harry Heilmann, Vladimir Guerrero, and Gavvy Cravath. That grouping is interesting because it brings together players with strong offensive production but different relationships to walks, slugging, and run creation.

The dendrogram reinforces the larger point: offensive right-field greatness has several shapes. Ruth’s shape, however, is both broad and extreme.

The Clemente Question

Roberto Clemente is an important cautionary tale. He was a baseball genius. Who had a stronger arm than him? No one. He was a great player and a great human being. He is also one of my all-time favorites.

He does not rank highly in this offense-only model. His career offensive score is only 0.7, with a peak-seven score of 27.8. That may look jarring, especially for a player who is unquestionably one of the great right fielders in baseball history.

But this result reflects the limits of the study, not a dismissal of Clemente.

Clemente’s greatness includes qualities this model does not measure well or at all: defense, throwing arm, baserunning, contact skill, postseason legacy, consistency, and historical significance. He was a magnificent all-around right fielder. He was not, under this particular z-score framework, among the most dominant offense-only right fielders relative to the power-heavy right-field peer group.

That is a distinction I chose to make in this study.

This is not a list of the greatest right fielders of all time. It is a list of the most dominant offensive right fielders.

What the Study Shows

The right-field study is less ambiguous than the second-base study.

At second base, Joe Morgan and Rogers Hornsby created a real tension between career dominance and pure hitting force. At third base, Mike Schmidt was the clear answer, but Eddie Mathews and Chipper Jones formed a strong second tier.

Right field is different.

Ruth wins career. Ruth wins peak. Ruth wins balanced score. Ruth dominates the individual-season leaderboard. Ruth separates visually in the career-versus-peak chart.

The second tier is interesting, but it is still the second tier. Aaron and Ott are the best challengers. Reggie Jackson, Frank Robinson, Larry Walker, Dwight Evans, Vladimir Guerrero, and others deepen the historical field. Judge is the modern peak story. Bautista, Harper, Soto, and Sosa show that extreme right-field seasons continue to appear.

But the main answer is not close.

Conclusion

Right field has often been a position of power. That makes dominance difficult. To stand far above other right fielders, a player must not merely be great. He must be great in a neighborhood already crowded with sluggers.

That is what makes Ruth’s result so striking.

He was not just better than ordinary players. He was better than the other great offensive right fielders around him. He was not just a one-off high peak. He repeated high peaks. He was not just a career accumulator. He was consistently a top performer.

Measured against his own positional peers, Babe Ruth stands as the greatest offensive right fielder in the Lahman Database.

The numbers do not merely confirm the legend. They explain it: he was the best offensive player (at least by this set of metrics) who ever set foot in right field.

The Shape of Second Base Greatness: A Tale of Two Exceptional Men

Second base has always been an unusual offensive position. It is not first base, where power is expected. It is not shortstop, where defense often dominates. It sits somewhere in the middle, historically shaped by contact hitters, table-setters, high-average stars, defensive specialists, and, every so often, an offensive outlier who bends the position out of shape.

That makes second base a useful test case for a peer-adjusted study.

The question is not simply: who had the biggest numbers?

That would favor certain eras too heavily. It would also blur the positional standard. Second basemen in 1925, 1976, 1999, and 2024 did not play in the same offensive environment. The better question is:

Who was the most dominant offensive second baseman relative to the other second basemen of his own time?

That is the same question I asked about third basemen, and the method is the same. The logic is simple. Compare each player only to his same-year positional peers. Convert that separation into z-scores. Then add up the value across seasons.

Greatness, in this framework, is not just production. It is distance, measured in standard deviations, from the average player at the position.

Methodology

Using the Lahman Database, I identified second-base seasons through the fielding table, then evaluated only offensive production from the batting table. Fielding was used only to determine who qualified as a second baseman. It was not used in the scoring.

A player-season qualified if the player had:

At least 50 games at second base

At least 300 plate appearances

For each qualified second baseman-season, I calculated six offensive measures:

OBP

SLG

HR per PA

BB per PA

Runs per PA

RBI per PA

Each category was converted into a z-score within that year’s second-base peer group. The season score was the sum of those six z-scores. Partial seasons were weighted by playing time, with full credit beginning at 600 plate appearances.

So the basic structure was:

Season Score =

OBP z + SLG z + HR/PA z + BB/PA z + R/PA z + RBI/PA z

The result is a measure of offensive dominance relative to position and era.

Figure 1: Career Offensive Dominance

The career ranking produces a fascinating result. I must admit, I didn’t expect this. I thought I would see Rogers Hornsby leading the way. How about you?

Joe Morgan finishes first with a career peer-adjusted offensive score of 169.3. Rogers Hornsby is second at 136.7. Eddie Collins, Charlie Gehringer, Jeff Kent, Lou Whitaker, Bobby Doerr, Bobby Grich, Joe Gordon, and Nap Lajoie complete the top ten.

This is not the result many people might initially expect. Hornsby is often thought of as the greatest offensive second baseman ever, and in raw hitting terms, that case is obvious. He was a world-class hitter. But this method is not asking who produced the most extravagant slash lines. It is asking who accumulated the most separation from the second-base baseline across his qualified second-base seasons.

Morgan’s advantage is career breadth. He qualified for 19 seasons in this study. Hornsby qualified for 11. That difference in volume drives the career gap.
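A quick piece of arithmetic on the reported figures makes the breadth point concrete. Dividing each career score by the number of qualified seasons gives a rough per-season average. This is only a summary of the numbers quoted above, not an additional statistic from the study.

```python
# (career score, qualified seasons), as reported in the text.
career = {"Morgan": (169.3, 19), "Hornsby": (136.7, 11)}

per_season = {name: score / seasons
              for name, (score, seasons) in career.items()}

for name, avg in per_season.items():
    print(f"{name}: {avg:.1f} per qualified season")
# Morgan: 8.9 per qualified season
# Hornsby: 12.4 per qualified season
```

Hornsby's average qualified season scores higher, but Morgan's eight additional qualified seasons more than make up the difference in the career total.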

Morgan was not merely good for a long time. He was repeatedly far above the offensive norm for second basemen, especially because his value came from a broad package: on-base ability, walks, power for the position, runs, and enough run production to dominate his peers.

Hornsby remains extraordinary. But Morgan’s long arc wins the career version of the question.

Figure 2: Best Seven-Season Peaks

The seven-season peak ranking narrows the debate.

Morgan’s best seven-season score is 100.5. Hornsby’s is 99.0. That is remarkably close. Charlie Gehringer follows at 70.9, then Joe Gordon, Jeff Kent, Ryne Sandberg, Eddie Collins, Bobby Doerr, Bobby Grich, and Chase Utley.

This figure changes the conversation. Morgan does not win only because of longevity. His peak also stands with Hornsby’s.

That is the key finding of the second-base study.

Hornsby’s offensive peak was legendary, but Morgan’s best run was also historic when measured against other second basemen. His 1970s peak combined elite walk rates, surprising power, run scoring, and positional dominance. He was not a conventional batting-average star. He was something more modern: a player whose value was built from patience, efficiency, speed, and power relative to position.

The gap between Morgan and Hornsby is tiny here. The gap between them and the rest of the field is not. They have clearly separated themselves.

Figure 3: Career Value Versus Peak Dominance

The career-versus-peak scatterplot makes the structure of the argument clear.

Morgan and Hornsby occupy the top-right corner. No one else is near them.

Below them is a second tier: Charlie Gehringer, Eddie Collins, Jeff Kent, Joe Gordon, Ryne Sandberg, Bobby Doerr, Bobby Grich, Lou Whitaker, Nap Lajoie, Chase Utley, Robinson Cano, and Roberto Alomar. These players differ in style, but they share the same general position in the graph. They were great, sometimes historically great, but not quite Morgan-Hornsby great by this scoring system.

The scatterplot also clarifies the Morgan-Hornsby distinction.

Hornsby’s peak is almost identical to Morgan’s, but his career score is lower because he had fewer qualified second-base seasons in the dataset. Morgan’s profile is the stronger combination of peak and persistence.

That does not make Hornsby smaller. It makes the question more precise.

If the question is “Who was the most overwhelming second-base hitter at his absolute best?” Hornsby has a serious case. If the question is “Who accumulated the most peer-adjusted offensive dominance while playing second base?” the answer is Morgan.

Figure 4: Best Individual Offensive Seasons

The single-season ranking is one of the most interesting figures in the study.

The top season belongs to Joe Morgan in 1976, with a score of 20.7. That is an enormous number. It is higher than the best individual third-base season in the earlier study. Hornsby follows with several monster seasons: 1925, 1922, 1924, 1921, 1928, and 1929 all appear in the top twenty.

Morgan also appears repeatedly: 1972, 1975, 1976, 1977, and 1974.

This is where the Morgan case becomes especially strong. He does not merely win by compiling. His best seasons are among the very best second-base offensive seasons in the database.

The list also contains some useful reminders. Joe Gordon’s 1942 season is excellent. Jeff Kent’s 2000 season stands out. Jackie Robinson’s 1952 season appears. Ketel Marte’s 2024 season makes the list, which is a reminder that this method can incorporate modern seasons naturally as long as the Lahman data includes them.

Still, the top of the figure belongs to Morgan and Hornsby.

They are not just first and second in the career rankings. They also dominate the historical single-season landscape.

Figure 5: Rogers Hornsby Versus the Field

This figure isolates Hornsby’s qualified second-base seasons and compares him to the best non-Hornsby second baseman in each year. His 1930 season is missing because a broken ankle kept him from qualifying.

The visual is striking. Hornsby repeatedly towers over the field. His seasons in the early and mid-1920s are so far above the second-base norm that they look almost detached from ordinary positional comparison.

That matters because Hornsby’s case rests on peak intensity. He was not merely a great hitter who happened to play second base. He was an offensive anomaly at a position where that kind of production was not expected.

The same figure also hints at why Morgan can still win the career ranking. Hornsby’s period of extreme second-base separation was shorter. Morgan’s total advantage is spread across more seasons.

Perhaps that is the central tension of the study:

Hornsby was the more explosive offensive force.

Morgan built the stronger peer-adjusted second-base career.

Both statements can be true.

Figure 6: Balanced Offensive Greatness Score

The balanced score combines career value and peak value over seven seasons.

Morgan again finishes first, with a balanced score of 269.7. Hornsby is second at 235.8. Eddie Collins and Charlie Gehringer are nearly tied for third and fourth. Jeff Kent, Lou Whitaker, Bobby Doerr, Joe Gordon, Bobby Grich, and Ryne Sandberg follow.

This figure may be the cleanest summary. It does not ignore peak. It does not ignore career. It gives credit for both.

And by that standard, Morgan is the answer to this post’s question.

This also reframes the usual conversation around second basemen. Hornsby remains the Hall of Famer that he is. Eddie Collins remains one of the great long-career offensive second basemen. Gehringer looks excellent. Jeff Kent’s bat grades very well. Lou Whitaker, Bobby Grich, and Bobby Doerr all emerge as major offensive figures relative to their positions.

But Morgan is the most complete case of offensive dominance.

Figure 7: Offensive Component Profile

The component profile explains why Morgan and Hornsby are both elite, but in different ways.

Morgan’s profile is built around OBP, walks, runs, and balanced power. His walk component is massive. That fits the historical picture. Morgan’s offensive value was not simply about batting average or home runs. It came from controlling the strike zone, reaching base, scoring runs, and adding enough power to separate from other second basemen.

Hornsby’s profile is different. He dominates through slugging, home-run rate, OBP, and RBI production. His profile looks more like a traditional slugger placed at a middle-infield position. In that sense, he resembles the second-base version of Mike Schmidt or Eddie Mathews from the third-base study, though Hornsby’s batting average and slugging environment give him his own signature.

Eddie Collins shows another type of greatness. His OBP and run-scoring profile are excellent, but he does not dominate the home-run component. Jeff Kent is almost the opposite. His power and RBI components are enormous, while his OBP component is relatively modest compared with Morgan, Collins, or Hornsby.

This figure is important because it prevents the study from becoming a single-number exercise. Morgan, Hornsby, Collins, Gehringer, Kent, Whitaker, Doerr, and Grich are not the same kind of hitter. They produced offensive value through different paths.

Figure 8: Dendrogram of Offensive Similarity

The dendrogram clusters the top 15 career scorers by offensive shape rather than by total value.

Several patterns stand out.

Joe Morgan and Eddie Collins cluster together, which makes sense. Both were high-OBP, run-creating, walk-driven second basemen whose value was not primarily based on home-run power.

Rogers Hornsby clusters closer to Nap Lajoie and Jeff Kent. That is interesting because those players represent different eras, but the shape of their offensive value has similarities: strong slugging and run production relative to the position.

Charlie Gehringer, Lou Whitaker, Roberto Alomar, and Chase Utley form a balanced group. These are players whose offensive value was distributed across categories rather than concentrated in one extreme area.

The dendrogram reinforces the larger point: second-base offensive greatness has multiple forms. Morgan’s greatness was not Hornsby’s greatness. Hornsby’s greatness was not Collins’s greatness. Kent’s greatness was not Whitaker’s greatness.

The rankings tell us who separated the most. The clusters tell us how they did it.
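For readers who want to see how a dendrogram groups hitters by shape, here is a minimal sketch of average-linkage agglomerative clustering. It is not the procedure behind Figure 8, whose linkage and distance settings are not stated. The four profiles are invented for illustration, and scaling each profile to unit length (so that shape matters more than total value) is my own assumption.

```python
from itertools import combinations
from math import dist, hypot

# Hypothetical six-component profiles (OBP, SLG, HR, BB, R, RBI z-score totals).
# These numbers are made up for illustration; they are not from the study.
profiles = {
    "Patient A": (3.0, 1.0, 0.5, 4.0, 3.0, 0.5),
    "Patient B": (2.5, 0.8, 0.3, 3.0, 2.8, 0.6),
    "Slugger C": (1.0, 4.0, 4.0, 0.5, 1.5, 3.5),
    "Slugger D": (0.8, 3.0, 3.5, 0.4, 1.0, 3.0),
}

def unit(vec):
    """Scale a profile to length 1 so clustering reflects shape, not size."""
    norm = hypot(*vec)
    return tuple(x / norm for x in vec)

def agglomerate(raw_profiles):
    """Average-linkage clustering. Returns the merges in the order they occur."""
    points = {name: unit(vec) for name, vec in raw_profiles.items()}
    clusters = [(name,) for name in points]
    merges = []

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist(points[p], points[q]) for p in a for q in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((a, b, linkage(a, b)))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return merges

merges = agglomerate(profiles)
for left, right, height in merges:
    print(left, "+", right, "at height", round(height, 3))
```

The merge heights are what a dendrogram draws on its vertical axis: pairs that join low on the tree have similar offensive shapes, regardless of how large their totals are.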

What the Study Shows

The second-base study gives us a more complicated answer than the third-base study.

At third base, Mike Schmidt was the clear answer. He won the career, peak, and balanced rankings.

At second base, the race is much closer, yet Morgan still sweeps the summary rankings. He wins the career score, narrowly wins the seven-year peak score, and wins the balanced score. He also owns the best individual season in the study, 1976.

Hornsby, however, remains the great offensive peak figure. His top seasons are astonishing. He appears over and over again on the single-season leaderboard. When he was at his best, he was not simply the best offensive second baseman in baseball. He was operating in another offensive category.

The final answer depends on the exact wording of the question.

If the question is: Who had the greatest offensive second-base career relative to his peers? The answer is Joe Morgan.

If the question is: Who was the most devastating pure hitter ever to spend his prime at second base? The answer may still be Rogers Hornsby.

But because this study combines peak, career, and positional peer adjustment, Morgan comes out on top.

Here are four figures comparing Morgan directly to Hornsby.

Figures 9 through 12 sharpen the Morgan-Hornsby comparison by moving beyond the overall ranking.

Figure 9 shows that Morgan and Hornsby created value in very different ways. Morgan’s edge comes from OBP, walks, and run scoring, while Hornsby’s advantage comes from slugging, home-run rate, and RBI production. In simple terms, Morgan’s profile is patience and pressure; Hornsby’s is impact and force.

Figure 10 explains why Morgan wins the study. Hornsby accumulates value faster in his early second-base seasons, but his qualifying second-base career is shorter. Morgan keeps adding high-value seasons, eventually passing Hornsby and building a clear career advantage.

Figure 11 keeps the debate over the peak alive. Morgan owns the single best season in the comparison, 1976, but Hornsby’s top ten seasons are deeper and more consistently clustered at a high level. That supports the idea that Hornsby remains the great pure-hitting counterargument.

Figure 12 summarizes the tradeoff neatly. Morgan has more broadly valuable seasons, especially at the lower elite threshold, while Hornsby has more seasons above several of the higher dominance cutoffs. So the conclusion holds: Morgan wins the peer-adjusted career argument, while Hornsby remains the explosive peak-rate alternative.

Conclusion

Second base is a position of changing expectations. Sometimes it rewards contact. Sometimes speed. Sometimes defense. Sometimes on-base skill. Rarely does it produce a hitter who bends the position around himself.

Joe Morgan did that. Rogers Hornsby did too.

That is why this debate is so interesting. It is not a debate between greatness and drunken bar patron opinion. It is a debate between two kinds of dominance. Hornsby was an offensive eruption. Morgan was a system of pressure: walks, runs, power, patience, speed, and constant separation from the positional baseline.

Measured against his peers, year after year, Morgan built the strongest offensive second-base profile in the Lahman Database.

Hornsby may still feel larger than life. But Morgan’s advantage is mathematical and historical. He was not merely excellent. He was repeatedly far from normal.

And in this framework, that is what greatness means. Three cheers and a tiger for both men; they were simply outstanding.

The Greatest Offensive Third Baseman Ever

Debates (arguments?) about the greatest third baseman (or any other position) often begin with memory and reputation: Mike Schmidt’s power, George Brett’s bat control, Chipper Jones’s balance, Eddie Mathews’s power, Brooks Robinson’s glove, Adrián Beltré’s longevity, Scott Rolen’s completeness, and the list goes on and on. The cases for these players all point in different directions. Surprisingly, Wade Boggs does not appear in this analysis. I admit, I am a bit mystified by this.

But this study asks a narrower question. Not who was the greatest all-around third baseman ever. Not who had the best glove. Not who accumulated the most WAR. The question here is more specific: Who was the most dominant offensive third baseman relative to the other third basemen of his own time?

That phrasing is important. A raw comparison across eras can flatten baseball history. The offensive environment of 1896 was not the same as that of 1968, 1999, or 2013. A third baseman in Eddie Mathews’s era faced a different positional baseline than one in Chipper Jones’s era. The cleanest way to compare them is not to compare them directly at first. It is to compare each player to his own peer group, then analyze the differences.

In other words, greatness here is measured as distance from positional normalcy, expressed in standard deviations.

Methodology

Using the Lahman Database (yes, the term should be capitalized), I identified third-base seasons through the fielding table, then evaluated only offensive production from the batting table.

A player-season qualified if the player appeared in at least 50 games at third base and had at least 300 plate appearances. Fielding data was used only to establish positional eligibility. It was not used in the score.

For each qualified third baseman-season, I calculated six offensive measures:

OBP

SLG

HR per PA

BB per PA

Runs per PA

RBI per PA

Each category was converted into a z-score within that season’s third-base peer group. The seasonal offensive score was then calculated as the sum of those six z-scores. Partial seasons were weighted by playing time, with full credit beginning at 600 plate appearances.

The scoring framework is straightforward:

Season Score = OBP z + SLG z + HR/PA z + BB/PA z + R/PA z + RBI/PA z

This approach does not ask whether a player was better than a third baseman fifty years later. It asks whether he separated from the third basemen standing around him at the time.
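The scoring formula above can be sketched in a few lines of Python. This is an illustration, not the code used for the study. The three players and their stat lines are invented, the use of the population standard deviation is my assumption, and the playing-time weight (PA divided by 600, capped at 1.0) is one plausible reading of "full credit beginning at 600 plate appearances."

```python
from statistics import mean, pstdev

MEASURES = ("obp", "slg", "hr_pa", "bb_pa", "r_pa", "rbi_pa")

def season_scores(peers, full_credit_pa=600):
    """Return {player: playing-time-weighted sum of six z-scores} for one season.

    peers maps a player name to a dict holding the six rate stats plus "pa".
    """
    totals = {name: 0.0 for name in peers}
    for measure in MEASURES:
        values = [row[measure] for row in peers.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, row in peers.items():
            # If every peer is identical on a measure, it contributes nothing.
            z = 0.0 if sigma == 0 else (row[measure] - mu) / sigma
            totals[name] += z
    # Assumed weighting: PA / 600, capped at full credit.
    return {
        name: totals[name] * min(1.0, peers[name]["pa"] / full_credit_pa)
        for name in peers
    }

# Hypothetical peer group for a single season (invented numbers).
peers = {
    "Player A": dict(obp=.400, slg=.550, hr_pa=.060, bb_pa=.150, r_pa=.170, rbi_pa=.160, pa=650),
    "Player B": dict(obp=.330, slg=.420, hr_pa=.030, bb_pa=.090, r_pa=.120, rbi_pa=.120, pa=620),
    "Player C": dict(obp=.300, slg=.380, hr_pa=.020, bb_pa=.060, r_pa=.100, rbi_pa=.100, pa=300),
}

scores = season_scores(peers)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:+.2f}")
```

In a real run, `peers` would hold every qualified third baseman from one season, and the function would be called once per season so that each player is only ever compared with his contemporaries.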

That is the study’s central logic. Make sense? I think so.

Career Offensive Dominance

Figure 1 presents the career ranking, and the result is clear.

Mike Schmidt finishes first with a peer-adjusted career offensive score of 154.8. Eddie Mathews follows at 127.5, and Chipper Jones is third at 108.0. After those three, there is a noticeable step down to Ron Santo, Home Run Baker (a man with 96 career home runs), Alex Rodriguez, Harlond Clift, George Brett, Harmon Killebrew, and Scott Rolen.

The key result is that Schmidt does not merely lead the field; he creates meaningful separation from it.

That is significant because the method is already controlling for era and positional context. Schmidt is not being rewarded simply because he hit a lot of home runs in absolute terms. He is being rewarded because he repeatedly produced offensive seasons that were far above what third basemen were normally producing at the same time.

Mathews and Jones also emerge as historically exceptional. In fact, the top three form a clean offensive hierarchy:

Mike Schmidt
Eddie Mathews
Chipper Jones

Everyone else is fighting for the next tier.

Best Seven-Season Peaks

A career score can sometimes reward longevity more than dominance, so Figure 2 looks at each player’s best seven seasons.

Again, Schmidt is first.

His best seven-season score is 83.7, ahead of Mathews at 77.1 and Jones at 69.9. Alex Rodriguez moves up here, which makes sense. His time as a third baseman was shorter, but his best offensive seasons at the position were enormous. Ron Santo, Harlond Clift, Home Run Baker, Harmon Killebrew, George Brett, and Josh Donaldson also show well.

This strengthens Schmidt’s case because he leads both the career and peak rankings, a combination few candidates can match.

Interestingly, the seven-year view also helps clarify Eddie Mathews’s place. He is not merely a longevity candidate. His peak was close enough to Schmidt’s to make the comparison serious. If this study has a surprise, it may be how strong Mathews looks when third basemen are compared to their own positional eras.
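As a rough sketch of the career and peak calculations, assume the career score is the sum of all qualified season scores and the peak score is the sum of the seven highest, not necessarily consecutive. Both readings are my assumptions, and the season scores below are invented.

```python
from heapq import nlargest

def career_score(season_scores):
    """Sum of every qualified season score (assumed definition)."""
    return sum(season_scores)

def peak_score(season_scores, n=7):
    """Sum of the n highest season scores; uses all seasons if there are fewer than n."""
    return sum(nlargest(n, season_scores))

# Hypothetical ten-season career, in chronological order.
example = [5, 12, 9, 3, 14, 8, 11, 10, 7, -1]

print("career:", career_score(example))
print("best seven:", peak_score(example))
```

Under these definitions a short, brilliant career can score well on the peak measure while trailing on the career measure, which is the pattern the A-Rod discussion describes.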

Career Value Versus Peak Dominance

Figure 3, a scatterplot, shows the relationship between career value and peak value. Most players cluster along a rising diagonal, as we would expect. Better players tend to have both stronger careers and stronger peaks.

But the upper-right corner tells the story.

Schmidt stands alone. Mathews is close, but still clearly behind. Chipper Jones occupies the next major position. Alex Rodriguez has a high peak but a shorter third-base career. Ron Santo, Harlond Clift, Home Run Baker, George Brett, and Scott Rolen occupy the next cluster.

This figure reinforces the argument by showing that Schmidt ranks at the top in both career value and peak value.

The figure also shows why the second-place debate is more interesting than the first-place debate. Mathews, Chipper, A-Rod, Santo, Brett, Baker, and Rolen are all great, but they represent different forms of greatness. Mathews combines power and peak. Chipper brings on-base skill and consistency. A-Rod brings a spectacular but shorter third-base run. Santo has a strong all-around offensive profile. Brett has batting excellence, but less power separation than Schmidt or Mathews.

Schmidt is the equilibrium point where peak, career, power, and patience all meet.

Best Individual Offensive Seasons

Figure 4, the single-season ranking, complicates the story.

The best individual season in the study is Miguel Cabrera’s 2013 season, with a score of 17.7. Harmon Killebrew’s 1969 season follows at 15.2, then Alex Rodriguez’s 2007 season at 14.2. Jim Thome’s 1996 season, A-Rod’s 2005 season, Eddie Mathews’s 1955 season, George Brett’s 1985 season, Dick Allen’s 1966 season, and Chipper Jones’s 1999 season all appear near the top.

Schmidt does not own the best single season. That is important. His case is not built on a single historical spike. It is built on repetition.

He appears several times in the top twenty: 1981, 1984, 1979, 1974, and 1976. That is the signature of the Schmidt profile. Other players may have reached higher in a single year. Few reached elite separation so often.

Cabrera’s 2013 season deserves special mention. Within this framework, it was an extraordinary offensive third-base season. But Cabrera does not challenge Schmidt in the career ranking because the study is positional. His third-base period was brief compared with players whose careers were more deeply rooted at the position.

That distinction is essential. This is not simply a hitter ranking. It is a ranking of offensive dominance for third basemen.

Schmidt Versus the Best Non-Schmidt Third Baseman Each Year

Figure 5 isolates Schmidt’s run and compares him to the best non-Schmidt third baseman in each season of his qualifying career.

The pattern is revealing. Schmidt was nearly always the best offensive third baseman in a given year; he was repeatedly at or near the top. More importantly, his high-end seasons were not isolated. From the mid-1970s through the mid-1980s, Schmidt regularly produced scores that would be career-defining seasons for many other players.

This is where his dominance looks sustained rather than episodic. The only real blemish is the 1985 season, a year when George Brett was outstanding.

There is also something visually striking about the consistency. Schmidt’s line spends most of its meaningful period in elite territory. That is not normal. Most great players have a few high points, some solid years, and then decline. Schmidt’s offensive identity was unusually stable: power, walks, run production, and enough on-base value to keep the profile broad.

The result is not just a great career. It is a long occupation at the top of the positional landscape.
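The year-by-year comparison behind a figure like this can be sketched as follows. The function and the scores are hypothetical; "Target" stands in for Schmidt and the rivals for everyone else in the peer group.

```python
def versus_field(scores, target):
    """Compare one player with the best other player in each of his seasons.

    scores maps (year, player) to a season score.
    Returns {year: (target_score, best_other_player, best_other_score)}.
    """
    result = {}
    target_years = sorted({year for (year, player) in scores if player == target})
    for year in target_years:
        others = {
            player: score
            for (yr, player), score in scores.items()
            if yr == year and player != target
        }
        if not others:
            continue  # no peers to compare against that year
        best = max(others, key=others.get)
        result[year] = (scores[(year, target)], best, others[best])
    return result

# Invented season scores for illustration only.
scores = {
    (1980, "Target"): 14.0, (1980, "Rival A"): 9.0, (1980, "Rival B"): 6.5,
    (1981, "Target"): 11.0, (1981, "Rival A"): 12.5,
    (1982, "Rival B"): 8.0,
}

comparison = versus_field(scores, "Target")
for year, (mine, rival, theirs) in comparison.items():
    print(year, f"target {mine:+.1f} vs {rival} {theirs:+.1f}")
```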

Balanced Offensive Greatness Score

The balanced score combines career value and peak value, which may be the best single-number summary of the study.

Once again, the top three remain the same:

Mike Schmidt
Eddie Mathews
Chipper Jones

Schmidt scores 238.5, Mathews 204.6, and Chipper 177.9. Alex Rodriguez rises to fourth because of his peak, while Ron Santo, Home Run Baker, Harlond Clift, George Brett, Harmon Killebrew, and Scott Rolen fill out the next tier.
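The post does not spell out how career and peak are combined, but the published numbers for the top three are consistent with a simple sum of the career score and the seven-season peak score. The check below uses only figures quoted in this post; treating the balanced score as an unweighted sum is an inference, not a stated rule.

```python
# (career score, best-seven peak score, balanced score) as quoted in this post.
published = {
    "Mike Schmidt": (154.8, 83.7, 238.5),
    "Eddie Mathews": (127.5, 77.1, 204.6),
    "Chipper Jones": (108.0, 69.9, 177.9),
}

def balanced_score(career, peak):
    """Assumed definition: career score plus seven-season peak score."""
    return career + peak

for name, (career, peak, balanced) in published.items():
    implied = balanced_score(career, peak)
    matches = abs(implied - balanced) < 0.05
    print(f"{name}: {career} + {peak} = {implied:.1f} (published {balanced}, match={matches})")
```

One consequence of this construction is that a player’s best seven seasons are counted twice, once in the career total and once in the peak total, which is how the score rewards peak without ignoring longevity.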

The stability of the top three is important. When a result survives different ways of looking at the same problem, it becomes more convincing. Career score says Schmidt. Peak score says Schmidt. Balanced score says Schmidt. The scatterplot says Schmidt.

Across these measures, the result remains consistent.

Offensive Component Profile

The component profile shows how the top players accumulated value.

Schmidt’s profile is broad and powerful. He leads through slugging, home-run rate, walk rate, runs, and RBI. His OBP component is strong as well, but his separation comes from the combination of power and patience. That is the essence of his offensive greatness.

Mathews looks similar in shape, though not quite as overwhelming. He brings strong slugging, home-run rate, walk rate, and run production. Chipper Jones is different. His OBP and walk components are especially strong, and his profile is less power-heavy than Schmidt’s or Mathews’s. That makes Chipper’s third-place finish more interesting. He gets there not by matching Schmidt’s power profile, but by building a different kind of offensive advantage.

Home Run Baker is also fascinating. His low walk component reflects the style and statistical environment of his era, but his power and RBI components are substantial relative to his peers. George Brett is another contrast case. He does not dominate the home-run or walk-rate categories, but his OBP, slugging, runs, and RBI components keep him in the upper tier.

This figure may be the most useful for interpretation because it shows that “offensive greatness” is not a single thing. Schmidt was the best overall offensive third baseman, but different players reached greatness through different routes.

Key Findings

The main result is straightforward: Mike Schmidt was the greatest offensive third baseman ever by peer-adjusted dominance.

But the more interesting finding is the shape of the field behind him.

Eddie Mathews has a strong claim to second place. Chipper Jones looks like the best modern challenger. Alex Rodriguez’s third-base peak was tremendous, but his positional career was shorter. Ron Santo may be somewhat underrated by traditional public memory. Home Run Baker and Harlond Clift both perform extremely well when measured against their own eras. George Brett remains an offensive giant, though his profile is less power-dominant than Schmidt’s. Scott Rolen, often discussed for his all-around value, still lands in the offensive top ten.

The method also gives us a useful distinction between “best season” and “greatest offensive third baseman.” Miguel Cabrera’s 2013 season may be the highest single-season performance in the dataset, but Schmidt’s case rests on repeated excellence. One peak can create a season. Repetition creates a historical identity.

Conclusion

A position is more than a place on the field. It is a moving baseline. The average third baseman in 1913 was not the average third baseman in 1955, 1981, 1999, or 2013. That is why peer adjustment matters. It lets us ask a better question.

The central question is not simply who produced the biggest raw totals, but rather who separated himself most clearly from the third basemen of his own era.

By that standard, Mike Schmidt stands at the top.

He had the career. He had the peak. He had the power. He had the patience. He had the year-after-year separation that turns excellence into dominance.

The debate over the greatest all-around third baseman may require defense, longevity, postseason value, and other forms of context. But the offensive debate is cleaner.

Simply stated, the greatest offensive third baseman ever was Mike Schmidt.

Postscript

I decided to add another plot. It is worth a look.

Cluster 1 shows Schmidt and Mathews together. This is the pure power-dominance cluster. Their profiles are close because both separate through slugging, home-run rate, walks, and run production.

Cluster 2 shows Chipper Jones standing alone. That makes sense. His profile is more OBP-and-walk driven, with less overwhelming HR-rate separation than Schmidt or Mathews.

Cluster 3 is very informative. It is a category of broad offensive excellence. Ron Santo, Alex Rodriguez, Harlond Clift, George Brett, Scott Rolen, Denny Lyons, and Bill Joyce cluster together. This is a more balanced group, with strong overall offensive value but not the same Schmidt-and-Mathews power signature.

Cluster 4 offers a Power/RBI category in an era-distinct group. Home Run Baker, Harmon Killebrew, Ron Cey, Bob Elliott, and Pinky Higgins group together. This cluster seems shaped by power and run-production value, with less consistent OBP/walk dominance.

The most interesting result is probably that Schmidt and Mathews are paired together, while Chipper Jones separates into his own branch. That supports the idea that the top three are not just different in degree; they differ in the shape of their offensive value.

I hope you found this post interesting. I think I am on to second base next.

Payroll Matters in MLB

Payroll Matters in MLB (semantic ambiguity intentional)

There is a simple baseball question that seems to have a straightforward answer. Do teams with big payrolls win more often than the others?

The intuitive answer is yes, right? The richer teams should and do win more often. They can sign stars, absorb bad contracts, buy depth, survive injuries, and patch roster holes at the trade deadline. As the Dodgers know, baseball has no hard salary cap, so money should matter.

And it does. But after looking at MLB Opening Day payrolls and regular-season wins from 2000 through 2025, the more interesting answer is this:

Payroll matters, but it explains far less than we might expect.

Across the full period, the average annual R² is about 0.154. In plain English, Opening Day payroll explained roughly 15.4% of the variation in team wins in a typical season.

Surprised? Most of the people I have talked to think the number would be much, much higher.

Figure 1: Payroll Predicting Wins in MLB, R² by Season, 2000–2025

The figure shows the year-by-year R² values from simple linear regressions of wins on Opening Day payroll. Each season is treated separately. The result is a measure of how much of that year’s win variation can be explained by payroll alone.
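For a one-variable regression with an intercept, R² equals the squared correlation between payroll and wins, so the season-by-season calculation can be sketched without any statistics library. The payroll and win figures below are invented to show one strong season and one weak season; they are not the study’s data.

```python
from collections import defaultdict

def r_squared(x, y):
    """R² of a simple linear regression of y on x (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("R² is undefined when either variable is constant")
    return sxy * sxy / (sxx * syy)

def r2_by_season(rows):
    """rows holds (season, payroll_millions, wins); each season is fit separately."""
    grouped = defaultdict(lambda: ([], []))
    for season, payroll, wins in rows:
        grouped[season][0].append(payroll)
        grouped[season][1].append(wins)
    return {season: r_squared(p, w) for season, (p, w) in sorted(grouped.items())}

# Hypothetical four-team league over two seasons.
rows = [
    (2022, 250, 105), (2022, 180, 92), (2022, 120, 78), (2022, 60, 65),
    (2023, 250, 76), (2023, 180, 90), (2023, 120, 84), (2023, 60, 80),
]

by_season = r2_by_season(rows)
average_r2 = sum(by_season.values()) / len(by_season)
for season, value in by_season.items():
    print(season, round(value, 3))
print("average:", round(average_r2, 3))
```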

The first thing that stands out is the instability. This is not a smooth upward line. It is not a story where money gradually takes over baseball. Instead, it is a jagged series of partial explanations.

Some years, payroll matters a lot. Some years, it barely matters at all.

Figure 2: Payroll Predicting Wins in MLB, 2022

The high point in this series is 2022, with an R² of 0.365. That means payroll explained about 36.5% of the variation in wins that season. That is a substantial relationship by baseball standards.

And 2022 makes sense. The high-payroll teams mostly played the way high-payroll teams are expected to play. The Dodgers won 111 games. The Astros won 106. The Mets and Braves each won 101. The Yankees won 99. The Padres and Phillies also reached the postseason. At the bottom, several low-payroll clubs struggled badly: the Nationals, Athletics, Pirates, Reds, and Royals. The league sorted itself in a way that made payroll look highly predictive.

Figure 3: Payroll Predicting Wins in MLB, 2023

Then came 2023. Take a good look at the figure, and you will see it looks different from the 2022 data. The R² collapsed to 0.027.

That is close to zero, indicating virtually no linear relationship between payroll and wins that season.

This is the most fascinating turn in the whole study. From one year to the next, payroll went from explaining more than a third of win variation to explaining less than 3%.

Why? Because 2023 broke the payroll model. The Mets spent heavily and won only 75 games. The Yankees won 82. The Padres won 82. The Angels won 73. The Cardinals won 71. Meanwhile, the Orioles won 101 games with a much smaller payroll. The Rays won 99. The Diamondbacks and Marlins both reached 84 wins. The Reds won 82.

In 2022, money and winning lined up. In 2023, they didn’t. That contrast may be the central finding.

Payroll pushes teams toward a certain range of outcomes, but it does not determine where they land. A high payroll gives a team more ways to solve problems. It does not guarantee that the solutions will work.

The other important pattern is the weak stretch from 2012 through 2018. During that period, payroll often had surprisingly little explanatory power. The R² values were mostly low:

2012: .038
2013: .107
2014: .088
2015: .045
2016: .197
2017: .064
2018: .081

That is a remarkable run. It suggests that during much of the 2010s, payroll alone was a poor predictor of regular-season wins. This was the mature analytics era. Front offices had become better at finding value, developing players, managing service time, building bullpens, and exploiting inefficiencies. The market was not perfectly efficient, but it was changing.

Then the relationship strengthened again from 2019 through 2022:

2019: .169
2020: .028
2021: .173
2022: .365

The shortened 2020 season complicates the story, but the surrounding years suggest a partial return of payroll power. One possible explanation is that the richest teams had learned to combine financial strength with analytical sophistication. In the early Moneyball period, efficiency sometimes worked against spending. By the late 2010s and early 2020s, the richest teams were often efficient too.
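A quick bit of arithmetic on the R² values listed above makes the contrast between the two stretches concrete. The numbers are the ones quoted in this post; the choice to also show the later stretch without the shortened 2020 season is mine.

```python
from statistics import mean

weak_stretch = {2012: .038, 2013: .107, 2014: .088, 2015: .045,
                2016: .197, 2017: .064, 2018: .081}
rebound = {2019: .169, 2020: .028, 2021: .173, 2022: .365}

weak_avg = mean(weak_stretch.values())        # about 0.089
rebound_avg = mean(rebound.values())          # about 0.184
rebound_full_seasons = mean(                  # about 0.236, leaving out 2020
    value for year, value in rebound.items() if year != 2020
)

print(f"2012-2018 average R²: {weak_avg:.3f}")
print(f"2019-2022 average R²: {rebound_avg:.3f}")
print(f"2019-2022 without 2020: {rebound_full_seasons:.3f}")
```

On these figures, payroll explained roughly twice as much of the win variation in 2019 through 2022 as it did in 2012 through 2018, and more than that once the 60-game season is set aside.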

That may be the modern equilibrium. The market inefficiencies did not disappear. They became harder to monopolize.

The Dodgers, Yankees, Mets, Phillies, Braves, Astros, and Padres were not simply spending. They were spending within more sophisticated baseball operations. Money was no longer just buying accomplished players. It was buying depth, optionality, injury insurance, and a larger margin for error.

Still, the overall composite number remains modest. An average R² of 0.154 means that about 85% of the variation in wins was not explained by Opening Day payroll. That remaining space is where baseball lives.

It includes player development. Injuries. Aging curves. Breakout seasons. Bullpen volatility. Schedule effects. Defensive positioning. Farm systems. Luck in one-run games. Sequencing. Clubhouse decisions. Deadline trades. Prospects arriving ahead of schedule. Veterans flaming out suddenly. This is all part of the complicated and powerful mathematics of 162 games.

Payroll matters. But baseball resists being bought cleanly. Perhaps the better question is not whether money buys wins. It is whether money buys a higher probability of avoiding disaster.

The important point is that rich teams can fail, but they often fail from a higher starting point. Poor teams can succeed, but they usually need more things to go right. Their margin for error is thin. Payroll raises the floor more reliably than it raises the ceiling.

The 2023 season is the best reminder. A massive payroll could not save the Mets. It could not save the Padres from mediocrity. It could not turn the Angels into contenders. At the same time, Baltimore showed how a young, inexpensive core can overturn the model entirely.

Then, 2024 and 2025 moved back toward the middle. Payroll explained 12.9% of the variation in wins in 2024 and 22.2% in 2025. Not irrelevant. Not overwhelming.

That may be the honest conclusion of the whole study: Money matters in MLB, but its power is unstable.

It explains a meaningful slice of success, but only a slice. Some years it looks like a structural force. In other years, it looks almost irrelevant. The relationship rises and falls depending on whether high-payroll teams are competent, whether low-payroll teams are rebuilding or emerging, and whether baseball’s many uncertainties conspire to surprise.

The numbers do not support the simplest fan argument: that teams can simply buy their way to wins. But they also do not support the romantic opposite: that payroll does not matter.

The truth is more interesting. Payroll is one variable in a noisy, messy system. It is a resource; it creates possibilities. As for certainty, in Major League Baseball, there is no such thing.

MLB Payroll Versus Wins in 2025

Money matters in baseball; it always has. In fact, it is the answer to any and every question you can conjure. The ability to absorb bad contracts, retain stars, and build depth across a 162-game season creates structural advantages that smaller-market clubs often cannot replicate. A sense of fairness is not built into MLB’s financial model, yet the 2025 season once again demonstrated that payroll is not destiny.

Using Opening Day payroll data and final 2025 win totals, I examined the relationship between team spending and regular-season success. The result is less powerful than many fans would intuitively expect.

Figure 1 shows the relationship between payroll and wins across Major League Baseball in 2025. The regression equation is:

Wins = 67.72 + 0.078(Payroll in Millions) with an R² value of 0.239.

In practical terms, payroll explained only about 24% of the variation in team wins during the season. That means that roughly three-quarters of team performance emerged from factors outside direct payroll expenditure.

At the upper end of the spectrum, the expected pattern generally held. The Dodgers, Phillies, Yankees, and Blue Jays all combined elite payrolls with strong regular-season performance. Financial commitment clearly raises a team’s expected floor. Wealthier organizations can survive injuries, carry deeper benches, and absorb roster inefficiencies in ways that smaller-market teams often cannot.

But the more interesting story exists away from the regression line (maybe far away).

Milwaukee won 97 games despite operating with a payroll barely above $115 million. Cleveland won 88 games with a payroll just over $100 million. Seattle and Detroit also won substantially more games than their spending predicted. These organizations extracted disproportionate value from player development, roster optimization, and organizational stability.

Conversely, several high-payroll clubs struggled to convert spending into victories. The Angels remained trapped in mediocrity despite nearly $191 million in payroll. Minnesota significantly underperformed expectations. Colorado combined middling payroll with catastrophic results, illustrating that payroll inefficiency can emerge at any spending level.

Perhaps most importantly, the relatively modest R² value suggests that baseball retains a degree of competitive entropy that distinguishes it from some other professional sports. Payroll creates leverage, but not certainty. Injuries, player aging curves, bullpen volatility, prospect development, managerial decisions, and even sequencing luck all remain deeply influential.

The relationship between money and wins, therefore, resembles probability rather than inevitability.

A large payroll buys optionality. It increases the margin for error. It raises the expected baseline. But it cannot fully eliminate randomness, inefficiency, or organizational dysfunction. Meanwhile, well-run smaller-market franchises continue to demonstrate that disciplined systems can partially offset financial asymmetry.

Baseball still resists complete economic determinism, even though the disparity in spending bothers me. That may be one of the reasons (among many) the sport remains analytically fascinating. It may also be why we do not have baseball next season…

Figure 2 reframes the analysis by examining residual performance, or wins above and below payroll expectations. Milwaukee stands out dramatically, outperforming its payroll-derived expectation by more than 20 wins. Cleveland, Seattle, Detroit, and the Cubs also generated unusually strong returns on relatively minimal investment.

Interestingly, the scatter around the regression line itself may tell the larger story. If payroll truly governed outcomes in a deterministic fashion, the league would collapse toward a far tighter relationship. Instead, MLB continues to display substantial dispersion, suggesting that organizational intelligence (and the analytical insight it implies) still matters.

Perhaps that is baseball’s enduring equilibrium. Money shifts probabilities, but it does not fully control outcomes.

Postscript

Anything on your mind? Take a glance back at Figure 1. See the Rockies off all by themselves? That captured my interest. I decided to wait for a postscript to address it. I wonder who else noticed the anomalous Rockies in the plot…

The Rockies are noteworthy because they dramatically underperformed even the relatively modest expectation their payroll set. In the regression framework, they occupy the extreme lower-left corner of the plot, far below the line.

They spent roughly $121 million and won only 43 games.

Plugging that payroll into the regression equation gives the number of wins a team spending at that level would be expected to produce:

Wins = 67.72 + 0.078 (120.7)

That evaluates to approximately 77 wins. Instead, Colorado finished more than 30 wins below expectation, making them by far the largest negative residual in the dataset.
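The arithmetic above is easy to check. Here is a minimal sketch in Python that plugs payroll into the regression equation from this post and computes a team’s residual. The function names are mine, and the Milwaukee and Cleveland payrolls are rounded from the approximate figures quoted earlier ("barely above $115 million," "just over $100 million"), so treat those two residuals as rough.

```python
# Coefficients quoted in the post: Wins = 67.72 + 0.078 * (payroll in $M).
INTERCEPT = 67.72
SLOPE = 0.078


def expected_wins(payroll_millions):
    """Wins predicted by the payroll regression line."""
    return INTERCEPT + SLOPE * payroll_millions


def residual(actual_wins, payroll_millions):
    """Actual wins minus payroll-expected wins (positive = overperformed)."""
    return actual_wins - expected_wins(payroll_millions)


# (payroll in $M, actual wins). Colorado's 120.7 is the figure used above;
# the other two payrolls are rounded approximations, not exact values.
teams = {
    "Colorado": (120.7, 43),
    "Milwaukee": (115.0, 97),
    "Cleveland": (100.0, 88),
}

for name, (payroll, wins) in teams.items():
    print(f"{name}: expected {expected_wins(payroll):.1f}, "
          f"residual {residual(wins, payroll):+.1f}")
```

Colorado comes out at about 77 expected wins, which puts the residual at roughly 34 wins below the line.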

What makes this especially unusual is that they were not operating with a tiny payroll. Truly low-payroll teams like the Marlins or Athletics at least possess the structural explanation of minimal financial investment. Colorado spent at a middle-tier MLB level and still produced historically poor outcomes.

Several factors likely contributed. First, Coors Field distortions complicate roster construction. Pitchers often underperform there, and evaluating offensive statistics cleanly becomes difficult. Second, the Rockies have struggled for years with continuity in player development and with integrating modern analytics relative to the rest of MLB. Third, their roster construction often appears caught between rebuilding and competing, producing neither elite prospects nor stable veteran performance. Fourth, they lacked the depth necessary to absorb injuries or underperformance.

Analytically, the Rockies matter in this study because they exert a substantial pull on the regression itself. They are a high-residual outlier, meaning that a single team can noticeably shift the fitted line and likely pulls down the R² value. If they were removed, we would get a different narrative. Anyone out there want to do that? Any young (or old) budding scientists interested in entering the arena? I am open for business.
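For anyone who wants to take up that invitation, here is one way to start: fit the line with and without a single team and compare. This is a sketch, not the analysis behind Figure 1. The function names are my own, and the six-team dataset is a toy I made up to show the mechanics (five teams sitting exactly on a line plus one far below it). Swap in the real 2025 payrolls and win totals to get a real answer.

```python
from statistics import mean


def fit_line(points):
    """Ordinary least squares on (payroll, wins) pairs.

    Returns (slope, intercept, r_squared).
    """
    xs = [p for p, _ in points]
    ys = [w for _, w in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    ss_tot = sum((y - my) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


def fit_without(teams, dropped):
    """Refit after removing one team by name."""
    return fit_line([pw for name, pw in teams.items() if name != dropped])


# TOY DATA, not the 2025 season: (payroll in $M, wins).
toy_teams = {
    "Team A": (100, 70), "Team B": (150, 75), "Team C": (200, 80),
    "Team D": (250, 85), "Team E": (300, 90),
    "Outlier": (120, 43),
}

with_all = fit_line(list(toy_teams.values()))
without = fit_without(toy_teams, "Outlier")
print("all teams:       slope %.3f, intercept %.1f, R^2 %.3f" % with_all)
print("outlier dropped: slope %.3f, intercept %.1f, R^2 %.3f" % without)
```

In the toy data the fit becomes perfect once the outlier is dropped, which is the exaggerated version of the question worth asking about Colorado.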

In some ways, Colorado becomes the mirror image of Milwaukee. Milwaukee demonstrated extreme organizational efficiency. Colorado demonstrated extreme organizational inefficiency, and that may or may not have been their fault.

That contrast may actually be the central lesson of this post. Organization, along with analytical luck and cleverness, means more than money.

Offensive Friction: A Few Thoughts on Baseball Metrics Along with a Proposal

Baseball statistics (and related metrics) promise clarity. After all, why else would people go to the trouble of creating them?

Metrics offer great appeal; a long, messy season (or even career) gets compressed into a number. A player’s apparent value becomes comprehensible. For example, OPS gives us a quick offensive summary, wOBA improves the weighting of offensive events, and wRC+ places hitters on a clean scale, with 100 being the league average. Statcast data adds another layer, telling us not only what happened, but what probably should have happened.

Each step in the above-referenced progression appears to be progress. But toward what exactly?

The problem is not that all the available metrics are inadequate. The real issue is that they are often answering different questions.

A hitter can have strong results and weak underlying indicators. Another hitter can have excellent contact quality and disappointing production. A third hitter can look ordinary overall but deliver his best moments in the highest-leverage situations. A high BABIP, a favorable run of matchups, or a few well-timed home runs can elevate a fourth.

Which hitter is better, or at least more desirable? The answer depends on what we are trying to measure.

That is why the search for one perfect offensive statistic may be ill-advised. Baseball offense is not one thing. It is a collection of related but distinct realities: production, process, context, opposition, and sustainability.

The more interesting question may not be, “Which metric is best?” The better question may be: Where do the metrics disagree?

The first way to see this is to compare actual production with expected production. If every hitter’s season were in perfect statistical balance, the points would fall neatly along the diagonal line. They do not.

Figure 1. Data for 2026 through the end of April. R² ≈ 0.658

Figure 1 compares actual production with expected production by plotting wOBA against xwOBA for each hitter. The scatter reveals a range of divergence. Players above the line have outperformed their expected results, while those below it have produced less than their contact quality and plate appearances would suggest. Mickey Moniak, for instance, sits well above the line, indicating stronger outcomes than underlying indicators might predict. In contrast, Ketel Marte and Jake Cronenworth fall below it, suggesting that their process may be better than their results to this point. The figure does not resolve which measure is more meaningful, but it makes visible the gap between them, which serves as the starting point for a proposal I will make later in the post.

The limits of a single number

One of the best offensive metrics in wide use is wRC+ because it does something very specific. It estimates a hitter’s total offensive production, adjusts for park and league factors, and places that production on an easy-to-read scale. A 120 wRC+ means a hitter has been 20 percent better than league average. A 90 wRC+ means he has been 10 percent worse than average.

That is useful and elegant. Perhaps more importantly, it is also intentionally incomplete.

wRC+ is not trying to tell us whether a hitter’s production is sustainable. It is not trying to tell us whether he has been lucky. It is not trying to tell us whether his best events came in the most important moments. It is not trying to measure the quality of the pitchers he faced in every plate appearance.

That is not a flaw. It is a design choice.

The trouble begins when we ask wRC+ to do more than it was built to do.

The same is true of expected statistics. xwOBA can tell us something about a hitter’s contact quality and plate appearances. It can suggest whether the underlying process supports his results. But xwOBA is not the same as actual value. A lineout with a high expected value may tell us something important about skill, but it did not move the runners. It did not change the scoreboard.

The expected value and the actual value are both real, but they are real in different ways: one describes what the contact deserved, the other what actually happened.

This is where offensive analysis becomes much more interesting.

Production, process, and context

Consider three hitters.

The first hitter has a high wRC+, a high xwOBA, strong exit velocity, a reasonable BABIP, and a stable strikeout-to-walk profile. There is not much mystery here. The production and the process agree. His production most likely matches his ability.

The second hitter has a high wRC+ but a modest xwOBA. His BABIP is unusually high. His barrel rate is ordinary. His hard-hit rate is fine but not exceptional. The results are good, but the foundation is less convincing. He may still be a good and accomplished hitter, but the numbers are not speaking with one voice.

The third hitter has a poor batting average and mediocre production, but his xwOBA is strong. He hits the ball hard. His launch angle is improving. His walk rate is stable. His BABIP is low. This is the kind of player who may be better than his surface line suggests.

Analyzing the first hitter is straightforward; the real investigation begins with the second and third hitters. They are not noteworthy because one number tells us the answer. They are interesting because several numbers are arguing with each other.

That disagreement deserves to be measured.

Offensive Friction

I am calling this idea Offensive Friction (OFx).

Offensive Friction is not meant to replace wRC+, wOBA, OPS+, xwOBA, BABIP, or Statcast indicators. It is meant to sit beside them and mediate disputes.

Its purpose would be simple: Identify hitters whose offensive indicators disagree.

A low-friction hitter is easy to interpret. His production, expected production, contact quality, plate discipline, and luck indicators all point in roughly the same direction.

A high-friction hitter is harder to interpret. His numbers contain tension. One part of the profile says breakout. Another says regression. One part says unlucky. Another says limited. One part says star. Another says mirage. That tension is the signal.

In conceptual terms:

Offensive Friction = the variance among a hitter’s standardized offensive indicators

The inputs could include:

wRC+

xwOBA

BABIP

Barrel rate

Hard-hit rate

Average exit velocity

Launch angle

Walk rate

Strikeout rate

Chase rate

Context value

Each metric would be converted into a standardized score. Then we would measure how widely those scores spread apart.

A hitter whose scores cluster together would have low Offensive Friction.

A hitter whose scores scatter across the map would have high Offensive Friction.

This would not tell us who is having the better season, but it would tell us who deserves a closer look.

Once the indicators are standardized, we can ask a different question: not who has the best offensive production, but whose profile contains the most tension.
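To make the definition concrete, here is a minimal sketch of the calculation in Python. Everything specific in it is an assumption on my part: the three hitters are hypothetical, the numbers are toy values rather than 2026 data, only four of the candidate inputs are used, and strikeout rate is sign-flipped so that a higher standardized score always means "better." Inputs such as BABIP and launch angle have no simple better-or-worse direction, so I left them out of the sketch.

```python
from statistics import mean, pstdev, pvariance

# Toy values for three hypothetical hitters; not real data.
HITTERS = {
    "Hitter A": {"wrc_plus": 140, "xwoba": 0.380, "barrel_rate": 14.0, "k_rate": 18.0},
    "Hitter B": {"wrc_plus": 140, "xwoba": 0.300, "barrel_rate": 5.0, "k_rate": 27.0},
    "Hitter C": {"wrc_plus": 80, "xwoba": 0.310, "barrel_rate": 6.0, "k_rate": 25.0},
}

# Assumption: for these metrics a lower raw value is the better outcome.
LOWER_IS_BETTER = {"k_rate"}


def standardize(hitters):
    """Convert every metric to a z-score across the group of hitters."""
    metrics = list(next(iter(hitters.values())))
    z = {name: {} for name in hitters}
    for metric in metrics:
        values = [stats[metric] for stats in hitters.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, stats in hitters.items():
            score = (stats[metric] - mu) / sigma
            z[name][metric] = -score if metric in LOWER_IS_BETTER else score
    return z


def offensive_friction(hitters):
    """Variance of each hitter's standardized scores (higher = more tension)."""
    z = standardize(hitters)
    return {name: pvariance(list(scores.values())) for name, scores in z.items()}


friction = offensive_friction(HITTERS)
for name, value in sorted(friction.items(), key=lambda kv: -kv[1]):
    print(f"{name}: OFx = {value:.3f}")
```

Hitter A and Hitter B have the same wRC+, but Hitter B’s other indicators point the opposite way, so his friction score is the highest of the three.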

Figure 2. Data for 2026 through the end of April.

Figure 2 introduces the idea of Offensive Friction in its simplest form by ranking hitters according to the degree of disagreement across their standardized offensive indicators. Rather than asking who has been most productive, the figure asks whose statistical profile is the most internally unstable. Players at the top of the chart, such as Cedric Mullins, exhibit the widest spread across metrics, with some indicators suggesting strength and others pointing in a different direction. Others near the top, including Luis Arraez and Oneil Cruz, show similar patterns of tension. By contrast, players further down the list have profiles in which the underlying numbers cluster more tightly together, indicating a more coherent and interpretable performance. The purpose of the figure is not to evaluate quality, but to identify where the numbers themselves are in disagreement, highlighting the players who warrant closer inspection.

Why disagreement matters

This is the part that is perhaps most interesting.

Baseball analysis usually treats disagreement as a problem to be solved. One metric says this. Another metric says that. We want to know which one is right, or at least most useful.

But maybe the disagreement itself is what we should be after.

A hitter with a 150 wRC+ and a 150 xwOBA+ is excellent, but not analytically mysterious. His results and process agree.

A hitter with a 150 wRC+ and a 100 xwOBA+ is different. His season may be productive, but the underlying indicators suggest caution. Maybe he has been fortunate. Maybe he has exploited a particular defensive pattern. Maybe he has hit a few poorly struck balls at perfect times. Maybe the expected model is missing something.

Either way, the disagreement is worth studying.

The reverse is also true. A hitter with an 85 wRC+ and a 125 xwOBA+ may be a rebound candidate. His results are poor, but the contact quality suggests something better. That does not mean improvement is guaranteed. It means the surface line may not be telling the full story.

This is where Offensive Friction could be useful. It would act as an alert system.

High friction would say: Do not stop at the leaderboard. Something interesting is happening here.

The equilibrium idea

There is another way to think about this.

Baseball performance is often moving toward equilibrium.

A hitter’s batting average may run hot for a few weeks. His BABIP may drift above his career norm. His home run rate may spike. His strikeout rate may briefly collapse. Early in a season, small samples can make ordinary players look transformed and struggling players look finished.

But over time, many numbers begin to settle.

Not always. Players do change. Swing paths change. Plate discipline changes. Strength changes. Health changes. Aging changes everything.

Still, the concept of equilibrium matters.

A hitter is close to offensive equilibrium when his production matches his process. His wOBA is close to his xwOBA. His BABIP is not wildly out of line with his batted-ball profile. His strikeout and walk rates fit his established skill set. His power output is supported by his contact quality.

A hitter is out of equilibrium when those pieces do not line up.

That disequilibrium can mean several things.

It can mean luck.

It can mean injury.

It can mean a real skills change.

It can mean a player is being misread by traditional statistics.

It can mean the model is missing something.

This is why the disagreement matters. It is not just noise. It is a clue.

A possible classification system

Offensive Friction could help classify hitters into types.

Type	Profile	Interpretation
I	High production, high process, low friction	The numbers agree
II	High production, weak process, high friction	Results may be ahead of skill
III	Low production, strong process, high friction	Better than the surface line
IV	Ordinary overall profile, high leverage value	Value concentrated in key moments
V	Average production, average process, low friction	Little mystery
VI	Strong changes across some indicators, conflict across others	Real change or temporary spike

This kind of framework would be more useful than another leaderboard.

It would not simply tell us who ranks first. It would tell us what kind of interpretive problem each hitter presents.

That is important because a baseball season is not just a sorting exercise; it is a diagnostic exercise. We are not only asking who has performed well. We are asking what that performance means.

Friction tells us that the numbers disagree. The Equilibrium Gap tells us the direction of that disagreement.
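The direction is the simpler of the two pieces to compute. As a sketch only, and assuming the gap is read as actual production minus expected production (wOBA minus xwOBA), it might look like the Python below. The function names, the labels, and the 20-point tolerance are all placeholders of mine, not settled parts of the proposal.

```python
def equilibrium_gap(woba, xwoba):
    """Signed gap: positive means results are running ahead of process."""
    return woba - xwoba


def describe_gap(gap, tolerance=0.020):
    """Label the direction of the gap. The tolerance is an arbitrary placeholder."""
    if gap > tolerance:
        return "results ahead of process"
    if gap < -tolerance:
        return "process ahead of results"
    return "near equilibrium"


# Hypothetical (wOBA, xwOBA) pairs, not real players.
for woba, xwoba in [(0.360, 0.310), (0.300, 0.345), (0.330, 0.325)]:
    gap = equilibrium_gap(woba, xwoba)
    print(f"wOBA {woba:.3f}, xwOBA {xwoba:.3f}: {gap:+.3f} ({describe_gap(gap)})")
```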

Figure 3. Data for 2026 through the end of April.

Figure 3 places Offensive Friction alongside overall production, allowing us to see not just how well a hitter has performed, but how stable or interpretable that performance is. The horizontal axis measures the degree of disagreement among a player’s underlying indicators, while the vertical axis reflects his overall offensive output. The quadrant structure provides a simple framework: hitters in the upper left combine strong production with internal consistency, while those in the upper right are producing at a high level but with profiles that contain tension, making them less certain going forward. The lower right quadrant is especially interesting, as it captures players with weak results but high friction, suggesting that their underlying indicators may point to something better than the surface line. Cedric Mullins, for instance, falls into this region, pairing low production with a highly unstable profile. Meanwhile, players like Luis Arraez and Oneil Cruz occupy the high-friction, higher-production space, where strong results coexist with less agreement beneath the surface. The figure does not resolve which interpretation is correct, but it identifies where the most interesting analytical questions reside.

The philosophical problem

Every baseball metric contains a philosophy.

OPS values simplicity.

wOBA values proper event weighting.

wRC+ values context-neutral offensive production.

xwOBA values underlying process.

WPA values game situation and timing.

BABIP points us toward luck, contact profile, and defensive interaction.

None of these numbers is the whole truth. Each one chooses a version of its specific truth.

That is why one-number arguments can become misleading. A player can be more valuable than he is skilled. He can be more skilled than he has been productive. He can be productive in a way that is unlikely to continue. He can be unlucky without being good. He can be lucky and still be excellent.

The categories overlap, but they are not identical.

This is why I am proposing the idea of Offensive Friction. It does not pretend to solve all of this. It begins by admitting the complexity.

The goal is not to flatten the hitter into one final answer.

The goal is to identify where the narrative bends or even breaks.

What this would add

A metric like Offensive Friction would be especially useful early in the season.

In April and May, leaderboards are unstable. A few bloop hits can inflate a batting average. A few warning-track outs can suppress a slugging percentage. One series in a favorable ballpark can distort the picture. One bad week can make a good hitter look lost.

A friction model would help distinguish stable from unsettled performance.

It could identify:

players whose hot starts are supported by process,
players whose hot starts look fragile,
players whose poor results hide strong underlying skill,
players whose surface numbers and expected numbers are beginning to diverge,
players whose profiles have genuinely changed.

That is more interesting than simply ranking hitters. It gives us a way to ask better questions.

The same friction score can come from very different profiles. A radar view helps show why one high-friction player may be a mirage, while another may be a hidden riser.

Figure 4. Data for 2026 through the end of April.

Figure 4 shifts the focus from outcomes to structure. Each polygon represents a hitter’s standardized offensive profile across several underlying indicators, allowing us to see not only how good a player has been overall, but also how his components align or diverge. A more balanced, compact shape suggests agreement among metrics and a profile closer to equilibrium. A jagged or uneven shape reveals tension, where certain indicators pull in different directions. Cedric Mullins, for example, displays a visibly uneven profile, with strengths in some areas offset by weaknesses in others, a hallmark of high friction. Ketel Marte shows a more coherent structure, with metrics that move together more consistently. Jake Cronenworth sits between these extremes. The purpose of the figure is not to rank hitters, but to reveal the internal shape of their performance, highlighting where the underlying indicators agree and where they do not.

Conclusion

The future of offensive analysis (and defensive and pitching as well) may not be another statistic that claims to replace the old ones. It just might be a model that explains why the old ones disagree.

That is the larger lesson. Baseball offense (and defense) is not a single reality. It is actual production, expected production, contact quality, plate discipline, timing, opposition, luck, and sustainability. Each metric captures part of that structure. None captures all of it.

So maybe the most interesting hitters aren’t always the best. Maybe they are the hitters whose numbers have not yet settled into agreement.

That is where the analysis should begin. Because sometimes the story is not found in the statistics themselves. Sometimes the story is found in the friction between them.

The Shape of the American League (so far): A Three-Dimensional Look at Team Strength

Wins and losses tell us what has happened. Composite metrics can help explain why.

Using offensive, pitching, and defensive data through May 12, 2026, I combined multiple American League team metrics into a standardized z-score framework. Each category was normalized relative to league averages, allowing offensive production, run prevention, and fielding quality to be evaluated on the same scale.

Rather than relying on a single statistic, this approach attempts to measure organizational balance. Teams receive positive scores when they perform above league average and negative scores when they fall below it. For pitching categories such as ERA and WHIP, where lower values are better, the z-scores were inverted so that stronger performance always produced a higher score.

The result is less a standings table and more a multidimensional map of each team’s underlying quality.
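For readers who want to see the mechanics, here is a stripped-down sketch of that kind of composite in Python. It is an illustration under assumptions, not the exact model behind Figure 1: the three teams and their numbers are made up, only three categories are included, and every category is weighted equally.

```python
from statistics import mean, pstdev

# Toy values for three hypothetical teams; not real 2026 data.
TEAMS = {
    "Team X": {"ops": 0.780, "era": 3.20, "fielding_pct": 0.988},
    "Team Y": {"ops": 0.720, "era": 4.10, "fielding_pct": 0.985},
    "Team Z": {"ops": 0.690, "era": 4.60, "fielding_pct": 0.982},
}

# Categories where a lower raw value is better, so the z-score is flipped.
INVERTED = {"era", "whip"}


def composite_scores(teams):
    """Sum of per-category z-scores, with lower-is-better categories inverted."""
    totals = {name: 0.0 for name in teams}
    categories = list(next(iter(teams.values())))
    for category in categories:
        values = [stats[category] for stats in teams.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, stats in teams.items():
            z = (stats[category] - mu) / sigma
            totals[name] += -z if category in INVERTED else z
    return totals


scores = composite_scores(TEAMS)
for name, total in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: composite z = {total:+.2f}")
```

Because each category is centered on the group average, the composite scores always sum to zero: one team’s surplus is another team’s deficit.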

Figure 1: Composite AL Team Strength Through May 12, 2026

Yeah, the Yankees stand alone. They are out there by a large margin.

New York’s profile is unusually complete. They combine the league’s strongest offensive output with elite pitching performance, producing separation that becomes obvious once the categories are standardized. The offensive metrics are overwhelming enough on their own, but pairing them with the AL’s best ERA and WHIP creates a profile that resembles something other than just a hot start.

Perhaps most importantly, the Yankees are not merely winning through one dominant dimension. Many early-season contenders are sustained by either explosive offense or temporary pitching overperformance. New York grades strongly in both simultaneously. That is a big deal.

The Astros occupy a fascinating second tier. Houston’s offense remains extremely dangerous, leading the league in batting average while ranking near the top in slugging and run production. Yet the pitching profile is significantly weaker than expected, especially relative to prior Astros teams. Their overall placement illustrates how overwhelming offensive production can partially compensate for poor run prevention, at least over a 40-game sample.

The Athletics may be the most analytically surprising team in the league so far. Their composite score benefits from a quietly balanced structure. They field exceptionally well, avoid major pitching collapse, and generate enough offense to remain consistently above average across categories. This is not a team built around dominance. It is a team built around the absence of glaring weakness.

Cleveland fits a similar pattern. The Guardians do not dominate the league in any single category, but they remain consistently competitive across all three phases of the game. Their strong fielding profile, solid strikeout numbers, and competent offense produce one of the most stable composite structures in the American League.

Seattle grades better analytically than its current record suggests. The Mariners continue to pair strong pitching with quality fielding, even though the offense remains uneven (at best). Their underlying structure implies a team that could improve substantially if the bats normalize.

Meanwhile, Tampa Bay presents one of the more interesting contradictions in the league. The Rays possess one of the best records in the AL, yet their composite z-score profile remains only modestly above average. This may indicate sequencing luck, strong leverage performance, or simply an ability to maximize close games. Interestingly, this has long been a recurring characteristic of Tampa Bay baseball.

At the bottom of the rankings sit Baltimore and Boston. Neither team displays a single catastrophic weakness. Instead, the issue is cumulative mediocrity. Once standardized, multiple slightly below-average categories compound into significantly negative total scores. The Orioles, in particular, have struggled to prevent runs while failing to separate offensively from the league middle.

This raises an important analytical point. Baseball teams are often discussed in singular terms: “great offense,” “elite rotation,” “bad defense.” But actual team quality emerges from interaction effects across systems. Strong fielding can amplify pitching. High-strikeout staffs reduce defensive volatility. Power-heavy offenses can partially absorb bullpen instability.

The z-score approach attempts to capture some of that interconnected structure. As you know, it is a favorite strategy of mine.

No model perfectly predicts future outcomes, especially in May. Small samples remain volatile. Injuries reshape rosters quickly. Regression arrives unevenly. Yet early-season standardization can still reveal organizational identity. Some teams already appear structurally coherent. Others appear fragile despite respectable records.

And at the moment, one conclusion appears difficult to avoid: The Yankees are not simply leading the American League; they are performing as a dominant team would.

Defensive Ecosystems Behind the Plate: How Good is Patrick Bailey?

I grew up in, and still live in, Northeast Ohio. I have long suffered from rooting for the Indians, now known as the Guardians. It hasn’t been a pleasant journey. The last time we won a World Series was 1948, and I do not see another victory on the horizon. So it goes.

I woke up early this morning after hearing we traded for Patrick Bailey, a two-time Gold Glove-winning catcher. I knew Bailey was good. I decided to see just how exceptional he is behind the plate. I used 2025 data for the following study.

Baseball analysis often reduces catchers to a handful of familiar metrics. Framing. Pop time. Arm strength. Caught stealing percentage. Blocking runs. Yet the position itself resists simple categorization. Some catchers suppress the running game through elite exchanges and quick releases. Others survive on receiving skills and pitch presentation. A few manage to combine multiple defensive strengths into unusually complete profiles.

This analysis attempts to move beyond simple rankings by examining the structure of catcher defense. Rather than asking merely who the best defensive catchers were, I wanted to explore a deeper question: Are there distinct defensive ecosystems among modern MLB catchers? Note that this post does not consider how a catcher handles a pitching staff.

To investigate this, I combined multiple publicly available defensive datasets covering:

blocking
throwing
framing
exchange time
pop time
arm strength
caught stealing metrics
related subcomponents

Every metric was standardized using z-scores to allow catchers to be compared on a common scale (something I often do). From there, the project unfolded in several stages:

creation of composite defensive scores
principal component analysis (PCA)
hierarchical clustering
dendrogram construction
an “Unusualness Index” measuring statistical distance from the average catcher profile

The result was less a ranking exercise and more an exploration of what can be termed defensive geography. And yes, all this work was done because I was curious about our new catcher.

Building the Defensive Landscape

The first step was to construct an overall defensive score across three broad categories: blocking, throwing, and framing. Each category itself was built from multiple underlying z-scored metrics.

This approach allowed catchers to be evaluated across multiple dimensions simultaneously rather than through isolated statistics. But even this composite score quickly revealed an important limitation: Two catchers could arrive at nearly identical defensive totals through very different defensive pathways. That observation became the central motivation for the clustering analysis.

PCA and Defensive Geography

Principal Component Analysis compresses high-dimensional data into a smaller number of interpretable axes. I don’t know about you, but I find it very difficult to think in anything more than two dimensions.

In this dataset:

PC1 explained approximately 36% of total variance
PC2 explained roughly 19%

Together, they created a two-dimensional map of modern catchers’ defensive strategies.
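If "explained variance" sounds abstract, the sketch below shows where such percentages come from. It is not the code behind this study. It uses a toy table of five hypothetical catchers and three metrics, and it finds the components with plain power iteration on the correlation matrix instead of a library PCA routine, purely so the example runs with nothing but the Python standard library.

```python
from statistics import mean, pstdev

# Toy rows (one per hypothetical catcher): framing, blocking, throwing.
TOY_ROWS = [
    [2.0, 1.5, -0.5],
    [1.0, 1.2, 0.8],
    [0.0, -0.2, -1.0],
    [-1.0, -0.8, 1.2],
    [-2.0, -1.7, -0.5],
]


def standardize_columns(rows):
    """Z-score each column so every metric is on the same scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def correlation_matrix(z):
    """Covariance of z-scored columns, which is the correlation matrix."""
    n, k = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / n for j in range(k)] for i in range(k)]


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    k = len(matrix)
    vec = [1.0 / (i + 1) for i in range(k)]  # arbitrary nonzero start
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    mv = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
    return sum(a * b for a, b in zip(vec, mv)), vec


def explained_variance(rows, components=2):
    """Share of total variance captured by each of the leading components."""
    matrix = correlation_matrix(standardize_columns(rows))
    k = len(matrix)
    total = sum(matrix[i][i] for i in range(k))
    shares = []
    for _ in range(components):
        value, vec = top_eigenpair(matrix)
        shares.append(value / total)
        # Deflate: remove the component just found before looking for the next.
        matrix = [[matrix[i][j] - value * vec[i] * vec[j] for j in range(k)]
                  for i in range(k)]
    return shares


shares = explained_variance(TOY_ROWS)
print("PC1 share: %.1f%%, PC2 share: %.1f%%" % (100 * shares[0], 100 * shares[1]))
```

In the toy table, framing and blocking move together, so the first component absorbs most of the variance and the second picks up throwing. With more metrics and less redundancy, the shares spread out, which is how a real dataset lands on figures like 36% and 19%.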

The PCA visualization immediately suggested that catchers naturally separate into different defensive archetypes rather than forming a single continuous population. The discussion does get nuanced.

Some clustered around framing skill. Others around throwing and athleticism. A handful appeared unusually isolated. Most notably, Patrick Bailey emerged not only as one of the strongest overall defenders in the dataset, but also as one of the most statistically unusual.

Figure 1: PCA Map of Defensive Catcher Profiles

The clusters in the PCA plot represent groups of catchers with similar overall defensive structures rather than similar rankings. A catcher can therefore occupy the same broad defensive tier as another player while still existing within an entirely different defensive ecosystem. This result somewhat surprised me.

Hierarchical Clustering and Catcher Archetypes

To explore those ecosystems further, I applied Ward hierarchical clustering to the standardized defensive profiles.

Unlike simple rankings, hierarchical clustering groups players according to the shape of their statistical profiles. I find this to be an interesting way to look at the data. For example, Patrick Bailey and Alejandro Kirk finished with very similar overall defensive scores. Yet the clustering analysis separated them because they appear to provide defensive value through different skill combinations.

Bailey profiles as a rare hybrid:

elite framing
strong throwing traits
positive blocking metrics

Kirk, meanwhile, appears more specialized toward:

framing
blocking
receiving skill

The dendrogram reveals these structural differences visually.
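The idea that two catchers can match on total score and still land in different clusters is easy to demonstrate. Below is a deliberately naive version of Ward’s method written from scratch in Python. It repeatedly merges the two groups whose union adds the least within-group variance. The four catchers are hypothetical and their z-scores are invented for illustration. A real analysis would more likely use a library routine and would also produce the full merge tree for a dendrogram.

```python
# Hypothetical z-score profiles: (framing, throwing, blocking).
# Catchers A and B have the same total (4.0) but very different shapes.
PROFILES = {
    "Catcher A": (2.0, 1.5, 0.5),
    "Catcher B": (2.1, -0.5, 2.4),
    "Catcher C": (1.9, 1.4, 0.6),
    "Catcher D": (2.0, -0.4, 2.3),
}


def centroid(points):
    """Mean profile of a group of catchers."""
    return [sum(dim) / len(points) for dim in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if groups a and b merge."""
    ca, cb = centroid(a), centroid(b)
    dist_sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist_sq


def ward_clusters(profiles, k):
    """Merge the cheapest pair of clusters until only k clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: ward_cost(
            [profiles[n] for n in clusters[ij[0]]],
            [profiles[n] for n in clusters[ij[1]]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


clusters = ward_clusters(PROFILES, k=2)
for members in clusters:
    print(sorted(members))
```

Catchers A and B finish with identical totals, yet A is grouped with C and B with D, because the clustering looks at the shape of the profile, not the sum.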

Figure 2: Dendrogram of Top Defensive Catchers

Several major catcher ecosystems emerged:

Cluster 1: Elite Defensive Hybrids

These catchers combined strong framing with excellent throwing skills. Representative players:

Patrick Bailey
Austin Hedges
Tyler Heineman

Cluster 2: Athletic Throwing Specialists

These catchers leaned heavily into:

arm strength
exchange speed
suppression of the running game

Representative players:

J.T. Realmuto
Endy Rodríguez

Cluster 3: Balanced Traditional Catchers

The statistical center of gravity for the position.

Competent across categories without extreme specialization.

Representative players:

Gabriel Moreno
Christian Vázquez

Cluster 4: Offense-First or Declining Defenders

Catchers whose defensive metrics trended negatively despite offensive value or prior reputations.

Representative players:

Salvador Perez
Yainer Diaz

Cluster 5: Extreme Outliers

In this case, Agustín Ramírez emerged with a statistically isolated profile unlike that of any other catcher in the dataset.

The Unusualness Index

One of the more interesting outputs of the project was the creation of an “Unusualness Index.” Conceptually, it measures how far a catcher’s defensive profile lies from the league-average catcher.

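Mathematically, it is the distance between a catcher’s standardized defensive profile and the league-average profile.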

Large values indicate:

rare defensive combinations
extreme strengths or weaknesses
hybrid skill profiles
statistical isolation

Interestingly, some of the most unusual catchers were not necessarily the best overall defenders. That distinction may be one of the most important findings in the study. Elite value and statistical uniqueness are related, but they are not identical concepts.
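Here is one way an index like this could be computed. Treat it as a sketch rather than the exact formula used in the study. I am assuming plain Euclidean distance from the origin after each metric is converted to a z-score, so the league-average catcher sits at zero on every axis. The five catchers and their numbers are hypothetical.

```python
import math
import statistics

# Hypothetical raw values per catcher: (framing, blocking, throwing).
profiles = {
    "Catcher A": (6.0, 3.0, 3.0),
    "Catcher B": (1.0, 0.0, 0.0),
    "Catcher C": (0.0, 1.0, -1.0),
    "Catcher D": (-8.0, -1.0, 5.0),
    "Catcher E": (1.0, -3.0, -7.0),
}

def zscores(table):
    """Standardize every metric so the league average is 0 on each axis."""
    cols = list(zip(*table.values()))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return {
        name: [(x - m) / s for x, m, s in zip(row, means, sds)]
        for name, row in table.items()
    }

z = zscores(profiles)

# Assumed form of the index: Euclidean distance from the average profile.
unusualness = {name: math.sqrt(sum(v * v for v in row)) for name, row in z.items()}

# A simple overall score for comparison: the sum of the z-scores.
overall = {name: sum(row) for name, row in z.items()}

most_unusual = max(unusualness, key=unusualness.get)
best_overall = max(overall, key=overall.get)
print("most unusual:", most_unusual)
print("best overall:", best_overall)
```

Distance ignores direction, so a catcher who is far below average lands just as far from the center as one who is far above it. That is why the most unusual catcher in this toy example is not the best overall defender.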

Figure 3: Most Unusual Catchers in 2025

The Patrick Bailey Question

Perhaps the most fascinating result involved Patrick Bailey. Bailey ranked near the top of the defensive leaderboard while also appearing among the most unusual defensive profiles in the dataset. That combination is rare.

Most players become unusual because they possess one overwhelming specialization or weakness. Bailey appears unusual because he performs unusually well across multiple difficult defensive dimensions simultaneously.

The clustering analysis, therefore, suggests that Bailey is not merely “good.” He may represent a relatively uncommon defensive archetype altogether. The man is special.

Figure 4: Most Accomplished Defensive Catchers in 2025

Final Thoughts

Traditional baseball analysis often searches for single metrics capable of defining defensive quality. But catcher defense appears fundamentally multidimensional.

There is no single pathway to defensive value behind the plate.

Some catchers thrive through receiving.
Others through athleticism.
Others through balance.
A few through genuinely rare hybrid profiles.

The PCA map and dendrogram reveal something that simple rankings cannot: catcher defense is not a ladder. It is an ecosystem. And within that ecosystem, certain players appear to occupy unusually isolated terrain. I can’t wait to see Patrick Bailey in a Guardians uniform.

The Shape of a Decision: MLB Catcher Stances

Back in the 1970s, I had a small black-and-white TV in my bedroom. I would watch baseball games late into the night as I fell in and out of sleep. I was, of course, a Cleveland Indians fan, so those games were at the top of my list. If the Indians were off, and I switched to channel 35, and the antenna was just so, I could get Pittsburgh Pirates games. I watched a lot of their games as well.

One of my best memories of those Pirates games is watching the great Manny Sanguillén catch. He was unusual in that he would drop a knee while waiting on the pitch. I also recall him sticking the other leg way out to the side so that he could get lower in his stance. I don’t recall any other catchers dropping to a knee in that era. He was a great player, and he remains one of my all-time favorites.

As I watch games today, I am seeing all the catchers drop to a knee as the pitcher winds up. Certainly, this has to be worthy of a post, right? As it happens, I found some interesting stuff. The following two figures tell part of the story, and it is a story worth hearing.

The first figure is straightforward. The percentage of pitches received from a one-knee stance rises from 23% in 2020 to 96% in 2026. What begins as a minority behavior becomes, in short order, the default condition of the position. By the end of the period, the alternative has nearly disappeared.

The second figure complicates this.

Instead of levels, it shows change. Year-to-year percentage growth in one-knee usage spikes dramatically early, then declines just as quickly. The initial jump exceeds 100%. After that, the rate of increase falls, first sharply, then gradually, until it approaches zero.
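The calculation behind the second figure is simple, and a short sketch may help. Only the 2020 and 2026 values (23% and 96%) come from the data described above. The in-between years are placeholders I made up to mimic the shape of the curve, so the printed growth rates are illustrative only.

```python
# One-knee usage by season, in percent of pitches received.
# 2020 and 2026 are the endpoints quoted in the post; 2021-2025 are
# hypothetical placeholder values chosen only to mimic the curve's shape.
one_knee_share = {
    2020: 23.0,
    2021: 50.0,
    2022: 70.0,
    2023: 82.0,
    2024: 90.0,
    2025: 94.0,
    2026: 96.0,
}

def yoy_growth(levels):
    """Percent change from each season to the next, keyed by the later season."""
    years = sorted(levels)
    return {
        year: (levels[year] - levels[prev]) / levels[prev] * 100.0
        for prev, year in zip(years, years[1:])
    }

growth = yoy_growth(one_knee_share)
for year, pct in growth.items():
    print(f"{year}: {pct:+.1f}% vs. prior season")
```

Because each change is divided by the prior season's level, a jump from a small base looks enormous, while the same number of percentage points added near the ceiling barely registers. That is why the level keeps climbing even as the growth rate collapses.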

Taken together, these figures describe something more precise than simple adoption. They describe timing.

Interestingly, the decision appears to occur well before the endpoint. The first figure suggests a continuous rise through 2026. The second suggests that the meaningful shift happens earlier, closer to 2021–2023. After that, the system is no longer deciding. It is consolidating.

Perhaps most importantly, this pattern aligns with a familiar structure. Early adopters move aggressively, often extracting outsized value. My bet is that this is due to pitch framing. The rest of the system follows, not because the marginal gains remain large, but because the uncertainty has been resolved. Once that threshold is crossed, the behavior spreads regardless of diminishing returns.

This raises a natural question. If the rate of change collapses while the level continues to rise, what is driving the final stages of adoption?

The answer is likely institutional rather than individual. At some point, the technique ceases to be optional because the data says that it is the right thing to do. Consequently, it becomes embedded in instruction, in development, and (most importantly) in expectation. Young catchers entering the league are not choosing the one-knee stance. They are inheriting it.

This is where the first figure can mislead if taken alone. A rising line suggests ongoing discovery. The second figure suggests the opposite. Discovery is front-loaded. What follows is replication.

There is also a subtle implication for evaluation. If most of the informational gain occurs early, then later adopters are operating in a different environment. They are not testing a hypothesis. They are implementing a standard. Any performance differences observed in the later years must therefore be interpreted within a system that has already converged.

That convergence is the quiet endpoint of the process. By 2025 and 2026, the rate of change is minimal. Not because the idea has failed, but because nearly everyone has already adopted it. The system has reached equilibrium.

And so, the two figures resolve into a single observation. The transformation of catching technique did not take six years; it took two or three.

Now comes the surprising part. I wondered which knee should be dropped. Essentially every big-league catcher throws right-handed, so handedness is not a consideration. Fortunately, I found raw data on this. Take a look at the following figure. I find it fascinating.

The figure reveals a subtle but meaningful asymmetry in how the one-knee stance has been adopted. Early in the period, the right-knee-up configuration is more common, but over time, the balance shifts decisively toward the left-knee-up orientation. By 2026, the split is no longer close, with the left-knee-up approach clearly dominant. Interestingly, this suggests that the evolution of the stance is not simply about going to one knee, but about settling into a preferred directional setup. The change appears gradual rather than abrupt, implying that once the broader adoption decision was made, the league continued to refine how the stance is executed rather than whether it should be used at all.

How about that? As of now, I have no idea why the preference shifted. The only thing I know is that it certainly was data-driven. My best guess is that it has something to do with giving the umpire a certain perspective when pitches are being framed, namely the look the catcher wants the ump to have.

I am open to ideas. If you like, let me know what you think in the comments. This post certainly is fodder for a spirited discussion.