Segregation-adjusted WAR leaderboard

Moderator: Palmtana


MaxPower

  • Posts: 734
  • Joined: Sat Sep 24, 2016 2:12 am

Segregation-adjusted WAR leaderboard

Posted: Tue May 10, 2022 1:14 pm

Got a little project to share that isn't directly related to Strat but this community might find it interesting.

We now know enough about the Negro Leagues that people smarter than me can project how those players would have performed had they been allowed to compete in organized ball. Going a step further, it's possible to add those career projections (called Major League Equivalencies, or MLEs) into the AL and NL, subtract the white players they would have replaced, and adjust everyone's replacement level accordingly, thereby modeling a historical integrated MLB. And that's what I've done.

Here is a PDF of the top 500 WAR totals adjusted for segregation, era, and length of schedule. Catchers also get a lower replacement level so that they get fair representation among the top 278 HOF-eligible players* (278 is the number of players currently in the HOF; I will call this set of players the T278 going forward). Each score listed is the player's adjusted WAR total divided by the 278th-best HOF-eligible player's total, expressed as a percentage. So Jake Peavy's score of 82 means his adjusted WAR total is 82% of the total of the player at the HOF borderline, Buzz Clarkson (whose score is 100).
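
For anyone who wants to see the score arithmetic spelled out, here is a minimal sketch. The function name and the WAR totals are made up for illustration (they are not the figures from the PDF), and it assumes every player passed in is HOF-eligible.

```python
def segregation_scores(adjusted_war, borderline_rank=278):
    """Scale adjusted WAR so the player at `borderline_rank` scores 100.

    adjusted_war: dict of player name -> adjusted career WAR
                  (assumed to contain HOF-eligible players only).
    """
    ranked = sorted(adjusted_war.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < borderline_rank:
        raise ValueError("need at least borderline_rank players")
    borderline_war = ranked[borderline_rank - 1][1]
    return {name: round(100 * war / borderline_war) for name, war in ranked}


# Toy pool of three players, so the "borderline" is rank 2 instead of 278.
# These WAR totals are hypothetical, chosen only to reproduce the scores
# mentioned in the thread.
toy_war = {"Gibson": 70.0, "Clarkson": 50.0, "Peavy": 41.0}
toy_scores = segregation_scores(toy_war, borderline_rank=2)
print(toy_scores)
```

With the real data the call would use the default `borderline_rank=278`, so the 278th-best eligible player lands exactly on 100.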

Into the weeds...

The schedule adjustment is simple: every player-season gets the average of his actual WAR and his WAR per 162 games. The only players who don't receive it are pre-1893 pitchers.
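
A rough sketch of that schedule adjustment follows. It assumes "WAR per 162 games" means prorating by the length of that season's schedule (not by the player's own games played), which is an interpretation rather than something stated above. The function and parameter names are hypothetical.

```python
FULL_SCHEDULE = 162


def schedule_adjusted_war(war, schedule_games, exempt=False):
    """Average of actual WAR and WAR prorated to a 162-game schedule.

    war:            the player's actual WAR for the season
    schedule_games: length of that season's schedule (assumed proration basis)
    exempt:         True for pre-1893 pitchers, who get no adjustment
    """
    if exempt:
        return war
    war_per_162 = war * FULL_SCHEDULE / schedule_games
    return (war + war_per_162) / 2


# Example: a 6.0-WAR season on a 154-game schedule.
adjusted = schedule_adjusted_war(6.0, 154)
print(round(adjusted, 2))  # 6.16
```

Averaging the two figures only gives a short-schedule season half credit for the games that were never played, which keeps the adjustment modest.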

The era adjustment scheme is based on slicing baseball history into 6 eras:

1 1871-1892 Early
2 1893-1919 Deadball
3 1920-1946 Liveball
4 1947-1968 Integration
5 1969-1992 Expansion
6 1993-2022 Modern

The overall goal is to give each post-deadball era equal representation within the T278 by adjusting its replacement level. Here are the figures for each era, expressed as the share of total HOF-eligible PA taken by T278 players.
   era     PA%
1993-2022 0.136
1969-1992 0.138
1947-1968 0.141
1920-1946 0.140
   whites 0.077
1893-1919 0.089
   whites 0.073
1871-1892 0.079
   whites 0.079

As you can see, the liveball era is the hinge that I use to determine what % to target in earlier eras. When adding in Negro League MLEs to the liveball era and targeting the same overall PA% as later eras, white T278 hitters end up accounting for 7.7% of total PA. Therefore in earlier eras I attempt to limit white T278 hitters to ~7.7% of PA as a way to adjust for the unearned advantage they received from playing against only other white players. If anything, this adjustment is generous to white players, judging by the dominance of dark-skinned players in the integration era.
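
The PA% figures in the table above are shares of each era's HOF-eligible PA. A minimal sketch of how such a share could be measured is below, using the era boundaries listed earlier. The input layout (one tuple per player-season) and the toy numbers are assumptions for illustration only.

```python
from collections import defaultdict

# Era boundaries as listed in the post: (first year, last year, label).
ERAS = [
    (1871, 1892, "Early"),
    (1893, 1919, "Deadball"),
    (1920, 1946, "Liveball"),
    (1947, 1968, "Integration"),
    (1969, 1992, "Expansion"),
    (1993, 2022, "Modern"),
]


def era_of(year):
    """Return the era label for a season year."""
    for first, last, label in ERAS:
        if first <= year <= last:
            return label
    raise ValueError(f"year {year} is outside 1871-2022")


def t278_pa_share(seasons):
    """seasons: iterable of (year, plate_appearances, in_t278) tuples.

    Returns {era: share of that era's PA taken by T278 players}.
    """
    total = defaultdict(int)
    t278 = defaultdict(int)
    for year, pa, in_t278 in seasons:
        era = era_of(year)
        total[era] += pa
        if in_t278:
            t278[era] += pa
    return {era: t278[era] / total[era] for era in total}


# Hypothetical player-seasons, not real data.
toy_seasons = [
    (1925, 600, True),
    (1925, 500, False),
    (1930, 400, False),
    (1950, 650, True),
    (1950, 350, False),
]
shares = t278_pa_share(toy_seasons)
print(shares)
```

Raising or lowering an era's replacement level changes which of its players make the T278, and this share is the number being steered toward the target.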

Pitching is trickier because you want to reward the heavier individual workloads of earlier eras, so equalizing T278 IP% across eras doesn't work the way equalizing PA% does. Here are the results of the method I landed on:
   era      A    B   C     D
1993-2022 0.076 175 0.42 0.181
1969-1992 0.094 205 0.50 0.189
1947-1968 0.095 211 0.51 0.186
1920-1946 0.099 210 0.51 0.196
   whites 0.057          0.113
1893-1919 0.083 296 0.72 0.116
   whites 0.071          0.099
1871-1892 0.099 413 1.00 0.099
   whites 0.099          0.099
   
A = % of HOF-eligible innings thrown by T278 pitchers
B = average IP/season thrown by T278 pitchers
C = B/413 (average T278 yearly workload as a fraction of the pre-1893 T278 pitchers' workload)
D = A/C
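
Columns C and D follow directly from A and B, as in this small sketch. The function name is hypothetical. Because the published A and C values are rounded, recomputing D from them can differ from the table in the third decimal place.

```python
BASELINE_IP = 413  # column B for 1871-1892: average IP/season of pre-1893 T278 pitchers


def workload_adjusted_share(ip_share, avg_ip_per_season, baseline_ip=BASELINE_IP):
    """Return (C, D) for one era.

    ip_share:          column A, share of HOF-eligible innings thrown by T278 pitchers
    avg_ip_per_season: column B, average IP/season thrown by T278 pitchers
    """
    c = avg_ip_per_season / baseline_ip  # workload relative to the pre-1893 baseline
    d = ip_share / c                     # workload-adjusted IP share
    return c, d


# Deadball-era row from the table: A = 0.083, B = 296.
c_deadball, d_deadball = workload_adjusted_share(0.083, 296)
print(round(c_deadball, 2), round(d_deadball, 3))  # 0.72 0.116
```

Dividing by C inflates the share for eras where pitchers threw fewer innings per season, so that lighter modern workloads are not counted against them.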

So D is the figure I attempt to equalize, essentially a workload-adjusted IP%. And once again the liveball era is the hinge: I find that after adding in Negro League MLEs, white T278 pitchers get a figure of 11.3% in column D, so that is the figure I target for whites in earlier eras.

Overall, T278 hitters account for 12.7% of total HOF-eligible PA in organized baseball, while T278 pitchers account for 9.0% of total HOF-eligible IP.

If this all sounds complicated, it's actually dead simple. Apart from the schedule adjustment, literally the only adjustments being made are fiddling with replacement levels until the eras (and catchers) are equally represented.

Obviously none of this is meant to be extraordinarily precise. In particular, I think the MLEs for Negro Leaguers are very likely conservative. Rather, the purpose is to give an idea of the shape of what the history might have looked like, especially with regard to the effects of a higher replacement level on pre-integration players. The next step will be to add some kind of bonus for peak performance to get a better idea of HOF-worthiness. But as a simple baseline, I still find the current results highly interesting, and I hope you do as well. Here again is the PDF of the Top 500.

*edit 6/10/22: I revised my catcher adjustment after finding an error in my original calculations. The new method is simply to increase catchers' positional adjustment until T278 catchers have more appearances than one of the infield or outfield positions. Second base is the next-scarcest position in the T278 in terms of appearances, so catchers get adjusted up until games caught exceed 2B appearances.
Last edited by MaxPower on Wed Apr 12, 2023 4:40 am, edited 12 times in total.

MaxPower

  • Posts: 734
  • Joined: Sat Sep 24, 2016 2:12 am

Re: Segregation-adjusted WAR leaderboard

Posted: Tue May 10, 2022 1:24 pm

Post your favorite MLE-derived discoveries. For me it's Luke Easter and Sam Jethroe; I had never heard of either before.

FrankieT

  • Posts: 1312
  • Joined: Sat Mar 03, 2018 12:07 am
  • Location: Usually Somewhere Else

Re: Segregation-adjusted WAR leaderboard

Posted: Tue May 10, 2022 5:15 pm

Interesting. Sorry I don't have a relevant post. This kind of analysis reminds me of the great stuff MARCPELLETIER would put together every now and then.

MaxPower

  • Posts: 734
  • Joined: Sat Sep 24, 2016 2:12 am

Re: Segregation-adjusted WAR leaderboard

Posted: Tue May 10, 2022 6:19 pm

I take that as high praise indeed!

Hack Wilson

  • Posts: 1079
  • Joined: Thu Aug 23, 2012 6:16 pm

Re: Segregation-adjusted WAR leaderboard

Posted: Tue May 10, 2022 7:33 pm

Fantastic contribution, analytics, but still trying to wrap my brain around it. Either way, you're ripe for a front office job in the MLB, seriously should try.

I couldn't find Josh Gibson in this:
https://onedrive.live.com/?cid=bbaea5cef483dab4&id=BBAEA5CEF483DAB4%2170673&ithint=file%2Cpdf&authkey=%21AEhKCbp07Ex9pQk

But I might be blind as a bat.

MaxPower

  • Posts: 734
  • Joined: Sat Sep 24, 2016 2:12 am

Re: Segregation-adjusted WAR leaderboard

Posted: Tue May 10, 2022 7:59 pm

Hack Wilson wrote:Fantastic contribution, analytics, but still trying to wrap my brain around it. Either way, you're ripe for a front office job in the MLB, seriously should try.

I couldn't find Josh Gibson in this:
https://onedrive.live.com/?cid=bbaea5cef483dab4&id=BBAEA5CEF483DAB4%2170673&ithint=file%2Cpdf&authkey=%21AEhKCbp07Ex9pQk

But I might be blind as a bat.

Thanks! I promise this was all pretty simple, though. The really complex stuff is the actual MLEs; I wouldn't even know where to start in terms of creating them.

Gibson is ranked #108 with a score of 140, which to me is disappointingly low. He died young though, and more controversially, the guy who does the MLEs moves him to first base early in his career. The idea is there is no precedent for a hitter that good staying at catcher, so had he been allowed in organized ball, he likely would've mirrored Jimmie Foxx's path.

labratory

  • Posts: 423
  • Joined: Sat Sep 29, 2012 11:33 am

Re: Segregation-adjusted WAR leaderboard

Posted: Tue May 10, 2022 8:06 pm

I started reading this while watching the basketball game. After a few minutes I noticed that I'd forgotten all about the game.
Great stuff!

Hack Wilson

  • Posts: 1079
  • Joined: Thu Aug 23, 2012 6:16 pm

Re: Segregation-adjusted WAR leaderboard

Posted: Tue May 10, 2022 8:23 pm

My gut feeling is Josh Gibson should be in the top 20 all time, probably top 10. Analytics aside, many historical accounts and contemporary witnesses say he was as great in his prime as the Babe. Yes, a shorter career. There's no way to quantify what people say about a player in those periods, but that was the feeling, the understanding about Josh Gibson. #108 is very odd by the guy doing the MLEs. So, I understand he might not have played catcher his whole career, but moving him to first base is a hypothetical, did not happen. He might have moved to left field. Who knows. We just have to take what happened as reality. This is the problem with making projections.

Anyhow, absolutely great work by you! Truly interesting.

Hack Wilson

  • Posts: 1079
  • Joined: Thu Aug 23, 2012 6:16 pm

Re: Segregation-adjusted WAR leaderboard

Posted: Tue May 10, 2022 8:36 pm

So Baseball-Reference has collected data on the Negro Leagues, as much as they can reasonably verify:

Josh Gibson career OPS+ -- 214
Babe Ruth career OPS+ -- 206

Think this would put Gibson much, much higher than #108. True, shorter career.

It's really hard to compare across eras and different leagues, some things work out, other things like Gibson get odd.

MaxPower

  • Posts: 734
  • Joined: Sat Sep 24, 2016 2:12 am

Re: Segregation-adjusted WAR leaderboard

Posted: Tue May 10, 2022 9:14 pm

Hack Wilson wrote:My gut feeling is Josh Gibson should be in the top 20 all time, probably top 10. Analytics aside, many historical accounts and contemporary witnesses say he was as great in his prime as the Babe. Yes, a shorter career. There's no way to quantify what people say about a player in those periods, but that was the feeling, the understanding about Josh Gibson. #108 is very odd by the guy doing the MLEs. So, I understand he might not have played catcher his whole career, but moving him to first base is a hypothetical, did not happen. He might have moved to left field. Who knows. We just have to take what happened as reality. This is the problem with making projections.

Anyhow, absolutely great work by you! Truly interesting.

Yeah I can see both sides. I suppose the idea is the only reason he stuck at catcher in the career he had is that the schedule was not as grueling compared to organized ball, so had he been subject to the 154-game schedule his team would've moved him off catcher to save his body. The MLE author, Eric Chalek, has forgotten more about the Negro Leagues than I'll probably ever know, so ultimately I'm not really in a position to object to his methodology. Definitely a bummer though to not see Gibson with a much fatter score, and like I said earlier, I do think ultimately the MLEs are conservative.

Digging into the Gibson MLE, he has 586 batting runs over 9060 PA, or 39 per 600 PA. Basically Miguel Cabrera without the decline.
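
The per-600-PA rate is just a proration, sketched here with a hypothetical helper name:

```python
def runs_per_600_pa(batting_runs, plate_appearances):
    """Prorate career batting runs to a 600-PA season."""
    return batting_runs / plate_appearances * 600


# Gibson's MLE line from above: 586 batting runs over 9060 PA.
gibson_rate = runs_per_600_pa(586, 9060)
print(round(gibson_rate))  # 39
```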