Alright, so I'm finally ready to talk about my new key feature: Adjusted Performance. This has been one of those "why didn't I think of this sooner" moments, but also "holy crap this is a nightmare to implement." Let me give you a quick taste of what it looks like mathematically:
fighter1_stat_adjperf = (fighter1_stat − fighter2_stat_opp_avg) / fighter2_stat_opp_sdev
We'll call this f1_stat_adjperf for short. The big idea is to measure how much a fighter's performance in any given fight exceeded or fell short of what we'd expect their opponent to allow. If you go in there against some unstoppable jab machine who normally forces everyone to eat 50 jabs per round, but you manage to hold them to only 20, well, that's huge. But if you never do the math to figure out what your opponent's "baseline" is, you'll just record "Fighter1 absorbed 20 jabs" and maybe think it's not that great. Meanwhile, that's incredibly good compared to the 50 jabs everyone else took. Hence, adjusted performance.
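In code, the formula is just a z-score against the opponent's historical baseline. Here's a minimal sketch — the function name and the zero-sdev fallback are my own choices for illustration, not part of the actual pipeline:

```python
def adj_perf(f1_stat, f2_stat_opp_avg, f2_stat_opp_sdev):
    """Adjusted performance: how many standard deviations fighter 1's
    output sits above (or below) what fighter 2 historically allows."""
    if f2_stat_opp_sdev == 0:
        return 0.0  # assumption: no historical variability -> score as exactly average
    return (f1_stat - f2_stat_opp_avg) / f2_stat_opp_sdev

# You land 60 strikes on someone who usually absorbs 30 (sdev 30):
print(adj_perf(60, 30, 30))   # -> 1.0
# You absorb only 20 jabs from a jab machine who usually lands 50 (sdev 15):
print(adj_perf(20, 50, 15))   # -> -2.0
```

A positive score means you beat the opponent's baseline; a negative score means you fell short of it, even if the raw number looked respectable.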
Understanding the _opp Suffix
Before going any further, let's talk about the _opp suffix. In short, _opp refers to the post-fight stats an opponent recorded against a fighter. So if you see something like f2_stat_opp_avg, the stat is referencing "what your opponent's opponents did against them." For example:
- f1_stat_opp_avg: The average of your opponents' performances against you in all their fights.
- f2_stat_opp_avg: The average of your opponent's opponents' performances against your opponent in all their previous fights.
- f2_stat_opp_sdev: The standard deviation of your opponent's opponents' performance in those same fights, giving us a measure of the volatility or variability in their performance.
So, if your opponent's previous opponents have landed an average of 40 strikes against them, then f2_strikes_opp_avg is 40.
This means that to measure your adjusted performance, you compare your post-fight stat from the fight to their pre-fight _opp_avg, and then scale the difference by their pre-fight _opp_sdev. If you land 60 strikes against someone who averages absorbing 30, with a standard deviation of 30, your adjusted performance is +1: you're one standard deviation above what that opponent typically allows.
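To make the suffix concrete, here's a tiny sketch with invented numbers that reproduces the 40-strike example above. Note that statistics.stdev computes the sample standard deviation; everything else here is hypothetical data, not real fight stats:

```python
import statistics

# Hypothetical history: strikes your opponent's previous opponents landed against them.
strikes_allowed = [28, 35, 42, 51, 44]

f2_strikes_opp_avg = statistics.mean(strikes_allowed)    # 40.0
f2_strikes_opp_sdev = statistics.stdev(strikes_allowed)  # sample standard deviation

# Your adjusted performance after landing 60 strikes on this opponent:
f1_strikes_adjperf = (60 - f2_strikes_opp_avg) / f2_strikes_opp_sdev
```

The raw "60 strikes" only becomes meaningful once it's divided through by how much this opponent's allowed output normally swings.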
Why It's So Valuable
Let's be honest: raw stats can lie, or at least mislead. If a fighter's output is just "I landed 30 strikes," that doesn't tell me how many strikes they should have been able to land. If they were fighting an iron-clad defensive wizard who typically only allows 10 strikes, then landing 30 is insane. On the flip side, if your opponent is basically an open punching bag who gives up 60 strikes on average, then landing 30 is actually pretty weak.
Adjusted performance changes the game: it says, "30 strikes might be good or bad, but let's see how it compares to what your opponent usually allows." Then, for even more nuance, it's scaled by the opponent's historical standard deviation—so you don't artificially inflate your stats just because you faced someone with wide variability on a certain stat.
The Complexity: Where Do You Even Get These Numbers?
The problem with pulling off something like f1_stat_adjperf is that you actually have to calculate your opponent's _opp_avg and _opp_sdev. In other words:
- Grab all your opponent's previous fights.
- For each of those fights, figure out the stats they allowed to their opponents.
- From that, compute the average allowed stats and the standard deviation of those stats.
- Then bring that back into your current fight to see how well or poorly you did.
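The steps above can be sketched with pandas, assuming a long-format table with one row per fighter per fight (the schema and numbers here are hypothetical):

```python
import pandas as pd

# Hypothetical long-format table: one row per fighter per fight.
fights = pd.DataFrame({
    "fighter":        ["A", "B", "A", "C", "B", "C"],
    "opponent":       ["B", "A", "C", "A", "C", "B"],
    "strikes_landed": [30, 25, 45, 20, 33, 40],
})

# Strikes a fighter *allowed* are the strikes their opponents landed,
# so group by the opponent column rather than the fighter column.
allowed = (
    fights.groupby("opponent")["strikes_landed"]
          .agg(strikes_opp_avg="mean", strikes_opp_sdev="std")
)

# Bring each opponent's baseline back onto the fight rows.
enriched = fights.merge(allowed, left_on="opponent", right_index=True)
enriched["strikes_adjperf"] = (
    (enriched["strikes_landed"] - enriched["strikes_opp_avg"])
    / enriched["strikes_opp_sdev"]
)
```

One caveat baked into this toy version: it averages over all fights, including the current one. In a real pipeline each baseline must be computed only from fights that happened before the fight being scored, or you've leaked the answer into the feature.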
This is where the data pipeline can get insane, because you might have a fighter who has 15 or 20 fights, each with a different set of opponents. Some of those opponents have 30 fights apiece. Doing this for every fighter means you need to traverse a huge web of fight stats.
If you run a naive approach—like chaining pandas groupby and merge calls all day—it becomes unbelievably slow at scale. This is why I had to do some major refactoring, rewriting a chunk of my data pipeline to pull from a properly indexed database (Postgres, in my case). Once your data is properly structured, it's much faster to do these calculations in a single pass or via specialized queries, rather than stumbling through in-memory merges.
Time-Decayed Averages & Time-Decayed StdDev
But wait—there's more. I decided to do a time-decayed average (and corresponding standard deviation) with a 1.5-year half-life: a fight from exactly 1.5 years ago counts half as much as one from today. That means a fight 3 months ago is given far more weight than a fight 5 years ago, which is basically ancient history in fight years.
Now the complexity is multiplied by, like, a factor of 10. Because to get f2_stat_opp_avg, I can't just average everything your opponent has done; I have to:
- Grab each fight's stat.
- Weight it exponentially based on how recently it happened.
- Sum up those weighted stats.
- Divide by the sum of the weights.
- Then do it all over again for the standard deviation.
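Those five steps map directly onto a weighted mean and weighted variance. Here's a minimal sketch, assuming fight ages are measured in years and using a population-style weighted variance (the real pipeline may handle edge cases differently):

```python
import math

HALF_LIFE_YEARS = 1.5

def decayed_stats(values, ages_years):
    """Exponentially time-decayed mean and standard deviation.

    ages_years[i] is how long ago fight i happened; a fight exactly
    one half-life old gets weight 0.5, two half-lives old gets 0.25, etc.
    """
    weights = [0.5 ** (age / HALF_LIFE_YEARS) for age in ages_years]
    total_w = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total_w
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total_w
    return mean, math.sqrt(var)

# Two fights with identical stats but different ages: the recent one dominates.
mean, sdev = decayed_stats([10, 20], [0.25, 5.0])
```

When all the weights are equal this collapses back to the ordinary mean and standard deviation, which is a handy sanity check for the implementation.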
And let's not forget: we do this for every single fight, across thousands of fights, across hundreds of fighters. That's why I always say data engineering is half the battle.
Why Bother?
So why go through this code-wrangling fiasco when your standard raw stats might be "good enough?" Because fights are context-dependent. If you have a stand-up specialist with insane takedown defense, but no one has tested it in years, your raw stats might not reflect the real picture of how they handle a brand-new style. By blending in adjusted performance stats, you're no longer stuck just describing how many strikes or takedowns a fighter landed; you're describing how well they did compared to what their opponent typically experiences—and you're discounting or boosting older fights according to their recency.
This is how we start to see nuanced differences that no raw stat alone can show. It's the difference between "Fighter A landed 40 strikes" vs. "Fighter A forced a 1.5 standard-deviation drop in Fighter B's typical striking output." That second statement captures so much more power. If you can integrate these insights into your model, you get a far more realistic prediction of how a matchup might turn out.