MMA-AI

About

MMA-AI is a multi-model ensemble of XGBoost and CatBoost models that predicts upcoming UFC fights, with the goal of significant profitability over time. It is 64.2% accurate with a log loss of 0.65. See https://www.mma-ai.net/mma-ai-v3-shipped/ for more details. I am a professional hacker of 10+ years who started this project 3 years ago to learn machine learning engineering. This is version 3, a total rewrite based on 1k+ hours of study and practice. There is still lots to learn, so email me at danhmcinerney@gmail.com with any and all criticism, questions, and improvement suggestions. I currently work as the lead AI hacker at Protect AI. Check out my RootCon talk about AI security.

https://twitter.com/DanHMcInerney
https://www.linkedin.com/in/dan-mcinerney-b5162915a/

Profit

If you started with $1,000 on 2023-01-01 and bet 1% of your bankroll on every algorithm prediction, you’d have $1211.97 on 2024-01-01. The algorithm was not trained on any fights from 2023; they were reserved solely for testing the model against unseen data, which keeps this metric from being inflated by overfitting. The calculation uses the same bet amount for every fight in an event and only updates the bankroll after the event is over to find the next 1% bet unit.
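The bankroll rule described above can be sketched as follows. This is a minimal illustration, not the project's actual backtest: the function name, the use of decimal odds, and the sample bets are all assumptions made for the example.

```python
from typing import List, Tuple

# Each bet is (decimal_odds, won). The odds and outcomes used with this
# function are hypothetical; the real ones would come from the 2023 fights.
Bet = Tuple[float, bool]


def simulate_flat_percent(events: List[List[Bet]],
                          bankroll: float = 1000.0,
                          pct: float = 0.01) -> float:
    """Bet pct of the bankroll on every pick, resizing only between events."""
    for event in events:
        unit = bankroll * pct  # same stake for every fight on this card
        profit = 0.0
        for decimal_odds, won in event:
            # A win pays stake * (odds - 1); a loss costs the stake.
            profit += unit * (decimal_odds - 1.0) if won else -unit
        bankroll += profit  # update only once the whole event is over
    return bankroll


if __name__ == "__main__":
    # Two made-up events: two winning picks, then one losing pick.
    sample = [[(2.0, True), (1.5, True)], [(3.0, False)]]
    print(round(simulate_flat_percent(sample), 2))
```

Sizing the unit once per event mirrors the note that the bet amount stays fixed within an event and the bankroll is only updated afterward.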

Strategy comparison
1/10 Fractional Kelly Criterion bankroll: $1245.57
Straight 1% bets bankroll: $1211.97
Straight $10 bets bankroll: $1201.43
1% bets on only odds favorite bankroll: $1141.13
1% bets on only odds underdog bankroll: $1062.81

Honestly, these numbers seem suspiciously high. Do I guarantee I did all these calculations right? No, but I have unit tests in place and they’re all passing so pretty sure it’s right.
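For readers unfamiliar with the fractional Kelly strategy in the comparison above, here is a minimal sketch of the standard Kelly formula scaled down to 1/10. It assumes decimal odds and a model win probability as inputs; how MMA-AI actually feeds its predictions into the stake size is not stated, so the function name and parameters are hypothetical.

```python
def kelly_fraction(p_win: float, decimal_odds: float, scale: float = 0.1) -> float:
    """Fraction of bankroll to stake under (scaled) Kelly.

    p_win        -- estimated probability that the pick wins (assumed input)
    decimal_odds -- bookmaker decimal odds for the pick
    scale        -- 0.1 gives the 1/10 fractional Kelly variant
    """
    b = decimal_odds - 1.0  # net profit per unit staked on a win
    if b <= 0:
        return 0.0
    # Full Kelly: f* = (p * b - (1 - p)) / b
    full = (p_win * b - (1.0 - p_win)) / b
    # Never stake on a bet with a negative edge.
    return max(0.0, full * scale)


if __name__ == "__main__":
    # Hypothetical pick: 60% win probability at even money (decimal 2.0).
    print(kelly_fraction(0.6, 2.0))
```

Scaling the full Kelly stake down trades some expected growth for much lower variance, which matters when the win probabilities themselves are only estimates.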

Data

Data is scraped from ufcstats.com. An older dataset I used to use is here: https://www.kaggle.com/datasets/danmcinerney/mma-differentials-and-elo although it’s outdated and possibly buggy.

Features

Very extensive work went into feature engineering and selection. Thousands of features were created, then narrowed down to the 81 most important by training the models many times over with all features and selecting the features that most commonly had high SHAP values. Below are the SHAP values of the top 10 features in the final ensemble.

elo_difference: 0.275
Head_landed_avg_difference: 0.037
days_since_last_comp_avg_difference: 0.035
age_avg_difference: 0.034
rd1_Td_attempted_avg_difference: 0.031
age_difference: 0.031
rd1_Td_attempted_rec_avg: 0.028
reach_ratio_avg_difference: 0.028
KO_loss_avg_difference: 0.027
Ctrl_time_per_td_def_avg_difference: 0.026
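The selection step described above (train repeatedly, then keep the features that most often rank high by SHAP value) could look roughly like this. It is a sketch under assumptions: each run is represented as a plain mapping from feature name to mean absolute SHAP value, and the function name, `top_k`, and `keep` parameters are hypothetical. Producing the SHAP values themselves is left out.

```python
from collections import Counter
from typing import Dict, List


def select_features(runs: List[Dict[str, float]], top_k: int, keep: int) -> List[str]:
    """Keep the features that most often land in the top_k of a training run.

    runs  -- one dict per training run: feature name -> mean |SHAP| value
    top_k -- how many top-ranked features count as "high value" in each run
    keep  -- how many features to retain overall
    """
    counts: Counter = Counter()
    for importances in runs:
        # Rank this run's features from most to least important.
        ranked = sorted(importances, key=importances.get, reverse=True)
        counts.update(ranked[:top_k])
    return [name for name, _ in counts.most_common(keep)]


if __name__ == "__main__":
    # Made-up importances for three runs; only the feature names are real.
    runs = [
        {"elo_difference": 0.27, "age_difference": 0.03, "noise_a": 0.001},
        {"elo_difference": 0.25, "age_difference": 0.001, "noise_a": 0.002},
        {"elo_difference": 0.26, "age_difference": 0.04, "noise_a": 0.01},
    ]
    print(select_features(runs, top_k=2, keep=2))
```

Counting how often a feature ranks high, rather than averaging its importance, makes the selection less sensitive to a single run where a noisy feature happened to score well.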

  • Elo score is the most predictive which makes sense because my highly customized Elo score alone is about 61% accurate in prediction.
  • Number of headshots landed is high up, a proxy for damage done in previous fights. I am somewhat surprised this is so high.
  • Days since last competition has consistently been an important factor throughout my years on this project. The more a fighter fights per year, the more likely they are to win. Hungry fighters are winning fighters.
  • Age is super important. Peak winning ages (not necessarily peak skill ages) tend to be around 27-30. After 33 to 34, depending on fighter mileage, there tends to be a serious decline in abilities. age_avg might seem weird at first, but note that it measures when the fighter started in the UFC. The mileage on a 30 year old fighter will be very different if they entered the UFC at 18 versus 27.
  • rd1_takedowns_attempted is interesting in that it ranks so high, but it has consistently been a high-performing metric, even more so than takedowns landed, and round 1 stats consistently mattered more than total fight stats. This seems to imply that wrestlers who control the early part of the fight generally win.
  • Reach is important, surprising nobody.
  • The number of times a fighter has been knocked out really affects their future ability. Really bad to be knocked out multiple times, probably affects fighter confidence and speeds up their age decline.
  • Control time per takedown defense: this is a measure of how fast the fighter stands up after getting taken down. This is different than takedown defense which is % of takedowns stuffed. Interesting that a fighter’s ability to stand up is actually more important than their ability to not get taken down at all.
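Since elo_difference dominates the list, a short sketch of how an Elo-style rating produces such a feature may help. The customized Elo used by MMA-AI is not described here, so this is only the textbook Elo update; the 400-point scale, the K-factor of 32, and the function names are conventional assumptions, not the project's values.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo win expectancy for fighter A against fighter B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner: float, loser: float, k: float = 32.0) -> Tuple[float, float]:
    """Return the new (winner, loser) ratings after one fight."""
    # The winner gains more when the win was less expected.
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


def elo_difference(rating_a: float, rating_b: float) -> float:
    """Difference feature in the spirit of elo_difference (assumed definition)."""
    return rating_a - rating_b


if __name__ == "__main__":
    a, b = update_elo(1500.0, 1500.0)
    print(a, b, elo_difference(a, b))
```

A weighted variant would typically adjust `k` per fight, for example by method of victory, but which weights the project uses is not specified.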

Patterns in the features

First, based on the rest of the features, defense was almost always a more important stat than offense. Ability to not get hit, to not get taken down, to stand up quickly, all of it was better than stats like strikes per sec or accuracy. Second, wrestling is basically the most important skill. Control time stats, takedown stats, ground and pound stats, they all usually eked out a higher importance score than their twin kickboxing stats.

Exceptions

I do not include the ability to predict fights involving first-time UFC fighters or women’s fights. Women fight in a very different pattern than men, and I feel it poisons the training of the model. A model specific to women’s fights would likely perform much better, but I only have so much time. Additionally, the algorithm has no way of understanding when a fighter is moving up or down in weight class, so it may be less accurate for fights where one fighter just changed weight classes.

Model Ensemble Performance Metrics

Accuracy: how often is the model correct
Precision: how often the fighter actually wins when the model predicts a win
Recall: how good is the model at identifying most of the actual wins
F1 Score: harmonic mean of precision and recall
Confusion Matrix: display of true negatives (predicted loser lost), false positives (predicted winner lost), false negatives (predicted loser won), and true positives (predicted winner won)
[[True Negative, False Positive]
[False Negative, True Positive]]

Accuracy: 0.659
Precision: 0.693
Recall: 0.595
F1 Score: 0.64
Confusion Matrix:
[[103 39]
[ 60 88]]
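As a sanity check, the reported metrics can be recomputed from the four counts in the matrix. This sketch assumes the matrix is laid out as [[TN, FP], [FN, TP]] (rows are actual outcomes, columns are predictions), which is the reading that reproduces the precision and recall listed above; the function name is hypothetical.

```python
from typing import Dict


def metrics_from_counts(tn: int, fp: int, fn: int, tp: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)  # predicted wins that were real wins
    recall = tp / (tp + fn)     # real wins that the model predicted
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


if __name__ == "__main__":
    # Counts taken from the matrix above, read as [[TN, FP], [FN, TP]].
    for name, value in metrics_from_counts(tn=103, fp=39, fn=60, tp=88).items():
        print(f"{name}: {value:.3f}")
```

With these counts the four values round to 0.659, 0.693, 0.595, and 0.640, matching the figures above.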

Example of one of the earlier ensemble model learning curves:


Things I’ve learned about betting on UFC

  • The most predictive category of stats is the fighter’s defense
    • Every kind of measurement of how little a fighter is hit is fantastic for prediction: % of strikes avoided, total strikes taken, average significant strikes absorbed, it’s all golden. Defense in professional MMA is a far better predictor than striking output, striking accuracy, etc.
  • Age differential is one of the most important stats available
    • Fighters begin to have noticeable decline at around 32. By 34, the decline is precipitous. Peak age is 27 to 30.
  • A custom weighted Elo is surprisingly effective at prediction
  • Ground skills appear to be slightly more important to winning than striking skills
  • Striking output, both standing and on the ground, is a great measure of a fighter’s skill, more so than striking accuracy

Why This is Free

The reason I choose to release this to the public rather than become a professional gambler is that an investment in AI skills and contacts (email me at danhmcinerney@gmail.com) will have an exponentially increasing value compared to short-term AI gambling returns. Bookies will just limit your account if you win too much anyway. Can’t lie though, telling users who complain it’s not working to go fuck themselves and do it themselves is no small perk.