Justifying a teardown

Version 1.0.0_model.3 is now available here.

Now that we have gotten through the initial proof of concept, it is time to work out a roadmap. Some of the key bits of feedback boil down to:

  • poor user experience 
  • poor explanation of where the Elo comes from
  • regional competitions being treated with equal weight to world cups
  • regional competitions missing
  • prediction accuracy 

I had always planned to redo the UI, away from the generic Tailwind components, so designing around an improved user experience is a good justification. Although, initially, I was hoping to keep the backing data the same, after thinking about the user journey it became clear I was not capturing enough data during the Elo calculation process.

For version 0, I would capture the Elo before and after each competition. This created a large black box: athlete enters competition, athlete exits with a new Elo, and here is how difficult the climbs were...

In the event results you could see scores for every problem of every round, but not the Elo, the core part of the website...

If a user wants to see results, there are plenty of other sites and the official sites update their results live. There is no point trying to compete there. The event results should instead be focused on Elo changes.

Alas, the current calculation process does not capture this information. So, while I'm already ripping it apart to fix that, I may as well fix some other gripes I have with it:

  • Does not correctly calculate world rankings score (currently ignores continental strength)
  • Does not account for youth and regional Elo
  • Slow and cumbersome to run
  • Hard to generate the predictions

In the end it was easier to rebuild the backend from scratch. In the short term this slows down shipping new features, trading that for a strong foundation to quickly add more fun stuff in the future.

The clean slate also allowed me to learn from all the mistakes I made hacking together the initial data model. I have cut my tables in half and refined my importing schema. My hope is this speeds up importing more regional competitions.

Leading to the final job. The Elo. Having looked through all my current Elo ratings, I am unhappy with the distribution and the outcome for problem difficulties. After playing around for a while, I decided to go back to the fundamentals of Elo and define my Elo to be scoped to individual attempts at a climb.

The Elo probability is the probability that the athlete will climb the problem on this attempt.

Therefore, if the Elo of the athlete equals the Elo of the problem then the probability of them flashing the problem is 50% and we would expect on average the athlete to climb the problem in 2 attempts.

This is the rule I strive for Elo Model 3 to represent. It will need fine tuning over the coming season, but this is my current starting point.
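As a rough illustration of that rule, here is a minimal sketch using the standard logistic Elo curve. The 400-point scale factor and the function names are my assumptions for the example, not the model's actual constants; since each attempt is an independent trial, attempts-to-send follow a geometric distribution with mean 1/p.

```python
def p_send(athlete_elo: float, problem_elo: float) -> float:
    """Probability the athlete climbs the problem on a single attempt.

    Standard logistic Elo curve; the 400-point scale is an assumption
    borrowed from chess-style Elo, not necessarily what Model 3 uses.
    """
    return 1.0 / (1.0 + 10.0 ** ((problem_elo - athlete_elo) / 400.0))


def expected_attempts(athlete_elo: float, problem_elo: float) -> float:
    """Expected attempts to send: geometric distribution, mean 1/p."""
    return 1.0 / p_send(athlete_elo, problem_elo)


# Equal ratings: 50% flash chance, 2 attempts on average.
print(p_send(1500, 1500))             # 0.5
print(expected_attempts(1500, 1500))  # 2.0
```

With equal ratings the per-attempt probability is exactly 50% and the expected attempt count is 2, matching the rule above; a stronger athlete pushes the flash probability above 50% and the expected attempts below 2.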

Now that the rebuild is complete, I will first try to load as much data onto the website as possible, then I will publish my roadmap for upcoming features.

Thanks for being patient.