QELO Revisited by Dave McBryan
J Krishnamurthi (the creator of QELO) and I (Dave McBryan) have been discussing the QELO formula, and think we’ve come up with a simpler, improved version. We started out with the aim of tweaking the competitiveness factor, but then found some things cancel out neatly with the completeness factor, leaving us with a much simpler, yet hopefully more versatile formula. What we’re proposing is that for any game, a player’s points now be calculated as:
QELO = (Raw score * Total score) / Winner’s Score
(where Total score is the sum of all players’ scores in that game)
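For anyone who prefers to see it as code, here is a minimal sketch of the formula in Python. The function name `qelo` and the handling of a game where nobody scores are our own assumptions for illustration; the formula itself is the one given above.

```python
def qelo(raw_score, all_scores):
    """New QELO: raw score * total of all scores / winner's score."""
    winner = max(all_scores)
    if winner == 0:
        # Assumption: if nobody scores, everyone gets 0.
        # The write-up does not cover this case.
        return 0.0
    return raw_score * sum(all_scores) / winner


# Example: a 22-20-6-4 game (total 52, winner's score 22).
scores = [22, 20, 6, 4]
for s in scores:
    print(s, round(qelo(s, scores), 1))
```

The winner's line should show a QELO equal to the game total, since the raw score and the winner's score cancel.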
What does this mean?
In short, it rewards both competitiveness and completeness, just as before (see the maths section at the end for an explanation of how those two factors were reduced to this), but it also now has several neat properties:
For the winner(s) of a game, QELO = total score of the game (in other words, 64 minus no. of Xs).
Higher standard games therefore give higher QELO, with 64 being the maximum possible.
The most competitive game possible is a 4-way tie. That would give everyone a QELO of 4 times their score (the highest possible QELO for that score).
The least competitive game possible is one where three players score 0. Your QELO can never be lower than your score, but in that extreme scenario, score and QELO would be the same.
Where on that range your QELO falls (from 1–4 times your score) is determined by how competitive the game was, with competitiveness meaning how close to the ideal 4-way tie it was. This should be a reasonable reflection of the availability (and easiness) of bonus opportunities, and therefore how your score was affected by opponents.
Ultimately, because QELO ranges from your score up to a possible 64, you might think of it as a kind of estimate of the notional* score you’d get over all 64 questions. Indeed, for a single-player game, it will be exactly that score. Speaking of short-handed games, another advantage is that there is no need to adjust the formula for 3, 2 or even 1-player games: that’s automatically taken care of by the winning score being a higher proportion of the total score.
______________________________________________________
*Emphasis on the ‘kind of’, ‘notional’ and ‘estimate’ here. It could never hope to be completely accurate: because we’ve no idea to what extent answered questions were also known by people further back in the bonus queue, we can never have a full measure of how much a foursome’s knowledge sets coincide or complement each other.
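The range described above (QELO between 1 and 4 times your score, with short-handed games needing no adjustment) can be checked with a quick sketch. The sample games and the name `qelo` are made up for illustration, and the zero-score guard is our own assumption.

```python
def qelo(raw_score, all_scores):
    """New QELO: raw score * total of all scores / winner's score."""
    winner = max(all_scores)
    if winner == 0:
        return 0.0  # assumption: nobody scored, so everyone gets 0
    return raw_score * sum(all_scores) / winner


tie = [12, 12, 12, 12]       # most competitive: 4-way tie
lopsided = [12, 0, 0, 0]     # least competitive: three players on 0
solo = [40]                  # single-player game
two_player = [30, 20]        # short-handed game, same formula

print(qelo(12, tie))         # four times the score
print(qelo(12, lopsided))    # equal to the score
print(qelo(40, solo))        # equal to the score
print([round(qelo(s, two_player), 1) for s in two_player])

# Every player's QELO lies between 1x and 4x their own score.
for game in (tie, lopsided, solo, two_player):
    for s in game:
        assert s <= qelo(s, game) <= 4 * s
```

Note that nothing in the function refers to the number of players: the list can have 1, 2, 3 or 4 entries.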
Some examples to show implications
For zero-X games, winning 16–16–16–16, 20–15–15–14, 31–30–2–1 or 64–0–0–0 all give the same, maximum QELO of 64. This makes sense because regardless of strength of opponents, you’ve played an effectively perfect game by winning and answering everything they didn’t.
If totals remain the same, variations of point distribution among lower-scoring opponents no longer make any difference. So 22–20–6–4, and 22–20–7–3 would both give the winner the same QELO (52), where before there was a substantial difference. Perhaps not quite so intuitively, 22–10–10–10 would also be the same for the winner. (QELO obviously WILL be affected for those whose scores change).
A score of 10 would give 40 QELO in a 10–10–10–10 game, 39.1 if losing 11–11–11–10, and 37 if winning 10–9–9–9. Some may disagree with giving a 4th place more points than a win on the same score, but do note that those 6 extra Xs in the win all represent missed bonus opportunities that were unavailable in the losing game, so there is some justification for seeing the same score as a less impressive result. (Also note that the 9s in the 10–9–9–9 game would only get 33.3, while the 11s in the 11–11–11–10 game get 43.)
For less finely balanced games, here’s an example of equivalent finishes in different ranks: with a score of 10, winning 10–8–7–5, coming second 12–10–8–6, coming third 15–12–10–8 or losing 18–14–12–10 would all give the same QELO (30).
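The examples in this section are easy to reproduce. The sketch below simply plugs the scorelines quoted above into the formula; the variable names are ours and the zero-score guard is an assumption.

```python
def qelo(raw_score, all_scores):
    """New QELO: raw score * total of all scores / winner's score."""
    winner = max(all_scores)
    if winner == 0:
        return 0.0  # assumption: nobody scored, so everyone gets 0
    return raw_score * sum(all_scores) / winner


# Zero-X games: the winner's QELO is the game total, 64 in each case.
zero_x_games = [[16, 16, 16, 16], [20, 15, 15, 14], [31, 30, 2, 1], [64, 0, 0, 0]]
for game in zero_x_games:
    print(game, qelo(game[0], game))

# A score of 10 in three different games.
for game in ([10, 10, 10, 10], [11, 11, 11, 10], [10, 9, 9, 9]):
    print(game, round(qelo(10, game), 1))

# A score of 10 finishing 1st, 2nd, 3rd and 4th.
equivalent_finishes = [[10, 8, 7, 5], [12, 10, 8, 6], [15, 12, 10, 8], [18, 14, 12, 10]]
for game in equivalent_finishes:
    print(game, qelo(10, game))
```

Each printed figure should match the one quoted in the text for that scoreline.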
For anyone interested, the maths of how we got here
The old formula was
QELO = Raw score * 100 * Competitiveness * Completeness
where Completeness = Total score/64
and Competitiveness = (1/3) * (2nd score/1st score + 3rd score/2nd score + 4th score/3rd score)
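As a rough sketch, the old formula could be written as below. The name `old_qelo` is ours, and we assume a four-player game with no zero among the top three scores, since the ratios are undefined otherwise (how such games were handled is not covered here).

```python
def old_qelo(raw_score, all_scores):
    """Old QELO: raw score * 100 * competitiveness * completeness."""
    # Assumption: exactly four players, top three scores all non-zero.
    first, second, third, fourth = sorted(all_scores, reverse=True)
    competitiveness = (second / first + third / second + fourth / third) / 3
    completeness = sum(all_scores) / 64
    return raw_score * 100 * competitiveness * completeness


# Same winner's score, same total, one point moved from 4th to 3rd place.
game_a = [22, 20, 6, 4]
game_b = [22, 20, 7, 3]
print(round(old_qelo(22, game_a)), round(old_qelo(22, game_b)))
```

The two printed values differ noticeably even though the winner's score and the game total are identical, because each ratio between neighbouring places feeds directly into the competitiveness factor.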
We were happy with the completeness factor, but the competitiveness factor gave rise to some anomalies (e.g. the aforementioned big difference for the winner between 22–20–6–4 and 22–20–7–3). We agreed that a 4-way tie was the most competitive game possible, so we wanted a measure of how far away from that any result was. For a given completeness of game, the ideal 4-way tie would have everybody scoring whatever the average score is. You could take the variance or standard deviation to measure how far scores are from that average, but that leads to the same issue: volatility in response to small changes in the distribution of points.
We realised the aggregate of the 2nd–4th place scores was more important than their exact distribution. On first impression, it might seem, from the winner’s point of view, that, say, 13–12–3–3 is a more competitive game than 13–6–6–6, but when it comes to bonus scoring opportunities, they’re actually pretty similar. Yes, in the first case the winner will have had to share the bonuses pretty equally with the runner-up, but that’s counterbalanced by the second case offering far fewer easy bonuses from 3rd and 4th in the first place.
Comparing the average score to the winner’s score gives a measure of that aggregate, and therefore how far away from the 4-way tie (where average and winner’s scores are the same) the game was, so we tried
Competitiveness = Average score / Winner’s Score
Of course, average = total score / 4, so that equates to Competitiveness = Total score / (4 * Winner’s Score).
It makes the full formula QELO = Raw Score * 100 * (Total score / (4 * Winner’s Score)) * (Total score / 64)
From that you can see that using the average in the competitiveness formula has ultimately led to multiplying by the total score twice (once in competitiveness and again in completeness). We realised that this was over-rewarding high-standard games (e.g. 16–16–16–16 would give the max QELO of 1600, but 8–8–8–8 would only get 400, where we would prefer a linearly equivalent 800). We can correct this by simply dropping the completeness factor: it’s no longer needed, since the total has already been accounted for in competitiveness.
That gives us:
QELO = raw score * 100 * (total score / (4 * winner’s score))
Simplifying, QELO = raw score * 25 * total score / winner’s score
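To see the double counting and its fix side by side, here is a small sketch comparing the intermediate formula (average-based competitiveness plus completeness) with the simplified one. Both function names are ours, and both assume at least one player scored.

```python
def intermediate_qelo(raw_score, all_scores):
    """Average-based competitiveness, still multiplied by completeness."""
    total, winner = sum(all_scores), max(all_scores)
    competitiveness = total / (4 * winner)
    completeness = total / 64
    return raw_score * 100 * competitiveness * completeness


def scaled_qelo(raw_score, all_scores):
    """Completeness dropped: raw score * 25 * total / winner's score."""
    total, winner = sum(all_scores), max(all_scores)
    return raw_score * 25 * total / winner


games = ([16, 16, 16, 16], [8, 8, 8, 8])
for game in games:
    print(game, intermediate_qelo(game[0], game), scaled_qelo(game[0], game))
```

With the total counted twice, halving every score quarters the result; with it counted once, halving every score halves the result.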
Finally, it became obvious that the 25 was also unnecessary, and that getting rid of it would leave us with more understandable QELOs of 0–64, with neat comparisons to game total and score in a single-player game, rather than the more arbitrary-seeming 0–1600. (For anyone who does want to compare this new QELO to old QELO numbers, all you need to do is multiply the new numbers by 25.)
Note: FLQL is continuing to use a scaling factor of 25 for Season 5 in order to keep the order of magnitude of the scores consistent with their published GW1 scores.