Please show rating changes in replay vault

Reminds me of YouTube getting rid of the dislike counter.

FAF board of directors and shareholders should focus on fixing connection issues and important things like that instead of changing things that don't need changing.

League system never proclaimed itself to be “more accurate” and there is zero point to a paper on said matter because it isn’t a tool for matchmaking. It’s a tool to enable more progression because the whole goal of trueskill is a 50% win rate, that’s it. Divisions can enable people having progression even at a 50% win rate.

There is also basically nothing in trueskill about it proclaiming to be a great leaderboard tool. That is a much wider term than a matchmaking algorithm and divisions enable that wider definition to be tinkered with without ruining the matchmaking experience.

@ftxcommando

  1. Do you think "a tool for matchmaking" and "a great leaderboard tool" are two different things? If so, why not show both of them for each replay? It's very useful, not redundant.
  2. Do you think "a great leaderboard tool" has lower requirements for accuracy? Is it expected to be fun instead of accurate? I don't mind that, in fact. Fun is great, but we need a value to evaluate and track our progress. I want to know how much progress I have made compared to myself seven days ago, and how much each game contributes to that result. Therefore, showing ratings and rating changes is really essential.
  3. Here is another example of why it is useful. When I review my replays, I usually pay more attention to the games that I lost to players rated much lower than me. However, if we only show divisions and hide ratings, I won't know which games those are. If a replay shows that I lost to a player in the same division, what does that tell me? Maybe I was at the top of the division and he was at the bottom, or it could be completely the opposite. In short, we can't know how large the gap between two players is if we are only told they are in the same division or in adjacent divisions.
  4. Therefore, showing rating changes is always beneficial because it gives us more information. The rating system has been used for many years, and it is far more than just a matchmaking system. If we hide its details and reduce it to a matchmaking system only, so that it can no longer be used to track our progress or to tell whether a loss was expected, that will be a completely wrong decision and a step backwards.

Thank you for giving some detailed reasons. I can't address all individually at this point in time, but I still want to give some clarifications and explanations.

There seem to be some misunderstandings about how the division system works. It is not used for matchmaking at all. Matchmaking happens based on rating.
The league system does take opponents' ratings into account, just indirectly. Every division has a rating range associated with it and people get placed accordingly. Now if it were to happen that your rating increased but not your winrate (for example because you got matched with higher rated players on average), then at some point your rating is higher than the range of the division you are in. You will then get extra points for each win until you are promoted to the division you are supposed to be in. The same applies in reverse if your rating drops. This way divisions always correlate with a certain rating range, so they can be used as a rough skill estimate.

the problem with trueskill rating

You gaining 12 points after a win doesn't mean that your skill improved by 12 points.
Your level of play varies from day to day. It depends on how well you slept, how exhausted you are from the day, what your current mood is, etc. When you play against someone with 1120 rating, that doesn't necessarily mean that person is a harder opponent for you than the 1070 rated guy in the next game, even though he has higher rating.
1120 is just on average better than 1070. For a single game they are too close to make a definite call. In lower ratings this is exacerbated by the fact that people might be good at some aspects of the game (air, small micro maps) and bad at others. So depending on the map they might vastly over- or underperform relative to their rating.
Trueskill is still accurate on average but frequently people read too much into it, because they don't understand the limitations of the system.

This is no surprise because we don't even show rating as what it is. In reality it is always a Gaussian distribution. Instead of displaying rating as a single number, it would be closer to reality to show it as a range like 1120-1610. And I am not exaggerating the range. This is literally the range for established players, where it is 99.7% certain (+- 3 sigma) that the real rating lies in this interval. The size of this interval should show you that thoughts like "but this player has 40 more rating" are pretty meaningless in reality.

Using the lower end of this range to establish a leaderboard is a completely arbitrary decision but necessary if you want to use rating as a leaderboard because you need comparable numbers. You can't as easily compare different gaussian distributions.
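As a rough illustration of the interval described above, here is a minimal sketch. The mean and sigma are back-computed from the 1120-1610 example and are not real player data. The function names are made up for this example, and the "probability of being truly better" assumes two independent Gaussian beliefs with the same sigma, which is a simplification.

```python
import math


def rating_interval(mean: float, sigma: float, k: float = 3.0) -> tuple[float, float]:
    """Return the (low, high) bounds of the mean +/- k*sigma interval."""
    return mean - k * sigma, mean + k * sigma


def prob_truly_better(mean_a: float, mean_b: float, sigma_a: float, sigma_b: float) -> float:
    """Probability that A's real rating exceeds B's, assuming independent Gaussians."""
    diff_sigma = math.sqrt(sigma_a ** 2 + sigma_b ** 2)
    z = (mean_a - mean_b) / diff_sigma
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Illustrative values back-computed from the 1120-1610 example (assumption).
mean = (1120 + 1610) / 2    # centre of the range
sigma = (1610 - 1120) / 6   # the range spans 6 sigma

low, high = rating_interval(mean, sigma)
print(f"rating range: {low:.0f}-{high:.0f}")

# Two established players whose means differ by 40 points.
p_better = prob_truly_better(mean + 40, mean, sigma, sigma)
print(f"chance the 'higher' player is really better: {p_better:.2f}")
```

Under these assumptions the 40-point gap gives the higher-rated player only a modestly better than even chance of actually having the higher real rating.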

So no, showing rating changes the way we did before is not as beneficial as you think because it carries less information than you think it does.

I hope my post shows that divisions give a good enough estimate of skill while getting rid of the noise that is prone to overinterpretation.

additional examples why we need to hide rating

With the league system we can make adjustments that we can't make with just the rating system. For example, we can place people lower than they should be on purpose, so they will be dragged up by the bonus points simply by playing. That is the enhanced progression FtX is talking about. The downside of this is that the rating spread in a single division increases. Now, it seems that people value having a gauge of players' skill more, so I made preparations to disable this soon. In the future you should only see people of similar rating in a particular division.

We also can adjust things like 1v1 rating being lower for the same skill. We can't easily foresee the consequences of changes to the rating system. On the other hand it is trivially easy and it won't have side-effects to change the rating range that divisions are associated with. This way we can bring different matchmaking queues more in line with each other.

Both of these adjustments make it necessary to hide the underlying rating to really make it work.

To add one point onto what BlackYps wrote (I wrote a big post that got eaten by a forum error and lost motivation): there isn’t really a problem with having trueskill as a variable on some player info tab that people can see go up and down for themselves. Likewise, it could potentially be used as a website leaderboard to assist with tournaments. But having it casually adjacent to leagues in the UX will make leagues pretty pointless because there is already a huge community culture around utilizing rating. People will keep using what they have used and new players will continue to be taught to care about their trueskill and not their league. Trueskill is a probability heuristic that we only really need to refer to in cases of seriously granular concerns. For example, tournament seeding. Whether a dude is 1230 or 1248 is irrelevant in some singular game.

Well guys,
I see that we are talking about completely different things.
We are telling you that we enjoyed seeing the green + numbers in the replay vault.
You are telling us about better estimation of a certain player's skill level. This thread is not about proper estimation of a player's skill.
It is about the replay vault display that we enjoyed seeing.

Also I do not mind if your better sleep, better rest and your good family relationship motivates you to get more global rating kappa.
Like improve your real life to be worth to get better in FAF man haha. 😉

Did you even understand what I wrote?
If you don't properly engage with what I write, I see no reason to further engage with what you write.

I did understand your points completely, thank you for your detailed answer above.

@blackyps

Let me reply to your ideas one by one.

First, +1 for victory, -1 for defeat and 0 for draw is inaccurate. For example, in 1v1 ladder, if I am always matched to opponents rated 200 higher than me, and I get +15 for victory and -5 for defeat, I only need a 25% win rate to hold my rating. This makes a lot of sense because the rating system thinks my probability of winning is only 25%. If I win 2 games and lose 6 games, my rating won't change because my performance meets the expectation. However, if I get +1 for victory and -1 for defeat, I will lose 4 points, which is inaccurate.
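The arithmetic in this example can be checked with a short sketch. The +15/-5 values are the illustrative numbers from the paragraph above, not actual rating updates, and the function names are made up for this example.

```python
def net_change(wins: int, losses: int, gain_per_win: float, loss_per_defeat: float) -> float:
    """Total change after a series of games with fixed per-game gains and losses."""
    return wins * gain_per_win - losses * loss_per_defeat


def break_even_win_rate(gain_per_win: float, loss_per_defeat: float) -> float:
    """Win rate at which the expected change per game is zero."""
    return loss_per_defeat / (gain_per_win + loss_per_defeat)


# Rating-style update from the example: +15 per win, -5 per loss.
print(net_change(2, 6, 15, 5))      # 2 wins, 6 losses -> 0
print(break_even_win_rate(15, 5))   # 0.25

# Flat +1/-1 score for the same 2 wins and 6 losses.
print(net_change(2, 6, 1, 1))       # -4
```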

You may argue that we won't be always matched to opponents with lower ratings or higher ratings, but in fact this usually happens. Sometimes I only meet lower rating players one day, and only meet higher rating players another day, because there are few people (sometimes only 1 or 2 around my rating) playing 1v1. So the +1/-1 score is inaccurate and unstable, but the True Skill rating doesn't have this kind of inaccuracy and instability, just like the example above. Therefore, the True Skill rating is much better for us to evaluate our performance and track our progress. Please don't hide it.

What's more, this problem won't be solved even if we consider longer periods of time. If you open https://faforever.com/leaderboards , you will see there are players that have played thousands of games but whose win rates are not very close to 50%. Let's take StormLantern as an example: He has played 5065 games but his win rate is only 42.8%, which means the average points he won from each victory are about 1.336 times the average points he lost from each defeat, so we can infer that he met opponents with higher ratings more often. If we use a +1/-1 score, his skill will be underestimated. This is not only inaccurate, but incorrect.
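The 1.336 figure follows from a simple balance, sketched below under the assumption that his rating stayed roughly stable over those games, so total points won equal total points lost. The flat +1/-1 total is a hypothetical score without any bonus mechanism, and the function names are made up for this example.

```python
def gain_to_loss_ratio(win_rate: float) -> float:
    """Average gain per win divided by average loss per defeat, assuming a stable rating.

    Stability means win_rate * gain == (1 - win_rate) * loss.
    """
    return (1 - win_rate) / win_rate


def flat_score_total(games: int, win_rate: float) -> float:
    """Net score under a plain +1/-1 scheme (hypothetical, no bonus points)."""
    wins = games * win_rate
    losses = games - wins
    return wins - losses


ratio = gain_to_loss_ratio(0.428)
print(f"gain/loss ratio: {ratio:.3f}")

total = flat_score_total(5065, 0.428)
print(f"net +1/-1 score over 5065 games: {total:.0f}")
```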

By the way, I don't understand your sentence "You will then get extra points for each win until you are promoted to the division you are supposed to be in". Do you mean I will get two or more points for each victory if my rating is higher than my division? I think this is only a rough compensation and doesn't fix the inaccuracy. If my guess about the meaning of your sentence is correct, I can provide more explanations.

This reply is long enough, so I'll post it now. I will reply to your other ideas later.

@xinren said in Please show rating changes in replay vault:

Do you mean I will get two or more points for each victory if my rating is higher than my division?

Yes. In your case, being always matched with higher rated opponents, you would slowly lose score until you drop down one division. You would then get 2+/-1 for each win/loss, stabilizing you. You would maybe float around the edge of these two divisions because you will lose this bonus once you are promoted again and the cycle begins from the top.

This might seem inelegant, but it's highly unusual that you get consistently matched with people of significantly higher rating than yourself.
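The bonus mechanism described in this post can be sketched as follows. This is only an illustration under stated assumptions: a win is worth +1 and a loss -1, and a win is worth +2 while the player's rating is above the top of their current division's range. The function name, the `division_max` parameter and the exact bonus size are hypothetical, and the real league service may work differently.

```python
def score_delta(won: bool, rating: float, division_max: float) -> int:
    """Hypothetical league score change for a single game.

    Assumption: +1 for a win, -1 for a loss, and +2 for a win while the
    player's rating is above the top of the current division's range.
    """
    if won:
        return 2 if rating > division_max else 1
    return -1


results = [True, False, False, True]  # win, loss, loss, win

# Rating inside the division's range: plain +1/-1.
inside = sum(score_delta(won, rating=1050, division_max=1100) for won in results)

# Rating above the division's range: wins count double.
above = sum(score_delta(won, rating=1150, division_max=1100) for won in results)

print(inside, above)  # 0 2
```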

I'm wary of the winrates in the leaderboard because we had database problems with reported wins in the past, so the reported winrates can be inaccurate.

@blackyps

You said "You gaining 12 points after a win doesn't mean that your skill improved by 12 points." I agree with this. But if we review more games in a longer period, it will be meaningful. For example, if I was 900 one month ago and 1100 now, I can know that I made progress, and I can also know the main reason from rating changes. Maybe last month I usually lost to lower rating players because of a certain mistake, which greatly lowered my rating, but now I don't make this mistake. Maybe I was not able to defeat higher rating players, but now I am able to do that, which has significant contribution to my rating. This example shows that rating changes are meaningful.

Let me talk about an example unrelated to this game. When I was in high school, after each math exam, the teacher would tell us the average score of the whole class for each question. In this way, each of us can know whether he is doing better or worse than the average for each question. Perhaps a certain question is very simple, and we should not make mistakes. Perhaps a certain question is difficult, so doing half of it correctly is good. Based on the comparison with the average score, I can know which questions I did well and which ones I didn't do well, so that I can pay more attention to those questions that I didn't do well. This example tells us that providing us with more information can help us make better progress. Playing games on FAF is also like this.

For example, if I was 900 one month ago and 1100 now, I can know that I made progress, and I can also know the main reason from rating changes. Maybe last month I usually lost to lower rating players because of a certain mistake, which greatly lowered my rating, but now I don't make this mistake. Maybe I was not able to defeat higher rating players, but now I am able to do that, which has significant contribution to my rating. This example shows that rating changes are meaningful.

Then you would also be two divisions higher, for example going from Gold III to Gold I, to reflect this indeed meaningful change in your rating.

@blackyps said in Please show rating changes in replay vault:

Yes. In your case, being always matched with higher rated opponents, you would slowly lose score until you drop down one division. You would then get 2+/-1 for each win/loss, stabilizing you. You would maybe float around the edge of these two divisions because you will lose this bonus once you are promoted again and the cycle begins from the top.

Thank you for your answer. I will talk about two problems here.

The first problem is that even with the mechanism of "get 2+/-1 for each win/loss", our league system can not become a good tool to track our progress. True Skill rating is much better. This mechanism only gives a rough compensation, but doesn't fix the inaccuracy. When I win a game, my score and rating both increase, but I can only see my score. The rating change is invisible, but it does have an effect. When my rating reaches the range of next division, I will get 2 points for one victory, which is its effect, but it's delayed. If I reach higher divisions after some games, it's difficult for me to know how much each game contributes to this result, because rating changes are hidden. Therefore, hiding rating changes does not make the problem simpler, on the contrary, it only makes the problem more complex.

The second problem is that, one reason for hiding rating changes is because some people have questions about them, so you decide to display a simple thing instead of ratings, right? However, the division system heavily relies on the rating system. Players may ask questions like "Why did I win 2 points instead of 1 point last game?". Therefore, we won't actually solve any problem, but introduce more problems because we introduce more concepts and mechanisms.

@blackyps said in Please show rating changes in replay vault:

Then you would also be two divisions higher, for example going from Gold III to Gold I, to reflect this indeed meaningful change in your rating.

Yes, but I don't know how much each game contributes to this result. I don't know whether it's mainly because "last month I usually lost to lower rating players because of a certain mistake, which greatly lowered my rating, but now I don't make this mistake.", or "I was not able to defeat higher rating players, but now I am able to do that, which has significant contribution to my rating", or I have made general progress so I have higher win rate against different players, or other reasons. The details are all hidden, so it's harder for me to analyze the reason.

All those reasons you quoted have basically nothing to do with rating. If you lost because you made a mistake, you catch it regardless of rating. Higher win rate against different players also doesn’t have anything to do with their rating. If you keep beating players, you will get a significant contribution to rating. Again, whole point of trueskill is a 50% win rate. And no, the win rate stats on the leaderboard are just terrible data and shouldn’t have been there in the first place because of problems in how game results were recorded on FAF.

I do not understand this mentality at all. Are you saying you just don’t analyze losses as much when a player way worse than you wins? It doesn’t matter that he played better? All that matters is whether he has a higher trueskill?

Whenever I’ve watched replays what rating gives me as a barometer is when to expect the worst mistakes combined with whether I can expect the player to bounce back from said mistake. Plenty of 1800+ players screw up starts, but it’s something like putting 1 or 2 too few engies on energy production. An 800 just forgot to scale power after 5 pgens. Sometimes a 2000 forgets too, but they recover from it using the tools available to them much faster which may or may not enable them to still keep pace in the game depending on the mistakes of the opponent. The fundamental error is still there to see, though.

@blackyps said in Please show rating changes in replay vault:

This might seem inelegant, but it's highly unusual that you get consistently matched with people of significantly higher rating than yourself.

I'm just giving an example. "consistently matched with people of significantly higher rating than yourself" is a bit exaggerated. But in my last 20 1v1 matches, there were only 4 games (20%) in which my opponents had a higher rating than me (the opponents were Aranei, Maximbes, Ternencebroad and Mielus), and 16 games (80%) in which my opponents had a lower rating than me. So my opponents were clearly lower rated on average.

In 1400+ ladder it is quite common to match with a player 200 or even 300 rating above or below yourself. Then, because no one else joins the queue, y'all proceed to play a 10 game series

One issue is that the number of points required to change division is so high, and FAF games take so long, that it no longer feels like there is much progression. You win 5 games in a row, it takes 3 hours, and your division is still the same. If we remove the numerical rating, we need divisions that change more often.

-1

@balfron We have a few dozen divisions, there is no reason to bloat them further. Rating is good for getting an inaccurate but exact number, division is good as a milestone.

@blackyps said in Please show rating changes in replay vault:

Can someone in detail explain to me what you are getting from seeing the exact rating change please? Because I frankly don't get it

Because besides the rating change I can immediately see the rating value. I can see that at the time of the game one player was 1650 and the other was 1230, without needing to open the replay. That's hella important information whenever you're checking a replay, so it should be easily available.
