Please show rating changes in replay vault

If we follow the route of "we are all number nerds here and want to see all the internal details" (I disagree with this premise btw), then we should display rating as what it really is, a probability measure.
So it would be much more adequate to say that a player rating is 1000-1500 (minimum: mean minus three times deviation, maximum: mean plus three times deviation) instead of saying a player rating is 1000.
Somehow I doubt that is a change that people really want to see.

Losses are discouraging. Seeing yourself lose 300-400 “points” because you lost and then as a number enjoyer looking to see that other people seem to gain or lose 5-10 points later on makes it seem like your early losses just fucked you up forever.

Even at 13 years old when I came here with no statistical background I knew that the system was uncertain of my rating (because python client told me so + that my rating was being calibrated during the first 10 games) and I was able to take the emotional toll of having -500 global rating and laugh it off.
Are we accounting for 8 year olds with these arguments?

@brutus5000 said in Please show rating changes in replay vault:

I do not remember anybody demanding to have it in parallel with the rating system.

That's a very questionable way of wording it. Was everyone informed the league system would eventually overtake the rating system? Of course they wouldn't demand having it in parallel if that's what they expected to begin with. I for one like the leagues, having an icon and "title" along with your rating is cool. Would I have advocated for leagues knowing ratings would be removed? No

@blackyps said in Please show rating changes in replay vault:

Why is the scale of the game relevant? You correctly described one of the benefits of the league system, but I don't see how the amount of people in each division is important for this to function?

It's relevant for the same reason these other games have some other more specific rating that's used in more sparsely populated brackets. In SC2 they show the MMR, in League LP becomes more prominent once you hit masters (grand masters? not sure which) where there just aren't as many people and there are noticeable variations in skill within those brackets. I can look at the brackets on FAF right now and see over 100 trueskill difference in just diamond 1, then another 100 point gap between D1 and masters. That's an arguably meaningful difference that disappears when just looking at the bracket. My understanding, which admittedly could be wrong, is that more sparsely populated league systems are more prone to this. At the very least, this is certainly true in higher brackets.

negative trueskill rating in the first games of bad/unlucky players is not visible, preventing possible demotivation

Maybe a hot take, but if you get demoralized that badly by losing your first few games of FAF then you're not the kind of person who's going to get into a game like supcom and catering to people with that mentality doesn't make a practical difference. See @TheWheelieNoob first paragraph, sums up my thoughts there.

an official placement phase makes players tolerate unbalanced games in the beginning more

Agreed, but we have this anyway now. Hiding the backend rating doesn't do much here imo.

the ability to offer people a "rank reset" by first giving them some placement games again when they return after a longer period of inactivity, where they are not faced with expectations to perform

Every time I hear this, people mostly want easier games vs worse players when coming back from a break, which means a trueskill reduction. Skill levels will be the same, now they're just going to be in silver (edit: unranked) playing vs diamond players or whatever. Seems like that feels worse, not better.

hiding the variation in the points you get for different games

related: hiding the point changes on draws. Both regularly lead to complaints because people think the system is being unfair. By abstracting this and giving +1/-1 for each win/loss and 0 points on a draw we can prevent these frustrations.

All of these benefits basically require that the league system is shown instead of the rating. If we still show both we lose a lot of these benefits.

This is a particularly nice change for draws, but it's not always going to be +-1 and people are still going to find ways to complain. They do in literally every other game I've ever seen with a league system.

So it would be much more adequate to say that a player rating is 1000-1500 (minimum: mean minus three times deviation, maximum: mean plus three times deviation) instead of saying a player rating is 1000.

Somehow I doubt that is a change that people really want to see.

Too much clutter and necessary stats background for that to be meaningful to show in a lot of places, but even then it's still kind of fun for me to see that in the game lobby. So yes, I do agree again here but this is just going to the opposite extreme. Showing the absolute rating without uncertainty is the middle position that we already have.
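For anyone who wants to see the arithmetic behind that range, here is a minimal sketch. It assumes, as the quote above implies, that the single number shown today is the lower bound (mean minus three deviations). The function names and the example player are made up for illustration and are not taken from the client or server code.

```python
def rating_bounds(mean: float, deviation: float) -> tuple[float, float]:
    """Return (low, high) as mean minus and plus three deviations."""
    return mean - 3 * deviation, mean + 3 * deviation


def displayed_rating(mean: float, deviation: float) -> float:
    """Single-number rating: the lower bound of the range (assumption)."""
    low, _ = rating_bounds(mean, deviation)
    return low


def format_range(mean: float, deviation: float) -> str:
    """Format the range the way the quoted post writes it."""
    low, high = rating_bounds(mean, deviation)
    return f"{low:.0f}-{high:.0f}"


# Hypothetical player matching the 1000-1500 example:
# mean 1250 with a deviation of 250/3.
print(format_range(1250, 250 / 3))
```

The same mean with a larger deviation gives a lower displayed number and a wider range, which is the part that gets lost when only the single number is shown.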

I want to again emphasize that I do like the league system and I think it's great for FAF. I appreciate the work done here. I just don't want to also start hiding away the trueskill rating.

@thewheelienoob said in Please show rating changes in replay vault:

Even at 13 years old when I came here with no statistical background I knew that the system was uncertain of my rating (because python client told me so + that my rating was being calibrated during the first 10 games) and I was able to take the emotional toll of having -500 global rating and laugh it off.
Are we accounting for 8 year olds with these arguments?

Are you accounting for survivorship bias?

Keep in mind like half the ladder games each month (back when I cared to look into this stuff anyway) are played by people that play 1-2 games and then stop. There are a variety of reasons. Losing just sucking in RTS is one of them. Making losing suck less is one way to encourage people to play longer until they can get a win. This is why a metric ass load of games with rating systems hide your rating during a “calibration” period.

I've tried to get three different personal friends to start playing FAF and none of them did past a few games here and there. Not a single one of them had an issue with negative rating. Obviously a small sample size, but I find it hard to believe that people are quitting due to bad rating at the start and that those same people would have stayed if they didn't realize just how bad their rating is.

I have no idea what to tell you, literally one of the most common complaints is people talking about going negative rating

how can bros live with themselves knowing they got a weaker mental than 13 year old bully

@endranii I don't think what you mentioned (intimidation ingame) is the whole idea. Right now, almost everywhere player nicknames are shown, ratings are also shown beside them. By far the most common custom lobby title is "number+". When people don't want to play with someone, they're more likely to say "kick the <rating>" than "kick <nickname>".

It's plainly obvious this number is pretty much the only thing people pay attention to when forming a first impression of someone. It's so important that some go as far as to game the rating system to get a bigger number. Surely I don't have to mention Goodhart's Law or spell out why this is not ideal.

Giving people badges showing their progress and making that the culturally valuable currency means the devs can promote healthier playing habits (like tmm) while also protecting the global rating from manipulation, once it's dethroned from its current prominence.

I don't think ratings (or rating changes) need to be completely inaccessible or invisible to achieve this btw, at least not yet. Making the badges way more prominent might be enough, even while giving players the option of looking up rating if they absolutely must - but I do think it should be tedious enough to find it to discourage random people in custom lobbies from doing so for every person that joins.

@ftxcommando said in Please show rating changes in replay vault:

I have no idea what to tell you, literally one of the most common complaints is people talking about going negative rating

8b57f60c-5e25-415f-861f-0151f033e0c4-image.png

Gonna need a source on that one chief

I really don’t care to look into anything for you lol

@ftxcommando so like why gas every gamer because of the newbros? Can't you make the rating change invisible for dudes in placement/with less than [insert number] games, or just make it a toggle switched to brackets by default? It really seems shitty to force it on everyone

Skill issue

@tomma I think the answer is even simpler. It would be much easier to give 500 rating from the start so it would be harder for a newcomer to go negative. Or cap the minimum possible rating at zero. What scares some newbies is negative rating, not losses or rating drops, so there is no reason to hide only the latter.

@brutus5000 said in Please show rating changes in replay vault:

I do not remember anybody demanding to have it in parallel with the rating system.

https://forum.faforever.com/topic/4743/improvement-of-leaderboards/9?_=1717412060587

Not a gotcha moment, but I do remember this topic

Ik heb hier geen actieve herinnering aan. (Dutch: "I have no active recollection of this.")

@thewheelienoob said in Please show rating changes in replay vault:

Ik heb hier geen actieve herinnering aan. (Dutch: "I have no active recollection of this.")

Funny, but let's keep posts in English.

Thank you! My apologies.

No jokes please, this is a serious topic. I suggest you read the opening post before commenting...

@tomma said in Please show rating changes in replay vault:

@ftxcommando so like why gas every gamer because of the newbros? Can't you make the rating change invisible for dudes in placement/with less than [insert number] games, or just make it a toggle switched to brackets by default? It really seems shitty to force it on everyone

Because I wrote to expand on what Brutus was talking about and it’s just one reason for said decision. Trueskill as a leaderboard is already an arbitrary number you know, the system gives zero shits about your shown rating and the number everybody sees is just the attempt of the matchmaking system to give an in-house leaderboard.

The problem is the vast majority of people TREAT trueskill like a leaderboard rather than a probability function for matchmaking, and so you get behaviors like people seeing a 1900 lose to a 1500 alongside other variants and everyone going baboon mode because their “carry” has now cost them potentially more “points” than usual.

This means a league can hide the recalibration of rating, which can be quite volatile, behind more granular movement within leagues, so losses are less of a mental crybaby moment. Wins will mean less too, but since people are mainly loss averse, it’s typically better when you have a wash outcome such as that.

This stuff isn’t ironed out in pvp games, and you have games that only show league with no mmr, games that show mmr but league doesn’t exist, and games that show both. But the trend in modern games is definitely moving more and more towards some sort of league system to be the basis of player comprehension of their skill.

@sainserow said in Please show rating changes in replay vault:

@tomma I think the answer is even simpler. It would be much easier to give 500 rating from the start so it would be harder for a newcomer to go negative. Or cap the minimum possible rating at zero. What scares some newbies is negative rating, not losses or rating drops, so there is no reason to hide only the latter.

Your solution is a sign you don’t really understand trueskill. How exactly do you expect to “give” 500 rating? What is the combination of trueskill ratings you expect new players to actually start with in order to have 500 rating and how will this not just result in translating the population bell curve? You haven’t solved anything about losing games early having drastic impact on uncertainty but nobody having any clue what uncertainty actually means. They just see themselves fall down a lot when a loss is already painful.

I should also say FAF has attempted to resolve this issue itself via the matchmaker by making new players start at 500 mu and then interpolate to their actual rating over 10 games. This was to make it so that their first games tend to be against people they have a decent shot at competing with rather than the total stomp of facing someone with 1500 mu that trueskill would traditionally prefer. It's about as close to a 0% chance as it can get for a new player to beat a 2000 mu player, so the attempt by trueskill to be theoretically efficient with its data was really just not practically efficient.
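Roughly what that interpolation looks like, as a sketch only. The post above only gives "500 mu", "interpolate" and "10 games"; the linear blend, the constant names and the function name here are my assumptions, not the actual matchmaker code.

```python
NEWBIE_START_MEAN = 500.0  # starting mu for matchmaking, per the post above
PLACEMENT_GAMES = 10       # number of games over which the blend happens


def matchmaking_mean(actual_mean: float, games_played: int) -> float:
    """Blend from the newbie start value to the real mean (linear, assumed)."""
    # Clamp so that negative counts or more than 10 games stay in range.
    clamped = min(max(games_played, 0), PLACEMENT_GAMES)
    weight = clamped / PLACEMENT_GAMES
    return NEWBIE_START_MEAN + weight * (actual_mean - NEWBIE_START_MEAN)


# A hypothetical new player whose trueskill mean is 1500:
for games in (0, 5, 10):
    print(games, matchmaking_mean(1500.0, games))
```

With these assumptions the player is matched as if they were 500 in game one, 1000 halfway through, and at their real mean from game ten onward.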

I'll recap some of the arguments for the league system because the original discussion is some years old already and was mainly held between contributors, so quite a few people might have missed it.

One main thing is that we want to hide the intricacies of the Trueskill calculations, because they are quite complicated and not easily understandable by just looking at rating changes.
From this it becomes clear that we can't just keep the rating system next to the leagues, because then we defeat one of the main purposes of the change.

About the negative rating thing:
It happens regularly. I took the time to search for some examples and found multiple in just the last few months
df8ff152-d474-4552-92d0-8b47a85a06c1-grafik.png
81644939-1fb3-42e6-a843-37dd39f022ca-grafik.png
6a35ade0-1539-498b-8c65-8ec6be3f3daa-grafik.png
Capping rating at 0 is not a good solution for this. The capping can only be done visually or you fuck up the trueskill calculations. But then if you are negative and you win or lose games you don’t see any rating change at all anymore which will look like a bug to people.
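A tiny sketch of why a visual-only cap looks like a bug. The function names are hypothetical and the numbers are made up; the only thing taken from the argument above is that the cap applies to the display and not to the underlying rating.

```python
def capped_display(rating: float) -> float:
    """Rating as shown to the player when the display is capped at zero."""
    return max(0.0, rating)


def visible_change(before: float, after: float) -> float:
    """Rating change the player would see with the capped display."""
    return capped_display(after) - capped_display(before)


# Underlying rating goes from -200 to -150 after a win (a real +50),
# but both values display as 0, so the player sees no change at all.
print(visible_change(-200.0, -150.0))

# Crossing zero shows only part of the real change (+30 instead of +50).
print(visible_change(-20.0, 30.0))
```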

I also found a nice example for confusing point changes on a draw
f95ee8f7-b82b-4779-addc-ac96367ba199-grafik.png

There are more problems: Different point changes for people in the same team and different point changes for wins in different games. I didn't collect examples for these, but I hope you all know that these happen and that they are not easily understandable. I've repeatedly seen people complain about being treated "unfairly" by the rating system.

From discussions on Discord I gathered that people have two main complaints about the league system as it currently is:

  • The rating range of people in a division is too wide, so divisions are meaningless for assessing skill level.
  • Top players are not really sorted by rating, robbing them of a proper leaderboard.

We can address both problems with slight changes to the system. Divisions are already assigned to a rating range. At the moment players are placed 100 rating lower than they are to provide a sense of progression when they rise to their designated division simply by playing. This has the side effect that people are more spread over divisions. We can reduce this by placing people exactly where they are supposed to be. Then a division will hold people of a bracket that is slightly smaller than 100 rating, basically giving you the same granularity as global rating does at the moment. This should be close enough.
We can also change the grandmaster division so that the score points directly reflect the rating. This way people in grandmaster are always sorted by the underlying rating.
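To illustrate the placement change, here is a sketch with invented numbers. The division boundaries (every 100 rating from 0 to 1900) and the function name are hypothetical and not the real league configuration; only the "placed 100 rating lower" offset comes from the description above.

```python
from bisect import bisect_right

# Hypothetical lower bounds of 20 divisions, each 100 rating wide.
DIVISION_FLOORS = list(range(0, 2000, 100))


def division_index(rating: float, placement_offset: float = 0.0) -> int:
    """Index of the division a player is placed in.

    placement_offset=100 mimics the current behaviour of placing people
    100 rating lower than they are; 0 places them exactly by rating.
    """
    effective = rating - placement_offset
    index = bisect_right(DIVISION_FLOORS, effective) - 1
    # Anyone below the lowest floor lands in the lowest division.
    return max(index, 0)


# A 1250-rated player: one division lower with the offset than without it.
print(division_index(1250, placement_offset=100), division_index(1250))
```

With the offset, two players of the same rating can sit a division apart depending on how much they have played since placement, which is the spread that exact placement would reduce.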

One other advantage of the league system is that because of the seasons we have a leaderboard that isn't cluttered with people that haven't played in years. Instead you have to have played at least three games in the current season to show up.
We could easily increase the season length to six months for example if people think active players should be visible for longer.
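As a sketch of that activity filter: the `Entry` fields, the names and the sample data below are made up for illustration, and only the "at least three games in the current season" rule comes from the post.

```python
from dataclasses import dataclass

MIN_GAMES_THIS_SEASON = 3  # threshold mentioned above


@dataclass
class Entry:
    name: str
    score: int
    games_this_season: int


def season_leaderboard(entries: list[Entry]) -> list[Entry]:
    """Keep only players active this season, best score first."""
    active = [e for e in entries if e.games_this_season >= MIN_GAMES_THIS_SEASON]
    return sorted(active, key=lambda e: e.score, reverse=True)


# Hypothetical players: the inactive high scorer does not show up.
players = [
    Entry("veteran_on_break", 2100, 0),
    Entry("regular", 1500, 12),
    Entry("newcomer", 900, 3),
]
print([e.name for e in season_leaderboard(players)])
```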

The plan is to soon show the divisions in the ingame scoreboard instead of rating. This is already coded.
Later we can also show them in the custom lobby, but this is not yet implemented and might take a while because the lobby code is pretty nasty to work with.