FAForever Forums

    Lets Talk about Rating

    • C Offline
      Caliber
      last edited by Caliber

      Suggestion: reduce the amount of rating gained or lost from each game, which currently averages out at about 10 points per game for me.

      Aim: reduce the large rating variation players can experience through randomness.

      Why:
      fgjhrsyjys.png

      This is my rating chart. It's an example of the huge rating variation a player can experience. I haven't changed the way I play or gotten any better or worse; from what I can tell, it's purely a result of the random encounters in team games.

      Reducing the points lost or gained from each game will reduce the massive variation experienced, produce a more stable and accurate representation of player skill through rating, and produce more equally balanced games overall. A player would have to play consistently well or badly to gain or lose points, rather than having their rating move because of a few badly balanced games.

      N 1 Reply Last reply Reply Quote -4
      • Brutus5000B Offline
        Brutus5000 FAF Server Admin
        last edited by Brutus5000

        Sure. Let's ignore all the scientific analysis by the professional statisticians that Microsoft paid to develop the TrueSkill system and trust your gut feeling instead.

        Rating gain/loss is not a reward/punishment for winning/losing.

        Also, the rating number you see is just a simplified approximation of what the system uses internally.

        This has been discussed a million times. We are not going to fiddle with the TrueSkill algorithms, and nobody has come up with a better system than TrueSkill yet (e.g. Elo isn't better suited to our use case).

        He said, "I've been to the year 3000
        Not much has changed, but they live underwater
        And your great-great-great-granddaughter
        Is playin' FAF, playin' FAF"

        1 Reply Last reply Reply Quote 5
        • N Offline
          Nomander Balance Team @Caliber
          last edited by

          @Caliber The parameters are tuned using this forum post: https://forums.faforever.com/viewtopic.php?f=45&t=11698 It's much higher effort than yours so I'm inclined to trust it more. No offence, just letting you know what you're competing with.

          Your graph doesn't have any axes. Here's your global rating for the past year (the client's graph grid is very faint unfortunately):
          7a269d82-86e4-4d97-9b90-59b9ae8dbe09-{C705BFDC-26EB-4E34-9BC5-D2ADECA45E5B}.png

          Your variance of 300 rating takes ~30 net losses (with the 10 rating change/game you gave yourself). That's a number of games that I think cancels out any "random encounters" in team games. I've had large rating swings as well, and in my experience it's due to changing the group of players I play with, the map type, or just being in a different mood in terms of playstyles (strong aggressive t1-t4 unit use vs weak ultra-greedy eco or overly aggressive t1 spam). Maybe some of that sounds familiar to you? I don't want to thoroughly analyze your replays, but I did notice that you were playing much less 3v3/4v4 when you were higher rated in global. That could make you naturally gravitate to a playstyle that is better for global mapgens with high rating disparity among the players, which is different from the smaller, lower eco, lower imbalance 3v3/4v4 games.

          1 Reply Last reply Reply Quote 2
          • ZLOZ Offline
            ZLO
            last edited by

            I always thought that bumping up uncertainty slightly before every game is FAF's invention / implementation of TrueSkill. (Afaik that is why a very high rated player can lose a tiny bit of rating after winning against a low rated player.)

            Why? Because there were tons of complaints about rating moving too slowly and getting very stale over time 😄
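To illustrate the mechanism ZLO describes, here is a minimal sketch of a 1v1 TrueSkill-style win update with a pre-game uncertainty bump. It is not FAF's implementation: the draw margin is ignored, the shown rating is assumed to be mu - 3*sigma, and beta=250 and tau=15 are illustrative values (tau taken as 3% of an assumed initial sigma of 500, the percentage quoted elsewhere in this thread). The helper names are made up for this example.

```python
import math


def pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def shown(mu, sigma):
    # Assumption: the shown rating is the conservative estimate mu - 3*sigma.
    return mu - 3.0 * sigma


def win_update(winner, loser, beta, tau):
    """1v1 TrueSkill-style update for a decisive result (draw margin ignored)."""
    (mu_w, sig_w), (mu_l, sig_l) = winner, loser
    # Dynamics step: inflate both variances by tau^2 before the game is rated.
    var_w = sig_w ** 2 + tau ** 2
    var_l = sig_l ** 2 + tau ** 2
    c = math.sqrt(2.0 * beta ** 2 + var_w + var_l)
    t = (mu_w - mu_l) / c
    v = pdf(t) / cdf(t)   # size of the mean shift
    w = v * (v + t)       # fraction of variance removed
    new_w = (mu_w + var_w / c * v, math.sqrt(var_w * (1.0 - var_w / c ** 2 * w)))
    new_l = (mu_l - var_l / c * v, math.sqrt(var_l * (1.0 - var_l / c ** 2 * w)))
    return new_w, new_l


strong, weak = (2200.0, 60.0), (800.0, 60.0)
with_tau, _ = win_update(strong, weak, beta=250.0, tau=15.0)
no_tau, _ = win_update(strong, weak, beta=250.0, tau=0.0)
print(shown(*strong), shown(*with_tau), shown(*no_tau))
```

With these assumed numbers the strong player's mu rises by a fraction of a point while sigma grows by almost two points, so the shown rating drops after the win. With tau=0 the shown rating does not drop.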

            TA4Life: "At the very least we are not slaves to the UI" | http://www.youtube.com/user/dimatularus | http://www.twitch.tv/zlo_rd

            1 Reply Last reply Reply Quote 1
            • C Offline
              Caliber
              last edited by Caliber

              @Brutus5000 I don't believe I mentioned any desire to change TrueSkill.

              Rather, slowing it down a bit.

              This would mean that in my situation my chart would look more like these:

              dsjy.png

              Without such a large change in rating so quickly

              IndexLibrorumI 1 Reply Last reply Reply Quote -1
              • StrydxrS Offline
                Strydxr FAF Association Board
                last edited by

                I've heard enough, one gorbilgillion rating reduction for non Faf-Elite.

                Mods, can we ban this guy?

                1 Reply Last reply Reply Quote 1
                • IndexLibrorumI Offline
                  IndexLibrorum Moderator @Caliber
                  last edited by

                  @Caliber said in Lets Talk about Rating:

                  @Brutus5000 I don't believe I mentioned any desire to change TrueSkill.

                  Rather, slowing it down a bit.

                  Huh?

                  "Design is an iterative process. The required number of iterations is one more than the number you have currently done. This is true at any point in time."

                  See all my projects:

                  C 1 Reply Last reply Reply Quote 0
                  • FtXCommandoF Offline
                    FtXCommando
                    last edited by

                    FAF’s system is the result of fiddling with the results of professional statisticians. Our tau is like triple the value those professional statisticians recommended which is what this guy is complaining about. Still a good change though, otherwise people would be gaining or losing 3 points a game. Supposed to win 30 games in a row to climb 100 rating?
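The link between tau and the per-game change can be sketched with a simplified model: two equally rated players with the same sigma play perfectly even 1v1 games, the draw margin is ignored, and the loop runs until sigma settles. The values beta=250 and an initial sigma of 500 are assumptions for illustration, not FAF's configuration, and `steady_state_step` is a hypothetical helper.

```python
import math

# For a perfectly even match (t = 0): v = pdf(0) / cdf(0) = sqrt(2 / pi), w = v^2.
V_EVEN = math.sqrt(2.0 / math.pi)
W_EVEN = V_EVEN ** 2


def steady_state_step(tau, beta=250.0, sigma0=500.0, games=5000):
    """Return (mu change per game, settled sigma) after many even 1v1 games."""
    sigma2 = sigma0 ** 2
    step = 0.0
    for _ in range(games):
        var = sigma2 + tau ** 2               # pre-game uncertainty bump
        c2 = 2.0 * beta ** 2 + 2.0 * var      # both players share the same variance
        step = var / math.sqrt(c2) * V_EVEN   # how far mu moves for this game
        sigma2 = var * (1.0 - var / c2 * W_EVEN)
    return step, math.sqrt(sigma2)


for tau in (5.0, 10.0, 15.0):  # 1%, 2% and 3% of the assumed initial sigma
    step, sigma = steady_state_step(tau)
    print(f"tau={tau:4.1f}  settled sigma={sigma:6.1f}  mu change per game={step:5.2f}")
```

In this simplified model the settled step size works out to tau itself, so tripling tau triples the points moved per game. Real team games add more terms, so the absolute numbers will differ.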

                    C 1 Reply Last reply Reply Quote 0
                    • C Offline
                      Caliber @FtXCommando
                      last edited by Caliber

                      @FtXCommando said in Lets Talk about Rating:

                      FAF’s system is the result of fiddling with the results of professional statisticians. Our tau is like triple the value those professional statisticians recommended which is what this guy is complaining about. Still a good change though, otherwise people would be gaining or losing 3 points a game. Supposed to win 30 games in a row to climb 100 rating?

                      Yes, pretty much bang on. Seeing as a standard team game of 10 players results in a change of about 10 points, and you describe the alternative as a 3 point change, perhaps a middle ground would be suitable.

                      Also, seeing as the outcome of the game becomes vastly more difficult to influence the more players there are in the game, maybe scale the change by the number of players. For example, I see the change in a 4v4 being around 11 while larger team games are still at 10. A smaller change based on the number of players may be a worthwhile thing to look into, perhaps a decrease of 2 points for each step up in team size, like this:

                      1v1 15 points
                      2v2 13 points
                      3v3 11 points
                      4v4 9 points
                      5v5 7 points
                      6v6 5 points

                      Obviously this example excludes the player uncertainty calculations, but you get the idea.

                      The reasoning is that the lower the player count, the more you impact the game, so the rating change would be more in line with individual performance.

                      @Nomander In the link you gave, the guy who created the "optimum" parameters (axle) even states that it is very volatile, which is exactly what I'm trying to say.

                      N 1 Reply Last reply Reply Quote 0
                        • FtXCommandoF Offline
                          FtXCommando
                          last edited by

                          You can’t do what you’re talking about, because you’re using shown rating for these values and TrueSkill as a system doesn’t care about the shown rating value. Every TrueSkill adjustment is a tinkering of your mu and your sigma. Saying to “just” adjust it by X amount based on the size of the game is impossible, because TrueSkill doesn’t work like that internally.

                          If you want a lower tau, then argue for it to be 2% of sigma rather than 3%.
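FtXCommando's point can be illustrated with a small sketch. It assumes the shown rating is the conservative estimate mu - 3*sigma that is conventional for TrueSkill; the two example players and the example update are made up.

```python
def shown_rating(mu: float, sigma: float) -> float:
    # Assumption: shown rating = mu - 3*sigma (conservative skill estimate).
    return mu - 3.0 * sigma


def shown_change(before, after):
    """Change in shown rating between two (mu, sigma) states."""
    return shown_rating(*after) - shown_rating(*before)


# Two hypothetical players with different internal states...
settled = (1650.0, 50.0)
uncertain = (1800.0, 100.0)
# ...who nevertheless show the same number.
print(shown_rating(*settled), shown_rating(*uncertain))

# A hypothetical update that raises mu by 8 and lowers sigma by 1
# moves the shown rating by 8 + 3*1 points.
print(shown_change(settled, (1658.0, 49.0)))
```

Two players with the same shown rating can have different internal states, and every update moves mu and sigma separately, so a fixed "X points per game" has no single translation into the internal values.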

                          1 Reply Last reply Reply Quote 0
                          • SkratS Offline
                            Skrat
                            last edited by

                            Hi!
                            If you want to reduce rating variations, play several global 1x1 rating games. For the next few months, your rating variations will be like this. xD

                            image.png

                            Sorry for my English. I use translator

                            1 Reply Last reply Reply Quote 2
                            • N Offline
                              Nomander Balance Team @Caliber
                              last edited by

                              @Caliber said in Lets Talk about Rating:

                              @Nomander In the link you gave, the guy who created the "optimum" parameters (axle) even states that it is very volatile, which is exactly what I'm trying to say.

                              The purpose of TrueSkill is to correctly predict games, which includes getting players up to their skill level quickly and having the rating value be accurate enough after reaching that level. If you can prove that some parameter adjustment improves its predictions, that would be convincing. Axle linked a GitHub repository for his work, so someone savvy should be able to adapt it.

                              1 Reply Last reply Reply Quote 0
                              • R Offline
                                rampeer
                                last edited by rampeer

                                I wanted to pick it up, as I have some expertise (once I made a program that computes Elo rating for chess puzzles), and wanted to contribute to FAF somehow.

                                But I realized that it's hard to devise any sort of controlled experiment. Some players get better over time and some get worse, so expecting the rating to be stable is wrong, and it's impossible to differentiate between rating drift due to misconfiguration and drift due to real changes in skill. Maybe let different AIs battle each other?..

                                But my hunch is that the current change-per-game is too high. These zigzags are not normal; just look at how a rating graph looks for chess:
                                image.png
                                People stay in FAF for many thousands of games, so I do not buy the "it's for people to get to their real rating quicker" argument.

                                Also, 4.5% draw probability is just wrong. Who has that many draws?

                                S 2 Replies Last reply Reply Quote 1
                                • waffelzNoobW Online
                                  waffelzNoob
                                  last edited by

                                  that's because chess doesn't have wildcard teammates and opponents with a rating of 1800 that can perform anywhere between 1200 and 2000. global rating varies so much because the performance of your opponents and teammates is almost completely random. if you play 1v1 ladder, like in chess, your rating won't go up and down more than 100 rating

                                  frick snoops!

                                  1 Reply Last reply Reply Quote 1
                                  • R Offline
                                    rampeer
                                    last edited by rampeer

                                    I do not believe you. 1v1 rating graph looks just as jagged as global:

                                    image.png

                                    I am certain it's possible to sacrifice a bit of ramp-up time (note spike on the left) for overall smoother graph and more stable rating.

                                    Also, what about draw probability? It feels like all the coefficients are off; will look into it later.

                                    waffelzNoobW KnownSniperK 2 Replies Last reply Reply Quote 1
                                    • maudlin27M Offline
                                      maudlin27
                                      last edited by

                                      One problem is when rating covers a wide range of games as one number

                                      Most obviously an issue with global - you can play like a 1700 on one popular map and like an 800 on another
                                      Making rating take far longer to adjust makes it in turn far harder for people to try different games/maps, and could hurt retention. It also takes longer for a returning player’s rating to realise they’re not as good after a multi-year gap and only playing infrequently vs when they were playing constantly.

                                      So it feels better to err on the side of faster rating adjustments than slower (outside new players) to me. I also don’t see having a smoother graph as being all that big a benefit compared to the downsides.

                                      M27AI and M28AI developer:
                                      https://forum.faforever.com/topic/2373/ai-development-guide-and-m27ai-v81-devlog
                                      https://forum.faforever.com/topic/5331/m28ai-devlog-v294
                                      M28 trophy holders: Radde, Yew (Radde trophy, v285) and Zwaffel (Sladow trophy, v284)

                                      1 Reply Last reply Reply Quote 0
                                      • waffelzNoobW Online
                                        waffelzNoob @rampeer
                                        last edited by waffelzNoob

                                        @rampeer said in Lets Talk about Rating:

                                        I do not believe you. 1v1 rating graph looks just as jagged as global:

                                        image.png

                                        I am certain it's possible to sacrifice a bit of ramp-up time (note spike on the left) for overall smoother graph and more stable rating.

                                        Also, what about draw probability? It feels like all the coefficients are off; will look into it later.

                                        that is a graph that deviates no more than 100 rating up and down, as i said. the rating is fairly solid around 1500-1700 and that is normal because humans are inconsistent. one day we play well, one day we don't. one day we run into an opponent who's having a good day, one day we don't. same thing happens in chess, where you also lose/gain 10 rating per game btw. and there is no problem with this system because going on a 10-game winstreak and only getting 50 rating sucks

                                        and this 100 rating deviation is an entirely different scenario than what caliber described with his global rating - he lost 400, not 200 (i now see this 400 was actually exaggerated - he was 1850 and dropped to 1550, not 1900 to 1500).

                                        frick snoops!

                                        1 Reply Last reply Reply Quote 2
                                        • S Offline
                                          Sainse Balance Team @rampeer
                                          last edited by

                                          @rampeer said in Lets Talk about Rating:

                                          Also, 4.5% draw probability is just wrong. Who has that many draws?

                                          The draw probability is estimated by FAF to be 10%. The math behind it is already mentioned above. Lower draw expectation would actually increase the jumps.

                                          1 Reply Last reply Reply Quote 0
                                          • C Offline
                                            Caliber
                                            last edited by

                                            One other point I would like to raise, which I forgot in the opening statement: in my experience, most games seem to be won or lost quite heavily. Although most games are rated at 90%+ balance, they often end up heavily one-sided. I would say only around 1 in 10 games actually seems well balanced and lasts a while.

                                            1 Reply Last reply Reply Quote 0
