Matchmaker Algorithm Feedback Thread

With the last server update we deployed a new matchmaker algorithm. Note that the algorithm is only used for team games, not for 1v1. So currently only the 2v2 queue uses it, but the upcoming additional queues will also use it.
The new algorithm will treat new players as 0 rated, because it uses mean - 3 * sigma, just like the displayed rating, as the rating to find matches. The new matchmaker not only tries to create teams that both have the same total rating, but it also tries to create matches where all players have roughly the same rating.
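As a minimal sketch of the search rating described above: the formula mean - 3 * sigma is from the post, while the function name and the starting values for a brand new player (mean 1500, sigma 500) are illustrative assumptions, not taken from the server code.

```python
def displayed_rating(mean: float, sigma: float) -> float:
    """Conservative rating estimate: mean minus three standard deviations."""
    return mean - 3 * sigma


# Assumed starting values for a player with no games yet (illustrative only).
NEW_PLAYER_MEAN = 1500.0
NEW_PLAYER_SIGMA = 500.0

new_player = displayed_rating(NEW_PLAYER_MEAN, NEW_PLAYER_SIGMA)
# A hypothetical established player whose sigma has shrunk after many games.
veteran = displayed_rating(1800.0, 80.0)

print(new_player)  # 0.0
print(veteran)     # 1560.0
```

Under these assumptions a new player is searched for as if they were 0 rated, and their search rating rises as sigma shrinks, even if the mean stays the same.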
In theory this should prevent matches where e.g. a 400 and a 1600 rated player are in the same game. However the system gives everyone a bonus for each time they have not been matched. So during relatively quiet hours you could still get more unbalanced matches. You could also see these constellations, if they queued up as a premade team, obviously.
As a final note, especially when the queue is busier, it is expected that there is a certain "float" of unmatched players after matching. This is because the algorithm always tries to find the best matches. The float is roughly the same size as the games to be found, so around 4 for 2v2 and 8 for 4v4.

Please use this thread for any feedback you have regarding the matchmaker algorithm including the 1v1 queue.

However the system gives everyone a bonus for each time they have not been matched. So during relatively quiet hours you could still get more unbalanced matches. You could also see these constellations, if they queued up as a premade team, obviously.

This is a very good idea, but it is slow by its nature. This means that even if a bad match is available, you still need to wait maybe 15 minutes to get it. When I play 2v2 with a friend we (700 & 1200) would rather play against 2x 1300 now than wait so long that we give up on playing for the night. We know that after 11 pm the matchmaking queue begins to empty while we have time left for 1 game, so we would probably want to get the bad match as fast as possible.

Could we get some kind of a "Wide rating match" checkbox? If no match is possible, the system would take all players who have checked this option and immediately make the least-bad possible matches among those players, while still applying some limits, since there is no point in a 0 rated vs. 2500 game.

Taking my example, if both the 1300's and us all checked this option then we get a match.

would rather play against 2x 1300 now than wait

There’s really not much point in discussing this because people’s opinions range the entire spectrum. Many would rather wait longer for a better match than to go into a game knowing they’re gonna lose. I think we just have to keep tweaking it until everyone is moderately unsatisfied with it xD. And based on the lack of complaints I’ve seen over the last year or so, it seems like we’re pretty close except for the top rating bracket where there really are too few players for there to be much of a point in waiting at all. We’ll see what people comment in this thread though.

@askaholic I think you missed the key point of my suggestion: a checkbox to enable wide matching at that specific time, not always.

Sometimes waiting longer is not an option anymore because the game evening is almost over. We already know that there are fewer and fewer players from 11pm on. By the time the automatcher has widened its search range enough to include the aforementioned 1300's, they might already have logged out, if we haven't logged out ourselves.

@Valki I didn't miss it, I just don't like the idea. Probably it would be better to just do it automatically to avoid cluttering the interface. There are already like 10 buttons on that page as it is, and I'm not convinced that this is so important that it needs its own button. I also really don't like how it shoves the internals of the matchmaking algorithm in your face. Now if you're a new player you gotta figure out how the heck the matchmaker algorithm works before you know whether or not you should tick that box, and it's not something that you can even learn by just trying it, since it will be really hard to notice what it's doing (there is no immediate feedback, and in many situations it's completely irrelevant). Not to mention the added work of having to program that in both the client and the server.

The idea of a wider rating match mode might be ok, but I think it's unnecessary to have every player need to make that selection manually.

@askaholic If you got my point, it would help me see that if you addressed it directly or at least acknowledged it.

In theory this should prevent matches where e.g. a 400 and a 1600 rated player are in the same game. However the system gives everyone a bonus for each time they have not been matched. So during relatively quiet hours you could still get more unbalanced matches.

I very much hoped we would avoid the possibility of having such imbalanced games. They were terrible before, so much so that I stopped playing 2v2s altogether.

Being in Australia, most times are quiet times. If I have to wait a long time to get a balanced game then so be it. But waiting a long time to only get an unbalanced game is the worst possible outcome for me.

I appreciate opinions will vary and wish you luck in developing an algorithm that the majority of the community will be happy with.

askaholic said in Matchmaker Algorithm Feedback Thread:

Now if you're a new player you gotta figure out how the heck the matchmaker algorithm works before you know whether or not you should tick that box

No, you wouldn't have to understand that, you would just have to know whether you get matched with people close to your rating or not. That could for example be explained in 1 or 2 sentences when you hover over the button.

Imo you should never get unbalanced games, no matter how long you have been searching. Getting no match is better than getting a terrible one. Other people just want to get fast games and are fine with whatever. There's no way to please both unless you add an option.

Now 2k bracket would have no games at all xddd

@femtozetta This has already been discussed to death, but there are many problems with that thinking. First off, for new players it’s simply impossible to create a balanced game since they don’t have a stable rating yet. They simply need to get any game regardless of how balanced it is so that their rating has a chance to settle. And as @fruitieN00b pointed out, it’s also a problem at the top ratings because of the lack of players.

I also don't like the idea of a button. It is a bandaid for the problem that currently the matchmaker doesn't do a good job.
Your example match of 700 + 1200 vs. 1300 + 1300 has a whopping 700 rating disparity. For established players this shouldn't match, because the chance for the first team to win this is extremely small.
At the moment it does match eventually, because there is a bug in the current server version that rates extremely badly balanced games better than it should. If you want to play this anyway you could ask in aeolus who is queued up and if they want to play a custom instead, but this doesn't warrant an extra UI element.

To give you all a better understanding of how the algorithm works, I prepared some graphs for you. I created a script that passes artificial data to the matching algorithm and plots the results. In the bottom right you can see the rating distribution used for that run. Newbies are people with at most 10 games played in that rating. The distribution is based on real data from the global rating. Newbies get extra bonuses to match faster. This is especially important, because they can drop to extremely low ratings if they lose their first games, and they would then have the same difficulty getting matched as top players. We don't want them to get stuck, so we need to help them a little to get games. "Search" in the diagram refers to a queued up party, which can contain multiple players. In the top right you can see the wait time of each search based on their average rating.
The graph in the top left shows some metrics about the created games. The games were sorted beforehand, so the game number doesn't correlate to when the game got created. Rating disparity is the difference in total rating between the two teams. Rating deviation is the standard deviation of the ratings of all the participating players in the game. Skill difference is the rating difference between the lowest and highest rated players in the game. It is roughly 2.5 to 3 times the standard deviation. This is just mathematics. You can ignore the rating deviation line and instead focus on the rating disparity and skill difference, because these are the "hands on" game metrics. Finally, the graph in the lower left depicts the wait time again, this time sorted by wait time. Honestly, the most interesting parts are the averages and means written in the top right corner of the plot.
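To make the three metrics concrete, here is a small sketch that computes them for the 700 + 1200 vs. 1300 + 1300 example from earlier in the thread. The function name and the use of the population standard deviation are assumptions for illustration; the plotting script itself may compute the deviation differently.

```python
from statistics import pstdev


def game_metrics(team_a, team_b):
    """Compute the three per-game metrics for two teams of ratings."""
    players = list(team_a) + list(team_b)
    return {
        # Difference in total rating between the two teams.
        "rating_disparity": abs(sum(team_a) - sum(team_b)),
        # Standard deviation over all players (population form assumed).
        "rating_deviation": pstdev(players),
        # Gap between the highest and lowest rated player in the game.
        "skill_difference": max(players) - min(players),
    }


metrics = game_metrics([700, 1200], [1300, 1300])
print(metrics["rating_disparity"])  # 700
print(metrics["skill_difference"])  # 600
print(round(metrics["rating_deviation"], 1))
```

Note that the two "hands on" metrics measure different things: this game has a skill difference of only 600, yet the teams are 700 points apart in total.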
All plots have fixed y ranges (except the rating distribution). I did this, so it is easier to spot differences. This means however that some very high values are cut off. I consider these outliers, so it doesn't matter too much, just keep it in mind. I still have the maximum values in my spreadsheet.

This should be enough introduction, let's start with the bucket team matchmaker. This one was used until the server update 10 days ago.

The currently running configuration of the new matchmaking algorithm looks like this:

As you can see, it improved the overall game quality and equalized the wait time a bit. It still suffers from extremely bad games, just like the bucket team matchmaker. The reduced wait time outliers are what you experience as getting very bad games when you are high rated or are queueing during quiet hours. This is mainly because of the mentioned bug, but we can still get some more performance by tuning the parameters.
This is what I came up with, that will be available with the next server update:

As you can see, the lines in the top left don't go off the charts anymore, so we got rid of the horrible games. The general skill difference also improved a bit. The top players are back to being in queue forever if no suitable match can be found. The careful use of bonuses makes sure that we match more aggressively on the lower rated end. By the way, this is why you see these spikes at the end of the curves. They are the newbie matches being more lenient with balance.

Because the algorithm is configurable we will even be able to see some of these improvements before the next server update.

I hope I could answer some of your questions. I know that I covered a lot of stuff and I glossed over some of the details to keep the post readable, so if you have any follow up questions, don't hesitate to ask.

The biggest issue you are running into is the implicit assumption behind your metrics, namely that a rating difference represents the same skill difference at all rating levels. For example, a 500 rating difference from 800 to 1300 represents a much greater difference in skill than one from 1900 to 2400. This "error" is also present in opti balance games when there are large skill differences, and it results in very unbalanced games even though the algorithm can assign them a very high game quality metric.

Also, it doesn't appear that your game numbers correspond to specific games. But rather that you monotonically sorted your metrics and plotted them. If you plotted specific games along the x-axis, sorted by rating disparity and plotted the skill disparity, that would be revealing.

Also, I really wouldn't be worrying about the wait times for the top end of the spectrum. There simply aren't many of them and to have them all online at the same time, randomly, is extremely unlikely. But if a competitive scene develops, they will do what they already do and communicate with each other and search at the same time, in a cooperative fashion.

Having hard limits on skill differences at the different rating levels would be what I would want to see. Should just take a few if functions...

These are the exact same differences as far as how Trueskill works. To be 2300 you need a positive win rate against 2200s, to be 400 you need a positive win rate against 300, both at the same rate.

The confusion likely comes from the fact that the lower the rating, the higher the typical deviation is. Meanwhile all the people at 2200 have thousands of games with minimum deviation and so their skill is "settled" and seems definitive because everything looks like the expected result.
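As a rough illustration of both points, here is a sketch using the standard TrueSkill expression for the 1v1 win probability, Φ((μ_a - μ_b) / sqrt(2β² + σ_a² + σ_b²)). The value of β and all ratings and deviations below are assumed purely for illustration and are not the server's actual parameters.

```python
import math


def win_probability(mean_a, sigma_a, mean_b, sigma_b, beta=250.0):
    """Chance that player A beats player B under a TrueSkill-style model.

    beta is the assumed performance variability; 250 is illustrative only.
    """
    delta = mean_a - mean_b
    denom = math.sqrt(2 * beta**2 + sigma_a**2 + sigma_b**2)
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1 + math.erf(delta / (denom * math.sqrt(2))))


# Same 100 point gap, same (small) deviation: identical win probability.
high = win_probability(2300, 75, 2200, 75)
low = win_probability(400, 75, 300, 75)

# Same gap, but with the larger deviation typical of unsettled players:
# the prediction moves closer to a coin flip.
low_unsettled = win_probability(400, 300, 300, 300)

print(high == low)           # True
print(low_unsettled < low)   # True
```

In this model the position on the rating scale does not enter the formula at all; only the gap and the uncertainty do.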

These are the exact same differences as far as how Trueskill works.

I guessed as much. Which is exactly the issue....

Well, the new algorithm uses displayed rating now (μ - 3σ), not trueskill mean, so that probably makes a difference.

Also, it doesn't appear that your game numbers correspond to specific games. But rather that you monotonically sorted your metrics and plotted them. If you plotted specific games along the x-axis, sorted by rating disparity and plotted the skill disparity, that would be revealing.

That's right. But I don't think it really makes any difference. What do you expect to see? I expect to see basically the same graph again, with more noise, because these metrics are correlated. You can't have all players be the exact same rating when you have a big rating disparity between the two teams.

Having hard limits on skill differences at the different rating levels would be what I would want to see. Should just take a few if functions...

What limits would you like to have?

What limits would you like to have?

Based on the highest rated player in the game:

less than 1000: no limit on skill difference
1000 to 1500: maximum skill difference of 500
more than 1500: maximum skill difference of 800

This is a first pass, obviously to be adjusted depending on results and such.
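The proposed limits could be sketched with a few if statements like this. The function names are hypothetical, and placing a highest rating of exactly 1500 in the middle bracket is an assumption, since the list above does not say which bracket it belongs to.

```python
def max_skill_difference(highest_rating):
    """Return the allowed skill difference, or None for no limit."""
    if highest_rating < 1000:
        return None  # no limit on skill difference
    if highest_rating <= 1500:  # boundary handling at 1500 is assumed
        return 500
    return 800


def game_allowed(ratings):
    """Check a candidate game against the limit for its highest rated player."""
    limit = max_skill_difference(max(ratings))
    if limit is None:
        return True
    return max(ratings) - min(ratings) <= limit


print(game_allowed([700, 1200, 1300, 1300]))   # False: difference 600 > 500
print(game_allowed([300, 900, 500, 800]))      # True: below 1000, no limit
print(game_allowed([1600, 2300, 1900, 2000]))  # True: difference 700 <= 800
```

A check like this can only reject candidate games; it never widens the range the matchmaker searches in.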

You are the first person I've ever seen suggest that the difference between a 2200 and a 2300 player is less than that between a 700 and an 800, btw; I actually misread your first post because I thought it was the classic post claiming that everyone below 1000 is the same skill level.

Also, these suggestions would make you match with an even larger difference than what currently exists. It would basically double the search range for 1500 players. Or are you suggesting that teammates can only be within these rating limits? In which case I don't understand these rating limit brackets.

Why would you have no skill limit, followed by a 500 skill limit, followed by an 800 one? That doesn't even make any logical sense. It can't be based on playerbase size because the group with 70% of the players (<1000) has no limit, but it also can't be based on any sort of rationale about skill level being larger or lower based on your place on the rating spectrum.

Also, these suggestions would make you match with an even larger difference than what currently exists.

This is demonstrably untrue because these limits can only prevent games with large skill differences; it is impossible for them to produce more... My suggestion is the imposition of hard limits where currently none exist. I cannot even begin to try and untangle what you think I am suggesting...

Why would you have no skill limit, followed by a 500 skill limit, followed by an 800 one?

I'm still interested in your explanation for this.
Also it would really make the discussion easier if you replied to more than one question per post.