The biggest issue you are running into is the implicit assumption behind your metrics: that a given rating difference represents the same skill difference at all rating levels. For example, a 500-point difference from 800 to 1300 represents a much greater difference in skill than one from 1900 to 2400. This "error" also carries over into opti balance games when there are large skill differences, and it results in very unbalanced games even though the algorithm can assign them a very high game-quality metric.
Also, it doesn't appear that your game numbers correspond to specific games; rather, it looks like you sorted each metric monotonically and plotted it on its own. It would be more revealing to put specific games along the x-axis, sorted by rating disparity, and plot the skill disparity for each of those same games.
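A minimal sketch of what I mean, assuming you have per-game records with a rating disparity and a skill disparity (the field names and the sample values here are made up for illustration). The point is that both series are sorted by the same key, rating disparity, so each x position is one actual game, and the two series can then be fed to whatever plotting tool you use:

```python
# Hypothetical per-game records: (rating disparity, skill disparity).
# The numbers are invented for illustration; substitute your real data.
games = [
    (120, 0.4),
    (500, 2.6),
    (60, 0.2),
    (300, 1.1),
    (450, 2.1),
]

# Sort the actual games by rating disparity so each x position is one game,
# then extract the two series to plot against the game index.
games_sorted = sorted(games, key=lambda g: g[0])
rating_series = [g[0] for g in games_sorted]
skill_series = [g[1] for g in games_sorted]

print(rating_series)
print(skill_series)
```

If the rating metric were consistent across the range, the skill-disparity curve would track the rating-disparity curve; divergence between the two is exactly the compression effect described above.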
Also, I really wouldn't worry about wait times at the top end of the spectrum. There simply aren't many players there, and having them all online at the same time, randomly, is extremely unlikely. If a competitive scene develops, they will do what they already do: communicate with each other and queue at the same time, cooperatively.
What I would want to see is hard limits on the allowed rating difference at different rating levels. It should only take a few if statements...
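Something like this sketch, in Python. The tier boundaries and caps are placeholder values, not tuned numbers; the idea is just that the cap tightens at low ratings, where a point of rating corresponds to more skill, and relaxes toward the top:

```python
def max_allowed_gap(rating: int) -> int:
    """Tiered cap on rating difference for a pairing.

    The tiers and caps below are illustrative placeholders. Low-rated
    pools get tighter caps because a fixed rating gap there represents
    a larger skill gap; high-rated pools can tolerate wider gaps.
    """
    if rating < 1000:
        return 150
    if rating < 1500:
        return 250
    if rating < 2000:
        return 350
    return 500


def within_limit(rating_a: int, rating_b: int) -> bool:
    """Accept a pairing only if the gap fits the cap at the LOWER rating,
    since that is where the gap hurts balance the most."""
    cap = max_allowed_gap(min(rating_a, rating_b))
    return abs(rating_a - rating_b) <= cap
```

With these placeholder numbers, an 800-vs-1300 pairing is rejected (gap 500, cap 150 at the lower rating) while a 2000-vs-2400 pairing is accepted, which matches the asymmetry described at the start of this answer.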