@nex
I don't know, kind of feels like we're exiting the vampire castle.
A rigorous analytical solution for a better team rating is beyond my mathematical ability without some heavy research, but that doesn't mean there isn't one floating around out there.
Also, given the setup we have, it's possible to silently test alternative systems (TrueSkill was tested against Elo the same way, with each system graded on the percentage of wins it predicted correctly).
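To make the "percent of wins predicted correctly" idea concrete, here's a rough sketch of how you could score a rating system against match history. It uses plain 1v1 Elo as the stand-in; the function names, the K-factor of 32, and the 1500 starting rating are my own assumptions for illustration, not how FAF actually does it. A team game would need some way of combining player ratings first.

```python
from typing import Dict, Iterable, Tuple

Match = Tuple[str, str, bool]  # (player_a, player_b, a_won)


def elo_expected(r_a: float, r_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def prediction_accuracy(matches: Iterable[Match],
                        k: float = 32.0,       # assumed K-factor
                        start: float = 1500.0  # assumed starting rating
                        ) -> float:
    """Replay matches in order and return the fraction of games where the
    higher-rated player (before the game) actually won.

    Games where both players have equal ratings are skipped, since the
    system makes no prediction there.
    """
    ratings: Dict[str, float] = {}
    correct = 0
    counted = 0
    for a, b, a_won in matches:
        r_a = ratings.get(a, start)
        r_b = ratings.get(b, start)
        p_a = elo_expected(r_a, r_b)

        # Grade the prediction made BEFORE the result is known.
        if p_a != 0.5:
            counted += 1
            if (p_a > 0.5) == a_won:
                correct += 1

        # Then update ratings with the actual result.
        score_a = 1.0 if a_won else 0.0
        ratings[a] = r_a + k * (score_a - p_a)
        ratings[b] = r_b + k * ((1.0 - score_a) - (1.0 - p_a))
    return correct / counted if counted else 0.0


history = [("a", "b", True), ("a", "b", True), ("a", "b", False)]
accuracy = prediction_accuracy(history)
```

Swap in a different rating system behind the same interface, run both over the same history, and compare the two accuracy numbers.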
Mathematically the metric we want is the player's ability to play the game and cooperate with their team well.
I would refer to that as a goal/outcome, but now I know where you're coming from. And I agree, that is the outcome I'm looking for.
I can spitball about what metrics would be good for FAF (in-game score? mass efficiency? kills?) but can't confidently say that any of them wouldn't be abused or go sideways.
@FtXCommando
A meta-level correction would be to incorporate map statistics. Games on map gen maps have a larger effect on rating, maps that other people play a lot have a larger effect (and vice versa), and playing the same map over and over has a smaller effect*.
*Could grade this based on size, terrain, and distribution of mass spots to catch all the gap clones.
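A very rough sketch of what that weighting could look like, purely to show the shape of the idea. Everything here is made up for illustration: the function name, the inputs, and every constant (1.25, 0.75, 0.5) are assumptions that would need tuning against real data.

```python
def map_weight(is_generated: bool,
               popularity: float,
               player_repeat_share: float) -> float:
    """Hypothetical multiplier for how much one game moves a rating.

    popularity:          share of all community games played on this map (0..1)
    player_repeat_share: share of this player's recent games on this map (0..1)
    All constants below are placeholders, not tuned values.
    """
    if not (0.0 <= popularity <= 1.0 and 0.0 <= player_repeat_share <= 1.0):
        raise ValueError("shares must be between 0 and 1")

    weight = 1.0
    if is_generated:
        weight *= 1.25                        # map gen maps count for more
    weight *= 0.75 + 0.5 * popularity         # widely played maps count for more
    weight *= 1.0 - 0.5 * player_repeat_share  # spamming one map counts for less
    return weight
```

The result would then scale the rating update for that game, e.g. multiplying the K-factor in an Elo-style update.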
EDIT: After asking my friend Mr. GPT-4, it looks like TrueSkill 2 is a good starting point for all of the above.