Test Harness

Could the network connection problems be isolated and made repeatable by building a test harness that runs against the network code instead of the game?

The test harness would capture behavior and be configurable for packet loss and packet latency. Instances of the game with this test harness could then be run on a LAN of virtual machines.
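To make the idea a bit more concrete, here is a minimal sketch of the "configurable packet loss and latency" part as a pure simulation. It does not touch the real network or any FAF code; `ImpairedLink`, `run_trial` and all parameters are names I made up for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ImpairedLink:
    """Simulated one-way link (hypothetical helper, not part of FAF)."""
    loss_rate: float = 0.0   # probability in [0, 1] that a packet is dropped
    latency_ms: float = 0.0  # fixed one-way delay added to every packet
    jitter_ms: float = 0.0   # upper bound of extra random delay
    seed: int = 0            # fixed seed so a failing run can be repeated

    def __post_init__(self):
        if not 0.0 <= self.loss_rate <= 1.0:
            raise ValueError("loss_rate must be between 0 and 1")
        self._rng = random.Random(self.seed)

    def transmit(self, send_time_ms, payload):
        """Return (arrival_time_ms, payload), or None if the packet is lost."""
        if self._rng.random() < self.loss_rate:
            return None
        delay = self.latency_ms + self._rng.uniform(0.0, self.jitter_ms)
        return (send_time_ms + delay, payload)


def run_trial(link, packet_count, interval_ms=100.0):
    """Send numbered packets at a fixed interval; return them in arrival order."""
    delivered = []
    for seq in range(packet_count):
        result = link.transmit(seq * interval_ms, seq)
        if result is not None:
            delivered.append(result)
    delivered.sort()  # jitter can reorder packets, so sort by arrival time
    return delivered
```

Because the random generator is seeded, two runs with the same settings drop and delay exactly the same packets, which is the "repeatable" part of the suggestion.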

This may be an ignorant suggestion, just trying to help.

The connection problems probably stem from routing and connection stability issues. Your idea is good for debugging pure software problems, but that is most likely not the (main) issue here.

A test harness can emulate network conditions. Sophisticated tools support mixing routing protocols just for this purpose. This is what network emulation tools do.

That said, I'm not an expert and this may be an ignorant suggestion. I just didn't want to be misunderstood.

I don't think it's that easy. It would need to be fully automated and reliable.
Assume we find someone who automates setting up two Windows VMs with a working FAF installation (that seems realistic, but I don't have the skills to do it) and also makes it work with an automatically provisioned lobby server, or possibly the test server.

How do you get past the login? You need to log in through the browser.
How do you select a game? How do you operate the FAF client?

Assuming we solve these issues:
How do we collect the results? How do we know if the game is over? How do we know if the game succeeded without a player dropping?

I am not aware of any tools that can do this. If someone shows up and does it, great. As far as I know, nobody on the dev team has the skills to do that.

"Nerds have a really complicated relationship with change: Change is awesome when WE'RE the ones doing it. As soon as change is coming from outside of us it becomes untrustworthy and it threatens what we think of is the familiar."
– Benno Rice

@brutus5000 Are you saying we need the FAF core executable to be part of the test setup? In other words, the problem may be that FAF's proprietary code (not the network components around it) causes the failures?

I was assuming the test would be run without FAF, using only the netcode components, to find the problem.

To detect a failure, just instrument/trigger on the Connection panel being raised. A test tool like Eggplant can do this, or you can just set up a listener in the code.
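A rough sketch of what I mean by a listener, assuming the code that raises the Connection panel can call a hook. `ConnectionMonitor` and its method names are hypothetical; nothing like this exists in the game or the client as far as I know.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ConnectionMonitor:
    """Collects connection events for one test game (hypothetical names)."""
    drops: List[Tuple[float, str]] = field(default_factory=list)
    finished_at: Optional[float] = None

    def on_connection_panel_shown(self, player, timestamp):
        # Call this from wherever the Connection panel is raised.
        self.drops.append((timestamp, player))

    def on_game_over(self, timestamp):
        # Call this when the game ends normally.
        self.finished_at = timestamp

    def verdict(self):
        """Return 'failed', 'incomplete' or 'passed' for this run."""
        if self.drops:
            return "failed"      # at least one player saw the panel
        if self.finished_at is None:
            return "incomplete"  # no drop, but the game never finished
        return "passed"
```

The test runner would then only need to read `verdict()` from each instance after the run to know whether a player dropped.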

Again, this may be an ignorant suggestion, so thank you for reading.

The ice adapter currently does not work without a game. The whole process is quite complicated, but the start is the following:

faf-client launches ice-adapter and tells it to listen on TCP port 12345
ice-adapter opens a socket and waits for someone to connect to TCP port 12345
faf-client launches a game instance and tells it to connect to TCP port 12345
game instance connects to ice-adapter on TCP port 12345
game instance tells ice-adapter its UDP port is 23456

Now ice-adapter is ready to relay messages.
This is the minimum requirement.
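The steps above can be sketched with two stand-ins talking over localhost. This is only an illustration of the sequence, not the real protocol: the JSON line carrying `udp_port` is a message format I invented for the sketch, the function names are hypothetical, and it binds to a free port instead of 12345 so it does not collide with anything already running.

```python
import json
import socket
import threading


def adapter_listen(server_sock, result):
    """Stand-in for ice-adapter: accept one game and read its UDP port."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        message = json.loads(reader.readline())  # invented message format
    result["game_udp_port"] = message["udp_port"]


def mock_game(tcp_port, udp_port):
    """Stand-in for the game: connect to the adapter and announce a UDP port."""
    with socket.create_connection(("127.0.0.1", tcp_port), timeout=5) as sock:
        line = json.dumps({"udp_port": udp_port}) + "\n"
        sock.sendall(line.encode("utf-8"))


def run_handshake(game_udp_port=23456):
    """Play the role of faf-client: start the adapter, then the game."""
    # The adapter opens a TCP socket and waits (port 0 = any free port).
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    tcp_port = server.getsockname()[1]

    result = {}
    adapter = threading.Thread(target=adapter_listen, args=(server, result))
    adapter.start()

    # The game connects to the adapter and reports its UDP port.
    mock_game(tcp_port, game_udp_port)
    adapter.join(timeout=5)
    server.close()
    return result.get("game_udp_port")
```

Calling `run_handshake()` returns the UDP port the adapter stand-in received, or `None` if the handshake did not complete within the timeout.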

For a good test, the game instance and the faf-client could be mocked away with simple CLI tools, which is something I am currently working on.

But the game COULD be responsible for bad networking too... we don't know. Since we do binary patching on the exe, lord knows what side effects a single change could have.

The overall problem with network stability is that the whole setup is a moving target.

  • The users' network connections change (LAN/WIFI, ISDN/ADSL/VDSL/Fibre/3G/4G/5G)
  • The users' OS changes (Windows XP -> 7 -> 10 -> 11, plus constant patches)
  • The users' OS environment changes (firewalls, antivirus, ...)
  • The game is patched.
  • The faf client is patched.
  • The underlying Java runtime is patched.
  • The server is sometimes DDoSed.