Status report and new roadmap for the ICE adapter

This article describes a project around the FAF ICE adapter. If you have no clue what that is, I'd recommend reading this blog post first.

Some of you might now think: "Dude, why do you have time to blog? We have bigger problems! Fix the gmail issue right now!" Unfortunately my life is very constrained due to my very young children, so sometimes I can work on little projects but can't tackle server-side things.

Why now?

After successfully staying away from it, I recently started digging into the inner workings of the ICE adapter. There are plenty of reasons that led to this decision:

  1. The ICE adapter is a critical part of the infrastructure. But only the original author knows the code base, and he claimed it to be in bad shape (which, after thorough analysis, I can agree with). Both facts are basically unacceptable for the long-term health of FAF.
  2. We tried adding more coturn servers to improve the situation on non-Europe continents, but we're facing some issues that could be best solved in the ICE adapter itself.
  3. The previous author of the ICE adapter has a serious lack of time to implement features.
  4. The ICE adapter still relied on Java 8 (the last long-term-support release of Java in which the JavaFX UI libraries were bundled with the Java release), but all other pieces are already on Java 17 (!). Right now it only works in the Java client due to some dirty hacks.

Constraints

The ICE adapter is a very fragile piece of software, as we have learnt from several attempted changes that required a rollback to the previous state. The problem here is that even with intensive testing with 20+ users, we still encountered users with issues that never occurred during testing. Every computer, every system, every user has a different setup of hardware, operating system (and patch level), configuration, permissions, antivirus, firewall setups, internet providers and so on.

Every single variable can break connectivity and we will never know why.
This led to the point where the fear of breaking something held us back from adding potential improvements.

Analysis

Before I started refactoring I went through the code to gain a better understanding and noticed a few points:

  • the release build process still relied on Travis CI which no longer works
  • many libraries the ice adapter is built on are outdated
  • we forked some libraries, made some changes and then never kept up with the upstream changes
  • some code areas reinvent the wheel (e.g. custom argument parsing)
  • ice adapter state is shared all over the application in static variables with no encapsulation
  • a lot of threads are spawned for simple tasks
  • a lot of thread synchronization is going on as every thread can (and also does) modify the static (global) state

Almost none of this is related to actual ICE / networking logic. So improving the code here would make maintaining it easier and would also make it easier for future developers to dive into the code without much risk of breaking anything.
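To illustrate the point about static state, here is a minimal sketch of what encapsulation could look like. It is not taken from the ICE adapter code: the class name PeerRegistry, the State values and the method names are all made up for this example. The idea is simply that one object owns the per-peer state and exposes a few atomic operations, instead of every thread poking at static fields.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: one object owns the peer state instead of static fields. */
final class PeerRegistry {
    // Hypothetical states for illustration, not the adapter's real ones.
    enum State { NEW, CONNECTING, CONNECTED, DISCONNECTED }

    // The only place where peer state lives; nothing here is static.
    private final Map<Integer, State> peers = new ConcurrentHashMap<>();

    /** Registers a peer in state NEW unless it is already known. */
    void register(int playerId) {
        peers.putIfAbsent(playerId, State.NEW);
    }

    /** Atomically moves a known peer to a new state; returns false for unknown peers. */
    boolean transition(int playerId, State next) {
        return peers.computeIfPresent(playerId, (id, old) -> next) != null;
    }

    /** Returns the current state, or null if the peer is unknown. */
    State stateOf(int playerId) {
        return peers.get(playerId);
    }

    /** Immutable copy for readers such as a debug view. */
    Map<Integer, State> snapshot() {
        return Map.copyOf(peers);
    }

    public static void main(String[] args) {
        PeerRegistry registry = new PeerRegistry();
        registry.register(42);
        registry.transition(42, State.CONNECTING);
        System.out.println(registry.snapshot());
    }
}
```

Callers get the registry passed in (for example via a constructor), which also makes it possible to create a fresh instance per test.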

First steps and struggles

  1. First of all I didn't want to continue developing on Java 8, as it has reached its end-of-life and the language itself has made some nice progress in the last 6 years. So I migrated to JDK 17, which also meant fixing the library situation for the JavaFX UI. JavaFX used to be bundled with the JDK; now it comes as its own set of libraries. That has a drawback though: the libraries are platform specific, thus we now need to build a Windows and a Linux version.
  2. Handling the platform specific libraries also made me migrate the build pipeline from Travis CI to GitHub Actions (as almost all FAF projects have by now). Now we also have a nice workflow in the GitHub UI to build a release.
  3. When trying to integrate the new version into the client I found out about the hacky way we made the current ICE adapter work with the old Java 8 version despite not having JavaFX on board: the JavaFX libraries of the client were passed to the ICE adapter. So I could use that too! But that needed a 3rd release without JavaFX libraries inside. This required further changes to the build pipeline (we still need dedicated Windows/Linux versions for non-Java clients!).
  4. When testing the new ICE adapter release I was surprised that I could no longer open the debug window. But it turned out to have been broken all along in previous versions. The code that injects the JavaFX libraries into the ICE adapter did not take into account that the Java classpath separator for multiple files is different on Windows (semicolon) and Linux (colon). So I actually fixed that, hurray!
  5. I replaced the custom argument parsing with a widely used library called picocli. This makes reading and adding new command line arguments in code much easier.
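The classpath separator issue from step 4 can be sketched in a few lines. This is not the actual fix in the client, just an illustration of the general technique: the JDK exposes the platform's separator as File.pathSeparator, so there is no need to hard-code ";" or ":". The class name ClasspathJoiner and the jar file names are made up for this example.

```java
import java.io.File;
import java.util.List;

/** Hypothetical helper that builds a classpath string from individual entries. */
final class ClasspathJoiner {

    /** Joins entries with an explicit separator (handy for testing both platforms). */
    static String join(List<String> entries, String separator) {
        return String.join(separator, entries);
    }

    /** Joins entries with the separator of the platform the JVM is running on. */
    static String joinForCurrentPlatform(List<String> entries) {
        // File.pathSeparator is ";" on Windows and ":" on Linux and macOS.
        return join(entries, File.pathSeparator);
    }

    public static void main(String[] args) {
        // Made-up jar names, for illustration only.
        List<String> jars = List.of("ice-adapter.jar", "javafx-base.jar", "javafx-controls.jar");
        System.out.println(joinForCurrentPlatform(jars));
    }
}
```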

The switch to Java 17 is a potentially breaking change. Thus the changes above already ended up in a new release that will probably be shipped to you with the next client release.

My next attempt was to remove all the static variables and encapsulate the state of the application to avoid a lot of the multi-threading problems that potentially lurk everywhere. However, while doing this I struggled, mainly because of the way the ICE adapter uses JavaFX:

  • the ICE adapter has an unusual way of launching its GUI after the application is already running
  • the Java UI always needs to run in a separate main thread, which is somewhat weird
  • JavaFX doesn't give you a handle on the application window you launch and you can't pass it arguments

Also the UI debug logic leaks into every component and tries to grab data from everywhere. So a good and uncritical refactoring would be to rewrite the UI part...

More requirements

Slim it down

I already mentioned that we have to consider the non-Java clients. @Eternal is building a new one and there is also still the Python client. For the non-Java clients the ICE adapter is a pain to package, because they need to ship a Java runtime (~100 MB) plus an ICE adapter with UI libraries (~50 MB) for a very "tiny" tool.

Eternal recently asked whether it is possible to ship a lighter Java runtime. Unfortunately the current answer is "it's too much effort". The Java ecosystem has actually acquired features to build native executables from Java applications (via the new GraalVM compiler), and this was also extended to JavaFX applications (Gluon Substrate). However, with JavaFX this is very complicated and requires a lot of knowledge and experience we don't have.

It would be a more realistic goal if the ICE adapter didn't require a Java GUI. As someone who has recently been involved in more web development, I was thinking about shipping a browser-based GUI connected to the ICE adapter via WebSocket.

We need more data

When I was designing a WebSocket protocol for the ICE adapter <-> UI communication, I was struck by an interesting idea.

What if we could use this data stream to track the connectivity state and gather some telemetry? This could give us insight about which connections are used most, which regions struggle the most, or if an update made things worse.

Thus I started working on a Telemetry Service that would be capable of collecting all the data of the ice adapters.

Full game transparency

But the idea started to mutate even further. Why would you want to see only your ICE adapter debug information? Maybe you want to see where a connection issue is happening right now between other players.

Also, why would I bundle the UI logic with an ICE adapter release when it could be a centrally deployed web app that can be updated independently of ICE adapter releases?

So in this scenario the ICE adapter sends all of its data to the telemetry server. Players then connect to the telemetry server UI and can see a debug view of all players connected to the game and to each other.
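To make the data flow a bit more concrete, here is a rough sketch of what a single message from an ICE adapter to the telemetry server could look like. This is not the real protocol: the message type, the field names and the state values are all assumptions made up for this example. It only shows the general shape, a small typed message that is serialized to JSON before being sent over the WebSocket.

```java
/** Hypothetical telemetry message types; names and fields are assumptions. */
final class TelemetryMessages {

    // Made-up connection states, for illustration only.
    enum State { CONNECTING, CONNECTED, DISCONNECTED }

    /** "The connection from playerId to remotePlayerId changed its state." */
    record PeerStateChanged(int gameId, int playerId, int remotePlayerId, State state) {

        /** Serializes by hand; safe here because all fields are ints or enum names. */
        String toJson() {
            return "{\"type\":\"peerStateChanged\""
                + ",\"gameId\":" + gameId
                + ",\"playerId\":" + playerId
                + ",\"remotePlayerId\":" + remotePlayerId
                + ",\"state\":\"" + state.name() + "\"}";
        }
    }

    public static void main(String[] args) {
        PeerStateChanged message = new PeerStateChanged(7, 1, 2, State.CONNECTED);
        // In a real adapter this string would be sent over the WebSocket.
        System.out.println(message.toJson());
    }
}
```

Note that a message like this carries only ids and a status, no IP addresses.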

This is what I've been working on the last 3 weeks, and it's in a state where we can replace the ui and see the whole game state for all players. But we all know: Pics or didn't happen, so here is a current screenshot of the future ui (with fake data):

e79325b9-6d33-4894-991b-666cb347509d-image.png

A new roadmap

So here is the battle plan for the future:

  1. Release the Java17 ICE adapter to the world.
  2. Finish the basic telemetry server and ice adapter logic and ship it for testing (keep the old debug ui for comparison)
  3. Persist telemetry data into some meaningful KPIs, so we can observe the impact of new ice adapter versions
  4. Drop the old debug UI and continue refactoring the ICE adapter into a better non-static architecture
  5. Update the core ICE libraries and see if things improve
  6. Try building native executables for the ice adapter

Are you interested in joining this quite new project? (The telemetry server is really small, written in Kotlin with the Micronaut framework.) This is your chance to get into FAF development on something with comprehensible complexity! Contact me!

"Nerds have a really complicated relationship with change: Change is awesome when WE'RE the ones doing it. As soon as change is coming from outside of us it becomes untrustworthy and it threatens what we think of is the familiar."
– Benno Rice

Looking at this and all the problems, I am surprised the old ICE adapter lasted this long. Considering how outdated it all is, it would probably also raise security concerns.

In regards to: "But the idea started to mutate even further. Why would you want to see only your ICE adapter debug information? Maybe you want to see where a connection issue is happening right now between other players."

I assume you mean this from an admin point of view, where you (as an admin) could see connection issues, and not as something every player could easily look at. Which data is viewable would then depend on whether you are a user or an admin.
I only mention this with regard to GDPR legislation, as letting all users see the IPs they're connected to mid-game could be a fairly large security risk.

Ras Boi's save lives.

@lord_asmodeus
The UI does not expose any IPs, or any information that is not already available if you join the lobby of the game or open the client. It simply shows the connection status between players.

There are 2 independent layers:

  • What the telemetry server collects
  • What the debug UI actually shows

The UI does not and probably will not show any IP addresses, even though as a participant of the game you are connected to all the players, so you can see their IPs anyway 😉
Even the telemetry server does not collect anything sensitive apart from user ids and user names so far, as it is a drop-in replacement for the current ICE adapter debug window.

Yeah, although you can already find the IPs if you look in the right places, putting it all in one place would make it easier for someone to find them if they wanted to.
As Sheikah said, only show the status of the connection.

So my personal recommendation would be to keep the IPs hidden.

I like the idea of the telemetry server as well.

My Kotlin skills are a little rusty, but if I get free time from work I will check the GitHub and see what's going on.
As for the ICE adapter itself, my Java is not terrible, but I will leave that alone for now rather than interfere.

Progress Report

✔ Release the Java17 ICE adapter to the world.

The Java 17 transition is done and has already shipped with client pre-releases. So far we have heard no complaints, so it will come with the client update at the end of the month.

✔ Finish the basic telemetry server and ice adapter logic and ship it for testing (keep the old debug ui for comparison)

The ICE telemetry server is actually running in production right now in a very basic version. It persists no state and cleans up a game after the last player has left. So for now it only replaces the functionality of the debug UI window.
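The "clean up after the last player has left" behaviour can be sketched as a small in-memory registry. This is not the telemetry server's actual code (which is written in Kotlin); the class name GameSessions and its methods are made up for this example. The sketch only keeps a set of players per game and drops the game as soon as that set becomes empty.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory registry: a game exists only while it has players. */
final class GameSessions {
    private final Map<Integer, Set<Integer>> playersByGame = new ConcurrentHashMap<>();

    /** Adds a player, creating the game entry on first join. */
    void join(int gameId, int playerId) {
        playersByGame.compute(gameId, (id, players) -> {
            Set<Integer> updated =
                players == null ? ConcurrentHashMap.<Integer>newKeySet() : players;
            updated.add(playerId);
            return updated;
        });
    }

    /** Removes a player; the game entry disappears with its last player. */
    void leave(int gameId, int playerId) {
        playersByGame.computeIfPresent(gameId, (id, players) -> {
            players.remove(playerId);
            // Returning null from computeIfPresent removes the mapping.
            return players.isEmpty() ? null : players;
        });
    }

    boolean isTracked(int gameId) {
        return playersByGame.containsKey(gameId);
    }

    public static void main(String[] args) {
        GameSessions sessions = new GameSessions();
        sessions.join(1, 10);
        sessions.leave(1, 10);
        System.out.println("game 1 still tracked: " + sessions.isTracked(1));
    }
}
```

Doing both updates inside compute / computeIfPresent keeps join and leave atomic per game, so a late join cannot end up in a set that was just discarded.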

So here is how to trigger it (once released): Click on the tray icon and open the web ui.
c29d2f65-776a-4e53-946b-ed1e75a921df-image.png

And then you end up on the current ui:
accd1083-c58a-4313-98a6-84a612c8687b-image.png

Unfortunately not all features are supported yet (such as the Coturn region or RTT), and some of these require more complex changes in combination with the clients.

We will release the first version with the client pre-release starting in November.

Awesome work on this Brutus! Takes a brave man to dive into legacy code.

Works like a charm

cf048a1c-c808-4264-bae1-9269b38b8ba9-image.png

Is this available?

faf.mabula.net maintainer.

7xlbo3.jpg

Again?

The previous non-Java versions (C++ and Node.js, I think) were never fully working, so I wouldn't say that counts as a rewrite.