This article describes a project around the FAF ICE adapter. If you have no clue what it is I'd recommend to read this blog post first.
Some of you might think now: "Dude, why do you have time to blog? We have bigger problems! Fix the gmail issue right now!" Unfortunately my life is very constrained due to my very young children, so sometimes I can work on little projects but can't tackle server side things.
After staying away from it successfully I recently started digging into the inner workings of the ICE adapter. There are plenty of reasons that led to this decision:
- The ICE adapter is a critical part of the infrastructure. But only the original author knows the code base and claimed it to be in a bad shape (which after thorough analysis can agree to). Both are basically unacceptable facts for the long-running health of FAF.
- We tried adding more coturn servers to improve the situation on non-Europe continents, but we're facing some issues that could be best solved in the ICE adapter itself.
- The previous author of the ICE adapter has a serious lack of time to implement features.
- The ICE adapter still relied on Java 8 (the last release of Java where the JavaFX ui libraries where bundled to the Java release), but all other pieces are already on Java 17 (!). Right now it only works in the Java client due to some dirty hacks.
The ICE adapter is a very fragile piece of software as we have learnt with some attempted changes that required a rollback to previous state more than once. The problem here is that even with intensive testing in the past with 20+ users, we still encountered users with issues that never occured during testing. Every computer, every system, every user has a different setup of hardware, operating system (and patch level), configuration, permissions, anti virus, firewall setups, internet providers and so on.
Every single variable can break connectivity and we will never know why.
This led to the point that the fear of breaking something pushed us back from adding potential improvements.
Before I started refactoring I went through the code to gain a better understanding and noticed a few points:
- the release build process still relied on Travis CI which no longer works
- many libraries the ice adapter is built on are outdated
- we forked some libraries, made some changes and then never kept up with the upstream changes
- some code areas reinvent the wheel (e.g. custom argument parsing)
- ice adapter state is shared all over the application in static variables with no encapsulation
- a lot threads are spawned for simple tasks
- a lot of thread synchronization is going on as every thread can (and also does) modify the static (global) state
Almost none of this is related to actual ICE / networking logic. So improving the code here would make maintaining it easier and would also make it easier for future developers to dive into the code without much risk of breaking anything.
First steps and struggles
- First of all I didn't want to continue developing on Java 8, as it reached its end-of-life now and the language itself made some nice progress in the last 6 years. So I migrated to JDK17 which meant also fixing the library situation for the JavaFX ui. Before JavaFX was bundled with the JDK, now it comes as it's own libraries. That has a drawback though: The libraries are platform specific, thus we need to build now a windows and a linux version.
- Handling the platform specific libraries also made me migrate the build pipeline from Travis CI to Github actions (as almost all FAF projects are by now). Now we also have a nice workflow in the Github UI to build a release.
- When trying to integrate the new version into the client I found out about the hacky way how we made the current ICE adapter working with the old Java 8 version despite not having JavaFX on board. Actually the javafx libraries of the client were passed to the ICE adapter. So I could use that too! But that needed a 3rd release without JavaFX libraries inside. This required further changes to the build pipeline (we still need dedicated win/linux versions for non-java clients!).
- When testing the new ICE adapter release I was surprised as I could no longer open the debug window. But it turned out to be broken all along on previous versions. The code to inject the JavaFX libraries into the ICE adapter did not take into account, that the Java classpath separator for multiple files is different on Windows (Semicolon) and Linix (colon). So I actually fixed that, hurray!
- I replaced the custom argument parsing with a well-used library called PicoCLI. This makes reading and adding new command line arguments in code much easier.
The switch to Java 17 is a potential breaking change. Thus the 3 changes above already ended in a new release that will be shipped to you probably with the next client release.
My next attempt was to remove all the static variables and encapsulate the state of the application to avoid a lot of the multi-threaded problems that potential lurk everywhere. However doing this I struggled mainly because the ICE adapter <-> JavaFX usage:
- the ICE adapter has an unusual way of launching it's GUI after the application is already running
- Java UI always needs to run in a separate main thread some weird
- JavaFX doesn't give you a handle on the application window you launch and you can't pass it arguments
Also the UI debug logic leaks into every component and tries to grab data from everywhere. So a good and uncritical refactoring would be to rewrite the UI part...
Slim it down
I already mentioned that we have to consider the non-Java clients. @Eternal is building a new one and there is also still the Python client. For the non-java clients the ICE adapter is a pain for packaging, because they need to ship a Java Runtime (~100mb) + an ICE adapter with UI libraries (~50mb) for a very "tiny" tool.
Eternal recently asked whether it is possible to ship a lighter Java runtime. Unfortunately the current answer is "it's to much effort". Actually the Java ecosystem has acquired features to build native executables from Java application (via the new GraalVM compiler) and this was also extended for JavaFX applications (Gluon Substrate). However with JavaFX this is very complicated and requires a lot of knowledge and experience we don't have.
It would be a more realistic goal if the ICE adapter wouldn't require a Java GUI. As someone who was recently involved with some more Web development I was thinking about shipping a browser-based GUI connected to the ICE adapter via WebSocket.
We need more data
When I was designing a websocket protocol for a ice adapter <-> communication I was struck by an interesting idea.
What if we could use this data stream to track the connectivity state and gather some telemetry? This could give us insight about which connections are used most, which regions struggle the most, or if an update made things worse.
Thus I started working on a Telemetry Service that would be capable of collecting all the data of the ice adapters.
Full game transparency
But the idea started to mutate even further. Why would you want to see only your ICE adapter debug information? Maybe you want to see where a connection issue is happening right now between other players.
Also why would I bundle the UI logic with an ice adapter release, when it could be a centrally deployed web app, that can be updated independent from ice adapter releases!
So in this scenario the ICE adapter sends all of its data to the telemetry server. Players then connect to the telemetry server ui and can see a debug view of all players connected to the game and each other.
This is what I've been working on the last 3 weeks, and it's in a state where we can replace the ui and see the whole game state for all players. But we all know: Pics or didn't happen, so here is a current screenshot of the future ui (with fake data):
A new roadmap
So here is the battle plan for the future:
- Release the Java17 ICE adapter to the world.
- Finish the basic telemetry server and ice adapter logic and ship it for testing (keep the old debug ui for comparison)
- Persist telemetry data into some meaningful KPIs, so we can observe the impact of new ice adapter versions
- Drop the old debug UI and continue refactoring the ICE adapter into a better non-static architecture
- Update the core ICE libraries and see if things improve
- Try building native executables for the ice adapter
Are you interested to join this quite new project? (The telemetry server is really small, written in Kotlin with Micronaut framework). This is your chance to get into FAF development on something with comprehensible complexity! Contact me!