Better late than never.
With our server update 10 days ago (3rd November) we moved the last piece of the FAF production server to Kubernetes. We did hit some hiccups during the update, but they were unrelated to the migration itself. From now on, nothing runs on our faf-stack anymore.
Looking back into our experimental repositories, the first Kubernetes-related commit appeared on the 1st December 2021 (and work actually started before that). That makes 3 years! 3 astonishing years of exploring, failing and succeeding. And while we were exploring options, the k8s world came up with new solutions too, so we had to re-evaluate a lot.
Time to look at what is in it for us (admins), for developers and of course for you (the players).
The admin benefits
No more logging in to the server to run updates.
Application state synchronized with git using ArgoCD
Zero-downtime updates (except for core components such as databases)
Auto-updating capabilities on the git repository
Developer benefits
For maintainers: New releases are automatically picked up as pull requests for the gitops-stack
Look up the exact app configuration in git and change it via pull request
Player benefits
Less downtime when deploying certain parts of the infrastructure.
Less work for the admins means more time for them to focus on other issues 🙂
Downsides and dark spots
Well, unfortunately the change brought some downsides that we will need to improve on in the future:
Higher entry barrier for developers compared to faf-stack. All documentation still points to "just run a few services in Docker", but that approach is no longer maintained. We are working on alternative solutions with other tools (tilt.dev).
The faf-stack was "simple enough" to launch at least some parts of it (website, api) inside our CI pipeline and run some basic test cases like "Does registration work?". So far we have not been able to reproduce these tests on the new setup, but it is basically the same problem the developers face.
Shortcuts taken: In order to finish faster, we committed ourselves to a single-cluster setup that once again stores all data on volumes mounted directly from the host. We had the same setup on Docker and it works well. However, it removes some flexibility. For now, there is no urgent need to change this.
In the end we can say that our journey has not ended. But the biggest milestone lies behind us, and a bright future lies ahead 😄