How long should FAF keep old replays?
-
arma for Archive Councilor 2021
-
Before deleting any replays, consider the following:
- New format for .fafreplay: a compressed archive of the scfareplay file and the JSON data, without the final base64 encoding.
  Comparison of size:
  scfareplay: 1222 KB
  fafreplay-current: 260 KB (21.2% of original size)
  fafreplay-zip: 196 KB (16.0% of original size)
  fafreplay-7z: 141 KB (11.5% of original size)
- Publish the replay vault (maybe on IPFS?) so people can mirror / clone the replays for historic reasons.
- Create a service that collects local replays from players, to be able to repair currently broken / missing ones in the vault.
EDIT:
Did some more testing and the compression numbers were a bit messed up earlier. I've updated the post now with my results.
-
+1 on the service.
You could use some sort of cheaper object storage like AWS Glacier or Azure Archive as a place to dump it, plus a cron job to just upload xGB every, plus some sort of proxy API to pull the data when requested (probably almost never).
EDIT: Looks like Wasabi could be relatively cheap: https://wasabi.com/
-
I'm in favor of deleting them after 3 - 5 years. I don't value old replays as much because they were played with an older version of the game. Very good replays will most likely have been cast anyway. Also, if you want to watch old GPGNet replays: https://www.gamereplays.org/supremecommanderforgedalliance/replays.php?game=44
-
Hi guys, thanks for your feedback so far.
Fortunately we finalized some replay work that was mostly done already, so we'll compress replays with Zstandard instead of zip encoded with base64. Support will come in the next client release, and we'll need to find a way to convert all the old replays. This will be a big breaking change for some tools out there, but it's the best chance we have.
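To give a rough feel for what the base64 layer alone costs, here is a minimal Python sketch. It assumes nothing about the real replay layout: `zlib` from the standard library stands in for the actual compressor (Zstandard needs a third-party package), and the sample data and function names are made up for illustration.

```python
import base64
import os
import zlib


def text_encoded_body(stream: bytes) -> bytes:
    """Compress, then base64-encode (stand-in for the old layout)."""
    return base64.b64encode(zlib.compress(stream))


def binary_body(stream: bytes) -> bytes:
    """Compress only, with no text encoding on top."""
    return zlib.compress(stream)


# Synthetic data: half random, half zeros, purely for illustration.
sample = os.urandom(50_000) + bytes(50_000)

encoded_size = len(text_encoded_body(sample))
binary_size = len(binary_body(sample))
overhead = encoded_size / binary_size

print(f"binary: {binary_size} B, base64: {encoded_size} B, ratio: {overhead:.3f}")
```

Base64 stores 3 bytes in 4 characters, so the encoded body comes out about a third larger regardless of which compressor sits underneath.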
Some words on the alternative suggested here:
While renting storage in the cloud from a cheap provider is of course possible, it massively increases development and (more importantly) maintenance effort, which I'm not willing to take on. Storage in our Hetzner datacenter would be possible, though, as it can be natively integrated into the existing server infrastructure. Nevertheless, we still strive for a cost-neutral solution if possible.
Eventually this topic will reappear in a few years, so the discussion is not off the table forever.
We'll see which storage savings we can achieve with the current solution.
-
Will this break the replay parser?
-
@FemtoZetta said in How long should FAF keep old replays?:
Will this break the replay parser?
Perhaps, but it should not be difficult to find a Java library that can decompress the zstd format and add it to the parser itself. If anyone is actively maintaining the parser, it should be an easy fix.
Even if there's no way to update an existing tool, it should be relatively easy to decompress the zstd files "by hand" and re-compress them in the old fafreplay file format in order to hand them off to a software tool.
Or someone could even make a software tool to convert replay files from the new (zstd) format into the old (zip) format. Automating the process of converting files from a more-efficient format to a less-efficient format so that old tools can use them is kind of silly, but it would solve the immediate problem of tools being broken.
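A rough sketch of what such a converter could look like in Python. Everything about the file layout here is an assumption for illustration, not something confirmed in this thread: one JSON header line, a newline, then the body; old body = base64 over zlib data; new body = raw compressed bytes. The names `split_replay` and `to_legacy` are hypothetical, and the decompressor for the new format is passed in as a function so the sketch does not depend on any particular Zstandard binding.

```python
import base64
import json
import zlib
from typing import Callable, Tuple


def split_replay(blob: bytes) -> Tuple[dict, bytes]:
    """Split a replay file into JSON header and body (assumed layout)."""
    header_line, _, body = blob.partition(b"\n")
    return json.loads(header_line), body


def to_legacy(blob: bytes, decompress: Callable[[bytes], bytes]) -> bytes:
    """Re-encode a new-format file as header line + base64(zlib(stream)).

    `decompress` must turn the new-format body back into the raw stream,
    e.g. a decompress function from a Zstandard binding.
    """
    header, body = split_replay(blob)
    stream = decompress(body)
    legacy_body = base64.b64encode(zlib.compress(stream))
    return json.dumps(header).encode("utf-8") + b"\n" + legacy_body
```

A real tool would also have to match the exact framing the old tools expect and adjust any header fields that describe the compression, neither of which is covered here.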
-
What is "the replay parser"?
-
Yes, it will break for the new format. Someone will have to update it, somewhere around line 430 in page source. This diff should help with that.
-
Oh please keep all replays. Can we throw money at buying more storage?
-
Imo, no need to keep replays older than 3 years no matter what they are. If they were good, they have already been cast or whatever.
Even then, I wouldn't really keep replays older than 2 years unless they are from ranked games. It just seems pointless to keep so much when it's useless.
-
I would suggest keeping only the special old games. But I think there are no tags like "Tournament", and I also think that no one will sort out their favorite replays. I guess you could just save the replays of certain players, purely for the memories?
But in general I do not see any point in these replays: the game keeps evolving, and the old tactics and methods cannot be applied to the current realities.
I am in favor of removing all replays since the last old patch.
And further filtering them by some parameters:
- Removing replays with one player
- Removing sandbox replays
- Removing campaign replays
- Removing "blacklisted" maps (gap, astro, etc)
- Removing replays with desync
- Removing replays after (optionally) 4/8/12 months
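The criteria above could be expressed as a single filter. This is only a sketch: the `ReplayInfo` fields, the `should_delete` name, and the map-name matching are all hypothetical and not taken from the actual FAF database schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ReplayInfo:
    # Hypothetical fields; the real replay metadata may look different.
    player_count: int
    is_sandbox: bool
    is_campaign: bool
    map_name: str
    desynced: bool
    played_at: datetime


# Placeholder keywords taken from the list above.
BLACKLISTED_MAP_KEYWORDS = ("gap", "astro")


def should_delete(replay: ReplayInfo, now: datetime,
                  max_age: timedelta = timedelta(days=365)) -> bool:
    """Return True if the replay matches any of the removal criteria."""
    name = replay.map_name.lower()
    return (
        replay.player_count <= 1
        or replay.is_sandbox
        or replay.is_campaign
        # Naive substring match; a real list would use exact map ids.
        or any(keyword in name for keyword in BLACKLISTED_MAP_KEYWORDS)
        or replay.desynced
        or now - replay.played_at > max_age
    )
```

The 4/8/12 month option maps onto `max_age`, e.g. `timedelta(days=120)` for roughly four months.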
-
@ZLO I've been searching for this parser for ages! The old link on google 404s
edit: still seems to be broken for me
-
@Cascade You should try to open it with a VPN. Works fine for me without one.
-
Hi,
Bitbucket deleted my old Mercurial repos (https://bitbucket.org/blog/sunsetting-mercurial-support-in-bitbucket).
I moved what I could save to GitHub. Please update your links to: https://fafafaf.github.io/
Btw, at some point the CORS header got removed from https://content.faforever.com/faf/vault/replay_vault/replay.php, hence it can't load the files directly by replay id.
-
@PattogoTehen We changed a lot of URLs, but I couldn't make a pull request against Bitbucket as it was gone. I'll do one against your GitHub repository.
The CORS header issue is something we raised as a bug ticket for our reverse proxy; it might vanish in the near future (or already has, if you use replay.faforever.com, not sure).
-
I can offer to archive all replays and make them accessible to others as well. Size and traffic do not matter for my Google Drive.
edit: Others can make backups as well
-
Is it possible to create criteria by which you could keep some replays for longer? I.e. if it was a game with stronger players, it gets kept longer... Or if it was a game with a certain number of downloads... Or 5-star reviews...
That might be a cool thing. It would be a decent way to raise the overall value of the replay library.
-
So, due to the server disk running full and causing an outage, we prioritized recompressing all replays to Zstandard without base64 encoding. This reduced the total replay size from 677 GB to 406 GB (-40%).
With the next server update we'll also activate the new compression for the replay server, so the growth from new replays will also decrease by 40%.
This was actually much more than we expected.
We also found a lot of broken replays, i.e. corrupt files that need to be cleaned up.
As of now the server has 280 GB of usable disk space free, which should suffice for another 3-5 years.
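For anyone who wants to check the numbers, a quick back-of-the-envelope calculation in Python. The growth figures are only what the 3-5 year estimate would imply if growth were linear, which is an assumption on top of the post, and the variable names are made up.

```python
before_gb = 677  # total replay size before recompression
after_gb = 406   # total replay size after recompression
free_gb = 280    # usable free disk space

saving = (before_gb - after_gb) / before_gb
print(f"saving: {saving:.1%}")  # saving: 40.0%

# Yearly growth that would use up the free space in the given time,
# assuming linear growth and ignoring the cleanup of corrupt files.
for years in (3, 5):
    print(f"{years} years: about {free_gb / years:.0f} GB per year")
```

So the estimate corresponds to roughly 56 to 93 GB of new replay data per year.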