How long should FAF keep old replays?

If people wish to watch AI games, it would be recent AI games. I never see anyone talking about watching AI games (even viking) but even if those people exist, it’s why I included the timeframe criteria. I sincerely don’t think the replays I mentioned would upset anyone if they were deleted.

I also don’t see the need for a poll here. We should decide what quantity of free space is “good enough” and chisel down until we meet it.

If there's a way to automatically tell if a replay desyncs then I'd be in favor of getting rid of those, and of replays with only a single person in them, as others have said. I don't think we should get rid of any valid replays outside of those criteria if we can help it. I'm also very curious about what it would take to increase the server's disk space; it seems like that should be straightforward, assuming there's money to do so.

The only people who watch AI games are AI devs, as they often ask for replays when something happened in the game, to confirm the issue or otherwise. Tbf, for modded games, the only folks I know who care to watch old mod replays are the modders, and even then it's basically the same situation: "send replay". (Well, I personally search for SCTA a couple of times a week to see if any SCTA games were played, but I'm weird.)

I’m a shitty 1k Global. Any balance or gameplay suggestions should be understood or taken as such.

Project Head and current Owner/Manager of SCTA Project

I would do it differently for different categories. For example:

  • Remove all replays less than 5 min long and older than 2 months (in case something a moderator needs to look at happened and the report is being worked on, I just assumed 2 months max)

  • Remove single player games after 1 month; if it's interesting the player should just save it themselves. Really no added value in keeping them for long.

  • Keep all rated and watchable (no desyncs) games forever, maybe put them on a different server just for storage after 5 years or so (so you can still download them if you need to, but they don't take up space on the main server).

There are probably a few more categories that I didn't think of that would make sense.
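
To make the categories above concrete, here is a minimal sketch of how such a policy could be expressed. The `Replay` fields, the `retention_action` name, and the rule order (delete rules checked before the archive rule) are all assumptions made for illustration; nothing here reflects how the FAF server actually stores replay metadata.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Replay:
    # Hypothetical metadata; the real replay records may look different.
    age: timedelta
    duration: timedelta
    player_count: int
    rated: bool
    desynced: bool


def retention_action(replay: Replay) -> str:
    """Return "delete", "archive" or "keep" for one replay."""
    # Very short games: kept 2 months in case a moderator report is pending.
    if replay.duration < timedelta(minutes=5) and replay.age > timedelta(days=60):
        return "delete"
    # Single player games: removed after 1 month.
    if replay.player_count == 1 and replay.age > timedelta(days=30):
        return "delete"
    # Rated, watchable games: moved to separate storage after about 5 years.
    if replay.rated and not replay.desynced and replay.age > timedelta(days=5 * 365):
        return "archive"
    return "keep"
```

Because the categories overlap (a rated game can also be shorter than 5 minutes), the order of the checks matters and would need to be agreed on before anything is deleted.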

AI games can be things like team survival games. Some of those games are posted to YouTube. People care enough to put the videos online, that means people care about the games. Maybe very few people, other than the players themselves, care about solo games involving AI. But you can't say as a general rule that nobody cares about AI games.

The cost of making a backup of the games before deleting them from the server is minimal. What even is the cost of 300GB of cloud storage? Just save them somewhere. Maybe in 5 years the cost of storage will be so small that we can restore all the replays to the server at basically no cost. Or some other way to make the replays available. There is no good reason to delete history even if most of the history is crap. Once it is deleted we can never get it back.

Do we even have a complete backup of all replay files? Just load them into a single 1TB drive and send it to a FAFer's house. 1-2 copies like this would ensure that history is not lost.

Cloud storage for our purposes is around 5€ per month per 100GB. Previous calculation said we have a growth of about 20GB per month, so the costs are constantly increasing. Just keeping everything forever is not a viable option.
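
As a rough illustration of why the bill keeps growing, the arithmetic can be sketched like this. The price (5 € per 100 GB per month) and the growth (20 GB per month) are the figures quoted above; the 300 GB starting size is only an assumption borrowed from the question earlier in the thread, and the function names are made up for this example.

```python
def monthly_cost_eur(size_gb: float, eur_per_100gb: float = 5.0) -> float:
    """Monthly storage bill for a given vault size."""
    return size_gb * eur_per_100gb / 100


def projected_monthly_costs(start_gb: float, growth_gb: float, months: int) -> list[float]:
    """Monthly bill for month 0 .. months, with linear growth of the vault."""
    return [monthly_cost_eur(start_gb + growth_gb * m) for m in range(months + 1)]


# Assumed starting point: 300 GB today, growing by 20 GB every month.
costs = projected_monthly_costs(start_gb=300, growth_gb=20, months=60)
total_first_5_years = sum(costs[:60])
```

With those assumed inputs the monthly bill rises from 15 € now to 27 € after one year and 75 € after five, so the cumulative cost grows quadratically rather than linearly.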

Also running it on a private copy is essentially the same as deleting for 99% of the users who just see "ah, replay not available, well bad luck".

"Nerds have a really complicated relationship with change: Change is awesome when WE'RE the ones doing it. As soon as change is coming from outside of us it becomes untrustworthy and it threatens what we think of is the familiar."
– Benno Rice

Would it be possible to look up the storage savings from the various proposed solutions?

@Brutus5000 I wouldn't mind volunteering to be personally responsible for responding to people who want really old replay files for some reason. We wouldn't even need cloud storage, I would just keep a single 1TB or 2TB drive with zero ongoing costs to FAF. If the files were transmitted to me electronically it would be close to zero cost to FAF. That situation could then continue essentially indefinitely.

I doubt there is very much interest by players in those old replays, so we're probably talking about 1 request per month. Even if I have to spend 15 minutes a day responding to those old requests, I wouldn't mind.

Arma for The Giver 2021

arma for Archive Councilor 2021

Before deleting any replays, consider the following:

  • New format for .fafreplay: a compressed archive of the scfareplay file and the JSON data, without the final base64 encoding
Comparison of size:
scfareplay: 1222 KB
fafreplay-current: 260 KB (21.2% of original size)
fafreplay-zip: 196 KB (16.0% of original size)
fafreplay-7z: 141 KB (11.5% of original size)
  • Publish the replay vault (maybe IPFS?) so people can mirror / clone the replays for historic reasons.

  • Create a service that collects local replays from players to be able to repair current broken / missing ones in vault

EDIT:
Did some more testing and the compression numbers were a bit messed up earlier. I've updated the post now with my results.
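
For anyone who wants to repeat this kind of comparison, a minimal sketch using only the Python standard library is below. It is not the tool used for the numbers above: `zlib` and `lzma` are raw-stream stand-ins for the zip and 7z containers, the sample data is synthetic, and `packed_sizes` is a made-up name.

```python
import base64
import lzma
import zlib


def packed_sizes(replay_bytes: bytes) -> dict[str, int]:
    """Size in bytes of the same data under several packing schemes."""
    deflated = zlib.compress(replay_bytes, 9)
    return {
        "raw": len(replay_bytes),
        # Deflate output wrapped in base64, as in the current format.
        "zlib+base64": len(base64.b64encode(deflated)),
        # Same deflate output stored as plain binary.
        "zlib": len(deflated),
        # LZMA, the algorithm family used by 7z.
        "lzma": len(lzma.compress(replay_bytes, preset=9)),
    }


# Synthetic, highly repetitive stand-in for a replay command stream.
sample = b"".join(b"cmd %d move unit\n" % (i % 50) for i in range(20000))
sizes = packed_sizes(sample)
```

Base64 turns every 3 bytes into 4 characters, so dropping it alone saves about a quarter of the file, which is roughly consistent with the 260 KB versus 196 KB figures above.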

+1 on the service,
you could use some sort of cheaper object storage like AWS Glacier or Azure Archive as a place to dump it + a cron job to just upload xGB every + some sort of proxy API to pull the data when requested (probably almost never).

EDIT: looks like wasabi could be relatively cheap https://wasabi.com/

I'm in favor of deleting them after 3 to 5 years. I don't value old replays as much because they were played with an older version of the game. Very good replays will most likely be casted. Also, if you want to watch old gpgnet replays 🙂 https://www.gamereplays.org/supremecommanderforgedalliance/replays.php?game=44

Hi guys, thanks for your feedback so far.

Fortunately we finalized some work on replay stuff that was mostly done already, so we'll compress replays with zstandard instead of zip encoded with base64. Support will come in the next client release and we'll need to find a way to convert all the old replays. This will be a big breaking change for some tools out there, but it's the best chance we have.

Some words on the alternative suggested here:
While renting storage in the cloud with some cheap providers is of course possible, it massively increases development and (more importantly) maintenance effort, which I'm not willing to take on. Storage in our Hetzner datacenter would be possible though, as it can be natively integrated into the existing server infrastructure. Nevertheless we still strive for a cost-neutral solution if possible.

Eventually this topic will reappear in a few years again, so the discussion is not off the table forever.
We'll see which storage savings we can achieve with the current solution.

Will this break the replay parser?

@FemtoZetta said in How long should FAF keep old replays?:

Will this break the replay parser?

Perhaps, but it should not be difficult to find a Java library that can decompress the zstd file format and add that to the parser itself. If anyone is actively maintaining the parser, it should be an easy fix.

Even if there's no way to update an existing tool, it should be relatively easy to decompress the zstd files "by hand" and re-compress them in the old fafreplay file format in order to hand them off to a software tool.

OR someone could even make a software tool to convert replay files in the new (zstd) format into the old (zip) format. Automating the process of converting files from a more-efficient format to a less-efficient format so that old tools can use them is kind of silly, but it would solve the immediate problem of tools being broken.
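
A tool that has to cope with both old and new files could start by checking which compression a payload uses. The sketch below is only an illustration in Python, not the parser's actual code: it relies on the published magic number of zstd frames and on the zlib header check from RFC 1950. How the binary payload is extracted from a .fafreplay file (including undoing any base64 step) is not shown, `detect_compression` is a hypothetical name, and actually decompressing zstd would need a separate library.

```python
import zlib

# Every zstd frame starts with these four bytes (0xFD2FB528, little-endian).
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def detect_compression(payload: bytes) -> str:
    """Guess the compression of an already-extracted binary payload."""
    if payload.startswith(ZSTD_MAGIC):
        return "zstd"
    # zlib header: low nibble of the first byte is 8 (deflate) and the
    # first two bytes, read as a big-endian number, are a multiple of 31.
    if (
        len(payload) >= 2
        and payload[0] & 0x0F == 8
        and int.from_bytes(payload[:2], "big") % 31 == 0
    ):
        return "zlib"
    return "unknown"


def decompress_legacy(payload: bytes) -> bytes:
    """Decompress a zlib payload; zstd is deliberately left to the caller."""
    if detect_compression(payload) != "zlib":
        raise ValueError("not a zlib payload")
    return zlib.decompress(payload)
```

A converter from the new format to the old one would follow the same shape: detect, decompress with the matching library, then re-compress in whichever format the older tool expects.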

What is "the replay parser"?

Yes, it will break for the new format. Someone will have to update it, somewhere around line 430 in page source. This diff should help with that.

Oh please keep all replays. Can we throw money at buying more storage?