FAF/SCFA Replay Parser Library

For the web app written by @PattogoTehen see https://fafafaf.github.io

Intro

Some time ago I started working on a replay parser library for FAF/SCFA replays, and now it's gotten to the point where I think it deserves a forum post.

This project started as my own rewrite of a Parser library written by @dragonite and quickly evolved to include a more versatile replay inspection tool. I think this tool can be useful for many people in the FAF community who want to understand the replay format and what information is/isn't available in a replay. I hope that by sharing it, we can figure out the last few unknowns about the replay format, and that people will be empowered to explore new ways of extracting useful data from FAF replays. I have been impressed by the work that @teolicy has done on this front so far (hopefully he will make a post about his work soon).

So what the heck does this tool do? Read on to find out...

Command Line Interface

The tool currently has 3 functions:

  1. Unpack compressed fafreplay files into scfareplay files.
  2. Show basic info about the replay. Mostly constrained to the replay header.
  3. Show the replay command stream.

Unpack

I expect this to be the most widely useful feature of the tool, as a lot of other software that can parse replays only knows about the scfareplay format. The tool can handle the newest version of the fafreplay format as of release 0.5.0. Unpacking is very simple:

$ fafreplay unpack 9000556.fafreplay
Extracting...
Done
Writing 9000556.scfareplay
Wrote 12314246 bytes

Note that you do not need to unpack replays before using them with the other features of this tool. Files ending in .fafreplay will be converted in memory automatically.

Info

The info subcommand displays the sort of information you would expect to see on the vault page, for instance: the name of the map, the names of the players and which mods were used. It can also go into more detail when certain flags are enabled.

$ fafreplay info 14210327.fafreplay 
processing replay: 14210327.fafreplay

14210327.fafreplay
Supreme Commander v1.50.3719 Replay v1.9

Operation Trident (00:25:40)
    Operation Trident

Mods
    Resource Rich v1

Team 1
    Civilians (AI) UEF
    Melanol (0) Cybran

With additional info enabled:

$ fafreplay info --mods --options 14210327.fafreplay
processing replay: 14210327.fafreplay

14210327.fafreplay
Supreme Commander v1.50.3719 Replay v1.9

Operation Trident (00:25:40)
    Operation Trident

Options
    Victory: sandbox
    Unit Cap: 1000
    Share: FullShare
    CheatsEnabled: false

Mods
    Resource Rich v1
        Increases resource production. (X2)
        Author: Gas Powered Games
        Copyright: Copyright © 2006, Gas Powered Games
        UID: 74A9EAB2-E851-11DB-A1F1-F2C755D89593
        URL: http://www.gaspoweredgames.com
        Location: /mods/resourcerich

Team 1
    Civilians (AI) UEF
    Melanol (0) Cybran

Note that this subcommand doesn't show every bit of information available in the replay header. It aims to provide more of a summary of the most relevant information.

Command stream

Here is where things get really interesting, although it requires a lot more work to extract useful information. The commands subcommand lets you explore the meat of the replay file.

By default only a small subset of commands are parsed as these are the most useful:

$ fafreplay commands 9000556.scfareplay --limit 10
processing replay: 9000556.scfareplay
Supreme Commander v1.50.3701 Replay v1.9

├── SetCommandSource { id: 0 }
├── VerifyChecksum { digest: a8377a57463c1191e0ae3447028f6d02, tick: 0 }
├── Advance { ticks: 1 }
├── Advance { ticks: 1 }
├── Advance { ticks: 1 }
├── Advance { ticks: 1 }
├── Advance { ticks: 1 }
├── Advance { ticks: 1 }
├── Advance { ticks: 1 }
├── Advance { ticks: 1 }

Total commands parsed: 10

However, you can select exactly the command types you are interested in with the --commands flag. For a list of all available commands run:

$ fafreplay commands --help

There are also a few options for printing additional information, such as the offset of the command in the file (useful when paired with a hex editor) and the in-game time at which the command happened.

For example to select only two specific commands and print additional info:

$ fafreplay commands 9000556.scfareplay --time --offset --commands IssueCommand ProcessInfoPair --limit 1000
processing replay: 9000556.scfareplay
Supreme Commander v1.50.3701 Replay v1.9

Time option used without including Advance commands, Advance will be parsed implicitly

0x000042c0 00:00:00 ├── ProcessInfoPair { unit: 0, arg1: "CustomName", arg2: "king_shrike" }
0x0000cc6a 00:00:06 ├── ProcessInfoPair { unit: 1, arg1: "SetFireState", arg2: "HoldGround" }
0x0000cc89 00:00:06 ├── ProcessInfoPair { unit: 0, arg1: "SetFireState", arg2: "HoldGround" }
0x0005e804 00:00:47 ├── IssueCommand(GameCommand { entity_ids: [0], id: 0, coordinated_attack_cmd_id: -1, type: BuildMobile, arg2: -1, target: Position { x: 667.5, y: 18.679688, z: 357.5 }, arg3: 0, formation: None, blueprint: "urb1103", arg4: 0, arg5: 1, arg6: 1, upgrades: Nil, clear_queue: None })
0x00062acb 00:00:49 ├── ProcessInfoPair { unit: 2, arg1: "SetFireState", arg2: "HoldGround" }

Total commands parsed: 1000

Keep in mind that determining the in-game time requires parsing Advance commands, so using the time option will force these to be parsed (but not displayed).
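To illustrate why the time option needs Advance commands, here is a minimal Python sketch of the bookkeeping involved. It assumes one tick is 100 ms (the same factor used in the README example in the Python Bindings section) and works on a made-up list of (name, payload) tuples, not on the parser's real data structures.

```python
from datetime import timedelta

TICK_MS = 100  # assumption: one sim tick is 100 ms, as in the README example


def timestamp_commands(commands):
    """Yield (game_time, name, payload) for every non-Advance command.

    `commands` is a hypothetical iterable of (name, payload) tuples; the
    real parser uses its own types.
    """
    tick = 0
    for name, payload in commands:
        if name == "Advance":
            # Advance is the only command that moves the game clock forward,
            # so it has to be read even if it is never displayed.
            tick += payload["ticks"]
            continue
        yield timedelta(milliseconds=tick * TICK_MS), name, payload


# Hypothetical command stream, just for illustration
stream = [
    ("SetCommandSource", {"id": 0}),
    ("Advance", {"ticks": 60}),
    ("ProcessInfoPair", {"unit": 1}),
]
for game_time, name, payload in timestamp_commands(stream):
    print(game_time, name, payload)
```

Without the Advance entries every command would appear to happen at time zero, which is why they get parsed implicitly.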

Some commands will have a number of fields called arg1, arg2, arg3, etc. This usually means I haven't figured out what they do or don't know a better name for them.
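If you want to poke at those unknown fields without opening a hex editor, the offsets printed by --offset can be fed into a small hex dump helper. The following is only a rough Python sketch; the function name hexdump and its parameters are made up for this example and are not part of the tool.

```python
def hexdump(data: bytes, offset: int, length: int = 32, width: int = 16) -> str:
    """Return a hex dump of `length` bytes of `data` starting at `offset`."""
    pad = width * 3 - 1  # width of the hex column for a full row
    lines = []
    chunk = data[offset:offset + length]
    for i in range(0, len(chunk), width):
        row = chunk[i:i + width]
        hex_part = " ".join(f"{b:02x}" for b in row)
        # Show printable ASCII as-is and everything else as a dot
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        lines.append(f"0x{offset + i:08x}  {hex_part:<{pad}}  {text}")
    return "\n".join(lines)


# With a real file (name and offset taken from the example output above):
# with open("9000556.scfareplay", "rb") as f:
#     print(hexdump(f.read(), 0x000042c0))

# Small self-contained demo on made-up bytes
print(hexdump(b"\x00\x01CustomName\x00king_shrike\x00", 2, 16))
```

Note that the offsets refer to the unpacked scfareplay data, so a fafreplay file needs to be unpacked first.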

If anyone finds a replay containing a SetCommandCells command please send it my way as thus far I have been unable to locate any occurrences of this command.

Installation

I have compiled some binaries for Linux and Windows and hosted them on my server. If you don't feel comfortable downloading random exe's from the internet you can install pretty easily from source.

Platform   Download                     SHA256 hash
Linux      Download for Linux 64bit     a612360d15d5590987e9505ab6b3068181dd33024d0ef3edcc6ec13e0b8a7acf
Windows    Download for Windows 64bit   5906646bc8c5796599747826797434713fceb56136e76f21178ebac781331f12
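To check a downloaded binary against the hashes listed above, something like the following Python sketch should do. It only uses the standard library; the function names and the file path in the usage comment are placeholders.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (placeholder path, hash copied from the list above):
# matches("path/to/downloaded/binary",
#         "a612360d15d5590987e9505ab6b3068181dd33024d0ef3edcc6ec13e0b8a7acf")
```

If matches returns False, the file is either corrupted or not the same build the hash was generated from.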

There are also Linux binaries built on each tagged release, but they are a little hard to track down in GitLab CI. Version 0.5.1 for Linux can be downloaded here.

From Source

The parser is written in Rust so you will need to use cargo to compile it. If you have cargo installed, you can run:

$ cargo install --git https://gitlab.com/Askaholic/faf-replay-parser.git --features=cli

You can also install the latest version published on crates.io with

$ cargo install faf-replay-parser --features=cli

but it may be less up to date than installing directly from git. I would recommend starting with the first command, and if something turns out to be broken, try the second command.

Note that you need to make sure that the ~/.cargo/bin directory is in your path to be able to run the tool once it's compiled.

Python Bindings

The parser is available as a python package for convenience. The bindings are written in Rust using pyo3.

Here's an example of using the bindings, copied over from the README with comments removed for brevity:

from datetime import timedelta
from fafreplay import Parser, commands

parser = Parser(
    commands=[
        commands.Advance,           # For the tick counter
        commands.VerifyChecksum,    # For desync detection
    ],
    save_commands=False,
)

with open("12345.scfareplay", "rb") as f:
    data = f.read()

replay = parser.parse(data)

print("Game time:", timedelta(milliseconds=replay["body"]["sim"]["tick"]*100))
if replay["body"]["sim"]["desync_ticks"]:
    print("Replay desynced!")
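The same pattern extends to a whole folder of replays. Below is a minimal sketch that relies only on what the example above shows: a parser object with a parse(bytes) method and the replay["body"]["sim"]["tick"] field. The helper name replay_durations is made up for this example.

```python
from datetime import timedelta
from pathlib import Path


def replay_durations(parser, directory):
    """Map each .scfareplay file name in `directory` to its game length.

    `parser` is expected to behave like the Parser configured above:
    parser.parse(data) returns a dict with replay["body"]["sim"]["tick"].
    """
    durations = {}
    for path in sorted(Path(directory).glob("*.scfareplay")):
        replay = parser.parse(path.read_bytes())
        ticks = replay["body"]["sim"]["tick"]
        # One tick is 100 ms, the same factor as in the example above
        durations[path.name] = timedelta(milliseconds=ticks * 100)
    return durations


# Usage, assuming `parser` was created as in the example above:
# for name, length in replay_durations(parser, "replays/").items():
#     print(name, length)
```

Only the Advance command is needed for the tick counter, so a parser configured with just that command should be enough here.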

Note that the bindings themselves don't currently provide a function for converting from the fafreplay format to the scfareplay format because I didn't want to redistribute the needed dependencies. However, you can achieve this with the following bit of code (shoutout to @teolicy for porting this over from my Rust code):

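import base64
import json
import zlib

import zstd  # third-party package, needed for version 2 files


with open("somefile.fafreplay", "rb") as f:
    # The first line is a JSON header; everything after it is the replay data
    header = json.loads(f.readline().decode())
    buf = f.read()
    version = header.get("version", 1)

    if version == 1:
        # Version 1: base64-encoded, zlib-compressed data
        decoded = base64.decodebytes(buf)
        decoded = decoded[4:]  # skip the decoded size
        extracted = zlib.decompress(decoded)
    elif version == 2:
        # Version 2: zstd-compressed data
        extracted = zstd.decompress(buf)
    else:
        raise ValueError(f"Unsupported fafreplay version: {version}")

# `extracted` now holds the scfareplay data, ready for parser.parse()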
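For reuse, the same logic could be wrapped in a function. This is just one possible sketch: the name extract_fafreplay is made up, and zstd is imported lazily so that version 1 files can be handled with the standard library alone.

```python
import base64
import json
import zlib


def extract_fafreplay(raw: bytes):
    """Split the contents of a .fafreplay file into (header, scfareplay bytes)."""
    # The first line is a JSON header, the rest is the compressed replay
    header_line, _, body = raw.partition(b"\n")
    header = json.loads(header_line.decode())
    version = header.get("version", 1)

    if version == 1:
        decoded = base64.decodebytes(body)
        return header, zlib.decompress(decoded[4:])  # skip the decoded size
    if version == 2:
        import zstd  # third-party; only needed for version 2 files

        return header, zstd.decompress(body)
    raise ValueError(f"unsupported fafreplay version: {version!r}")


# Usage:
# with open("somefile.fafreplay", "rb") as f:
#     header, data = extract_fafreplay(f.read())
# with open("somefile.scfareplay", "wb") as f:
#     f.write(data)
```

The returned bytes are in the scfareplay format, so they can be handed straight to parser.parse().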

For more information see the project page at https://pypi.org/project/faf-replay-parser/.

Installation

Thanks to cibuildwheel and the fact that GitHub Actions can run Linux, Windows, and even macOS instances, binary wheels for the latest version (0.5.1.post0) are available on pretty much all major platforms, for all CPython versions from 3.6 to 3.9, as well as PyPy 3.6 and 3.7. Install with pip:

$ pip install faf-replay-parser

A source distribution is also available in case you are running something really exotic (or old). You will need to have rustc installed to build the package.

Rust Crate

This is actually the OG project, but it gets mentioned last because I'm guessing the overlap between people who are interested in extracting information from SCFA replays, and people who would rather do so in Rust than Python is quite small. If you happen to be such a person, let me buy you a beer sometime. Also check out this project's crates.io page for documentation.

Final Thoughts

I would be interested to hear what cool things people find in their replays, especially if they are things that seem to cause any errors with the parser. If you run into a panic (other than Broken pipe), let me know how you caused it.

I would also be curious about which commands people can find in their replays, as there are some commands which should theoretically exist, but which I haven't found in any replays that I've parsed. There are also some arguments/bytes which seem to always be the same for a given command.

Furthermore, I wonder if the replay format for Forged Alliance is similar to that of the original SupCom, and if anyone has any vanilla SupCom replays that they could throw at the parser.

Of course I would also be open to some suggestions for improving the parser, ALTHOUGH please only make suggestions once you have a good grasp of the replay format. There is a lot of information that we all wish we could read straight from the replay file, but is actually impossible to obtain without fully simulating the game (which this tool cannot do because I have not re-implemented SupCom).

Project Links

Binary and Rust crate: https://gitlab.com/Askaholic/faf-replay-parser
Python Bindings: https://github.com/Askaholic/faf-replay-parser-python

This thing is good. It parses the tick count real fast, and it's thanks to this thing that we get to preview in-game time for replays. You should adore it.

Really clean. Great work

I updated the development version to clean up the command output a bit.

Old:

├── VerifyChecksum { digest: [168, 55, 122, 87, 70, 60, 17, 145, 224, 174, 52, 71, 2, 143, 109, 2], tick: 0 }
├── IssueCommand(GameCommand { entity_ids: [0], id: 0, coordinated_attack_cmd_id: 4294967295, type_: 8, arg2: -1, target: Position(Position { x: 667.5, y: 18.679688, z: 357.5 }), arg3: 0, formation: None, blueprint: "urb1103", arg4: 0, arg5: 1, arg6: 1, upgrades: Nil, clear_queue: None })
├── LuaSimCallback { func: "SyncValueFromUi", args: Table({Unicode("id"): String("0"), Unicode("Specialization"): String("ALL"), Unicode("AffectName"): String("PowerDamage")}), selection: [] }

New:

├── VerifyChecksum { digest: a8377a57463c1191e0ae3447028f6d02, tick: 0 }
├── IssueCommand(GameCommand { entity_ids: [0], id: 0, coordinated_attack_cmd_id: -1, type: BuildMobile, arg2: -1, target: Position { x: 667.5, y: 18.679688, z: 357.5 }, arg3: 0, formation: None, blueprint: "urb1103", arg4: 0, arg5: 1, arg6: 1, upgrades: Nil, clear_queue: None })
├── LuaSimCallback { func: "SyncValueFromUi", args: {"id": "0", "Specialization": "ALL", "AffectName": "PowerDamage"}, selection: [] }

The commands should be a lot easier to read now. New versions of the pre-compiled binaries are also pushed, link in the OP.
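For anyone comparing old and new output, the digest and the coordinated_attack_cmd_id are the same values displayed differently. This is not the parser's Rust code, just the equivalent conversions sketched in Python:

```python
import struct

# Old output: digest as a list of byte values
digest = bytes([168, 55, 122, 87, 70, 60, 17, 145, 224, 174, 52, 71, 2, 143, 109, 2])
print(digest.hex())  # a8377a57463c1191e0ae3447028f6d02

# Old output: command id as an unsigned 32-bit integer.
# Reinterpreting the same 4 bytes as a signed integer gives -1.
raw_id = 4294967295
(signed_id,) = struct.unpack("<i", struct.pack("<I", raw_id))
print(signed_id)  # -1
```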

I have downloaded a few thousand replays and used your Python library to parse all the chat in them. 12MB of text was parsed for this.

I have quite a big text file for each FAF username. I know it would be better to use userID and I will in future things. Does anyone want this data?

The chat has been filtered to remove "notify" events and also "Units / Mass / Power sent"

Next, I plan to do some kind of node analysis on:

  1. Who plays with whom
  2. Who is associated with which map
  3. The association of maps with ratings

[Image: 20211020_12MB_wordcloud_thousands_recent_games.png]

Mavor most iconic unit confirmed.

Lol "air".

put the xbox units in the game pls u_u

im need mass pls

Could be interesting seeing more replays parsed and data analyzed/presented.

I have tested the Python bindings and they work OK. What kind of data analysis do you want?
Winning faction? Most killed faction? Popular (or unpopular) units?

@meatontable What information can you extract here?

The relationship between building an experimental unit and winning within the next 10 minutes would be good

Sorry for the delay. I'm doing this for fun when I'm free. Of course, detecting winning conditions is a good goal.

Found this thread after ages to be here. Unbelievable. Guys, you're great! 🙂