Reverse-Engineering Workflow

Goal

Produce durable, version-safe analysis that first explains high-level control loops and subsystem handoffs, then feeds the function-by-function Rust rewrite and later DLL-based replacement work.

Standard Loop

  1. Confirm the target binary and record its hash.
  2. Refresh the exported baseline artifacts.
  3. Update docs/control-loop-atlas.md with newly grounded loop roots, dispatchers, cadence points, state anchors, or subsystem handoffs.
  4. Analyze in Ghidra.
  5. Cross-check suspicious findings in Rizin or with CLI tools.
  6. Update the function map with names, prototypes, ownership, confidence, and loop-relevant notes.
  7. Commit regenerated exports and notes that would help future sessions.
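
Step 1 can be sketched as a small hashing helper. This is a minimal sketch, not one of the committed tools: the function name sha256_of_file is hypothetical, and SHA-256 is an assumed choice because the workflow does not say which hash to record.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks.

    Hypothetical helper: SHA-256 is an assumption, since the workflow only
    says to record "its hash".
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_of_file(Path("rt3_wineprefix/drive_c/rt3/RT3.exe")) and pasting the result next to the version label keeps later address notes tied to one exact binary.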

Baseline Export

Use the committed helper:

python3 tools/py/collect_pe_artifacts.py \
  rt3_wineprefix/drive_c/rt3/RT3.exe \
  artifacts/exports/rt3-1.06

This export pass is expected to produce:

  • binary-summary.json
  • sections.csv
  • imported-dlls.txt
  • imported-functions.csv
  • interesting-strings.txt
  • subsystem-inventory.md
  • function-map.csv
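
A quick completeness check after the export pass might look like the following. It is a sketch under the assumption that all seven files land directly in the export directory; missing_baseline_files is a hypothetical name and is not part of collect_pe_artifacts.py.

```python
from pathlib import Path

# File names taken from the list above; the flat directory layout is an assumption.
EXPECTED_BASELINE_FILES = (
    "binary-summary.json",
    "sections.csv",
    "imported-dlls.txt",
    "imported-functions.csv",
    "interesting-strings.txt",
    "subsystem-inventory.md",
    "function-map.csv",
)


def missing_baseline_files(export_dir: Path) -> list[str]:
    """Return the expected baseline artifacts that are absent from export_dir."""
    return [
        name
        for name in EXPECTED_BASELINE_FILES
        if not (export_dir / name).is_file()
    ]
```

An empty list from missing_baseline_files(Path("artifacts/exports/rt3-1.06")) means the pass produced everything it is expected to.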

For the startup-init milestone, run the Ghidra headless export as well:

python3 tools/py/export_startup_map.py \
  rt3_wineprefix/drive_c/rt3/RT3.exe \
  artifacts/exports/rt3-1.06

Optional flags:

python3 tools/py/export_startup_map.py \
  rt3_wineprefix/drive_c/rt3/RT3.exe \
  artifacts/exports/rt3-1.06 \
  --depth 2 \
  --root entry:0x005a313b \
  --root bootstrap:0x00484440

This startup pass is expected to add:

  • ghidra-startup-functions.csv
  • startup-call-chain.md

The raw CSV (ghidra-startup-functions.csv) now includes root provenance columns:

  • root_name
  • root_address

Context Export

For branch-deepening passes after the initial root mapping, use the committed context exporter:

python3 tools/py/export_analysis_context.py \
  rt3_wineprefix/drive_c/rt3/RT3.exe \
  artifacts/exports/rt3-1.06 \
  --addr 0x00444dd0 \
  --addr 0x00508730 \
  --addr 0x00508880 \
  --string gpdLabelDB \
  --string gpdCityDB \
  --string 2DLabel.imb \
  --string 2DCity.imb \
  --string "Geographic Labels"

This pass is expected to add:

  • analysis-context-functions.csv
  • analysis-context-strings.csv
  • analysis-context.md

The function CSV captures target function metadata plus caller, callee, and data-ref summaries. The string CSV captures matched strings plus their code or data xrefs. The Markdown report keeps the human-readable disassembly excerpts that are useful for the next naming pass.

Use this exporter to close missing edges in the atlas first, and only then point it at leaf-function refinement.

Branch RE Kit

For deeper branch work after the atlas identifies a narrow unknown, use the CLI RE kit:

python3 tools/py/rt3_rekit.py \
  pending-template-store \
  rt3_wineprefix/drive_c/rt3/RT3.exe \
  artifacts/exports/rt3-1.06

Optional seed override:

python3 tools/py/rt3_rekit.py \
  pending-template-store \
  rt3_wineprefix/drive_c/rt3/RT3.exe \
  artifacts/exports/rt3-1.06 \
  --seed-addr 0x0059c470 \
  --seed-addr 0x0059c540

This pass is expected to add:

  • pending-template-store-functions.csv
  • pending-template-store-record-kinds.csv
  • pending-template-store-management.md

The function CSV captures the seed cluster plus adjacent discovered helpers in the same branch. The record-kinds CSV captures the pending-template dispatch-record destructor switch cases and their inferred payload cleanup shapes. The Markdown dossier groups the branch into lifecycle buckets such as init, destroy, lookup, prune, and dispatch.

This branch dossier is intentionally narrower than the atlas. Reach for it only when the broad loop map is already clear enough that a missing branch blocks the next high-level conclusion.

Ghidra Workflow

  • Create a local project for the canonical 1.06 executable.
  • Name the project after the binary version, not just RT3, so address notes stay version-safe.
  • Import the executable without modifying repo-tracked files.
  • Treat Ghidra as the primary source for function boundaries, control flow, and decompilation.
  • Local launcher on this host: ~/software/ghidra/ghidraRun
  • Local headless entrypoint on this host: ~/software/ghidra/support/analyzeHeadless
  • Headless project state should live under ghidra_projects/ and remain untracked.
  • The committed wrapper defaults to the entry and bootstrap roots but can be pointed at additional roots when a milestone needs it.

Rizin Workflow

Use Rizin as the fast second opinion when you need to:

  • check section layout, entrypoints, and imports from the CLI
  • confirm function boundaries or calling conventions
  • script quick address-oriented inspections without reopening the GUI

Runtime Debugging

Static analysis comes first. Use winedbg only after the local Wine runtime is confirmed to work with the project prefix and a 32-bit target process. Runtime traces should be recorded back into the function map as corroborating evidence, not treated as a replacement for static exports.

Current host note:

  • env WINEPREFIX=/home/jan/projects/rrt/rt3_wineprefix winedbg --help works.
  • RT3 launches successfully under /opt/wine-stable/bin/wine when the current directory is rt3_wineprefix/drive_c/rt3.
  • Launching from the wrong working directory can make the process exit cleanly because the game expects its relative asset paths to resolve under C:\rt3.

That means runtime work can proceed, but startup commands should always be recorded with the working directory included.
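
One way to keep the working directory attached to a recorded command is to format both into a single replayable shell line. This is a sketch only: format_launch_record is a hypothetical helper, and it only builds the text to paste into notes without launching anything.

```python
import shlex
from pathlib import Path


def format_launch_record(argv: list[str], cwd: Path, env_overrides: dict[str, str]) -> str:
    """Format a launch as one shell line that includes the working directory.

    Hypothetical helper for note-taking; it does not start a process.
    """
    env_part = " ".join(
        f"{key}={shlex.quote(value)}" for key, value in sorted(env_overrides.items())
    )
    prefix = f"env {env_part} " if env_part else ""
    # shlex.join quotes each argument so the line can be pasted back into a shell.
    return f"cd {shlex.quote(str(cwd))} && {prefix}{shlex.join(argv)}"
```

For the host described above, the arguments would be ["/opt/wine-stable/bin/wine", "RT3.exe"], Path("rt3_wineprefix/drive_c/rt3"), and {"WINEPREFIX": "/home/jan/projects/rrt/rt3_wineprefix"}.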

Naming Rules

  • Names should prefer behavior over implementation detail when behavior is known.
  • If behavior is only partly known, keep a neutral prefix such as subsystem_ or unk_.
  • Address-derived placeholder names are acceptable, but only as temporary rows.
  • Every renamed function should keep a short note explaining why the name is justified.
  • For high-level passes, prioritize names that clarify loop role, ownership, or handoff semantics over names that only describe a local helper's mechanics.
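
To find rows that still carry temporary names, a small check like the one below can help. It is a sketch under assumptions: it treats the unk_ prefix from the rules above, Ghidra's default FUN_<address> labels, and Rizin's default fcn.<address> labels as placeholders, and is_placeholder_name is a hypothetical helper rather than part of the committed tooling.

```python
import re

# Auto-generated labels: Ghidra uses FUN_<hex address>, Rizin uses fcn.<hex address>.
_ADDRESS_DERIVED = re.compile(r"^(?:FUN_|fcn\.)[0-9a-fA-F]{6,}$")


def is_placeholder_name(name: str) -> bool:
    """Return True for names that should only exist as temporary rows."""
    return name.startswith("unk_") or bool(_ADDRESS_DERIVED.match(name))
```

Filtering the function map with this predicate gives a running list of rows that still need a behavior-based name and a justification note.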

Confidence Rules

  • 1: address exists, purpose unknown
  • 2: rough subsystem guess only
  • 3: behavior inferred from control flow or strings
  • 4: prototype or side effects mostly understood
  • 5: confirmed by multiple sources or runtime evidence
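
If the confidence column is consumed by scripts, the scale can be mirrored as an enum so out-of-range values fail loudly. This is a sketch: the member names and the parse_confidence helper are illustrative and do not come from the committed tools.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Confidence scale from the rules above; member names are illustrative."""

    ADDRESS_ONLY = 1          # address exists, purpose unknown
    SUBSYSTEM_GUESS = 2       # rough subsystem guess only
    BEHAVIOR_INFERRED = 3     # behavior inferred from control flow or strings
    PROTOTYPE_UNDERSTOOD = 4  # prototype or side effects mostly understood
    CONFIRMED = 5             # confirmed by multiple sources or runtime evidence


def parse_confidence(raw: str) -> Confidence:
    """Parse a CSV cell into a Confidence; raises ValueError outside 1..5."""
    return Confidence(int(raw.strip()))
```

Because IntEnum members compare as integers, a filter such as parse_confidence(cell) >= Confidence.BEHAVIOR_INFERRED works without extra conversion.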

Export Policy

Commit exports that are cheap to diff and useful to reuse:

  • JSON, CSV, TXT, and Markdown summaries
  • function maps and subsystem inventories
  • small command outputs that anchor a finding
  • raw startup discovery exports from headless Ghidra

Keep these local-only:

  • Ghidra projects and caches
  • repo-local Ghidra runtime state under .ghidra/
  • Rizin databases and ephemeral sessions
  • temporary dumps and scratch notebooks that have not been curated

Keep the ownership split explicit:

  • raw Ghidra or Rizin discovery output is derived data
  • function-map.csv is the curated ledger and may intentionally diverge from auto-generated names
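
The commit versus local-only split can be approximated with a path check before staging. This is a rough sketch, not repo policy code: is_commit_candidate is a hypothetical helper, it only knows the file types and directories named in this document, and it cannot judge whether a scratch file has been curated.

```python
from pathlib import PurePosixPath

# Cheap-to-diff formats listed under the export policy.
COMMITTABLE_SUFFIXES = {".json", ".csv", ".txt", ".md"}
# Directories this document says must stay local and untracked.
LOCAL_ONLY_DIRS = {"ghidra_projects", ".ghidra"}


def is_commit_candidate(repo_relative_path: str) -> bool:
    """Return True if the path looks like a diff-friendly export worth committing."""
    path = PurePosixPath(repo_relative_path)
    if any(part in LOCAL_ONLY_DIRS for part in path.parts):
        return False
    return path.suffix.lower() in COMMITTABLE_SUFFIXES
```

A True result is only a first filter; whether a file actually anchors a finding remains a human call.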

Exit Criteria For The Broad-Mapping Milestone

The current breadth-first milestone is complete when the repo has:

  • a stable starter map for the canonical binary
  • a control-loop atlas covering the major top-level loops and handoff points
  • named anchors for startup, shell/UI, frame/presentation, simulation, map/load, input, save/load, and multiplayer/network flow
  • enough notes and exports that a future session can continue without rediscovery