Debug Load Workflow

Use this when comparing:

  • one successful manual load of hh
  • one failing hook-driven auto-load attempt

The goal is to compare the real successful owner path above 0x00445ac0 against the failing hook-driven path.

Current Findings

From the current logs:

  • successful manual load now has a grounded pre-call site at 0x004390cb with:
    • ECX = 0x02af5840
    • [0x006cec78] = 0x02af5840
    • [0x006cec74] = 0x01d81230
    • top-of-stack dwords:
      • arg1 = 0x01db4739
      • arg2 = 4
      • arg3 = 0x0022fb50
      • next dword = 0x026d7b88
  • the subsequent successful 0x00445ac0 entry still has:
    • ret = 0x004390d0
    • arg1 = 0x01db4739
    • arg2 = 4
    • arg3 = 0x0022fb50
  • older failing auto-load attempts never reached 0x00445ac0
  • the earlier failing breakpoint was 0x00517cf0 with:
    • [0x006cec78] = 0
    • [0x006cec74] = 0x01d81230
  • the staged request globals at 0x006ce9b8..0x006ce9c4 and 0x006d1270..0x006d127c are zero on the successful manual path

That older 0x00517cf0 result is no longer the current blocker. The hook now reaches the real coordinator entry, so the remaining gap is later shell timing or re-entrancy, not request-latch shape.

The disassembly at 0x004390b0..0x004390cb is now the strongest grounded manual-load branch:

  • it writes [0x006cec74+0x6c] = 1
  • it computes arg1 from ([0x006cec7c] + 0x11)
  • it pushes arg2 = 4
  • it passes arg3 = &out_success
  • and then calls 0x00445ac0

So any hook experiment that does not reproduce that exact shape is no longer a plausible match for the successful manual path.
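
As a quick sanity check, the arg1 derivation can be replayed against the values captured in this document. This is a minimal sketch, not hook code: the function and constant names are hypothetical, and the base value is the [0x006cec7c] reading reported in the auto-load comparison in this document.

```python
# Values copied from the captures in this document; all names are illustrative.
PROFILE_BASE = 0x01DB4728   # [0x006cec7c] as logged at the auto-load breakpoint
ARG1_OFFSET = 0x11          # offset added by the branch at 0x004390b0..0x004390cb
EXPECTED_ARG1 = 0x01DB4739  # arg1 seen at 0x004390cb and at 0x00445ac0 entry


def derive_arg1(profile_base: int) -> int:
    """Mirror the manual branch: arg1 = [0x006cec7c] + 0x11, wrapped to 32 bits."""
    return (profile_base + ARG1_OFFSET) & 0xFFFFFFFF


if __name__ == "__main__":
    arg1 = derive_arg1(PROFILE_BASE)
    print(f"arg1 = {arg1:#010x}, matches capture: {arg1 == EXPECTED_ARG1}")
```

A hook experiment can run the same check on its own logged values before the call to confirm it still reproduces the manual shape.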

Latest Auto-Load Comparison

The newest hook-driven debugger run now reaches 0x00445ac0 directly.

At the auto-load 0x00445ac0 breakpoint:

  • stack:
    • ret = 0x7650505c inside dinput8
    • arg1 = 0x01db4739
    • arg2 = 4
    • arg3 = 0x0022fcf8
  • globals:
    • [0x006cec74] = 0x01d81230
    • [0x006cec7c] = 0x01db4728
    • [0x006cec78] = 0x026d7b88

Compared to the successful manual path:

  • arg1 matches exactly: 0x01db4739
  • arg2 matches exactly: 4
  • [0x006cec74] matches exactly: 0x01d81230
  • [0x006cec7c] still matches the same runtime-profile base used to derive arg1
  • [0x006cec78] is now non-null and published before entry

So the hook is no longer missing the coordinator entry shape. The remaining question is no longer "can we reach 0x00445ac0?" but "does the live non-debugger call return successfully and trigger the actual restore transition?"
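
The comparison above can be made mechanical by diffing the two captures field by field. The sketch below is an assumption-laden illustration: the dictionary layout and helper name are hypothetical, and the manual globals are taken from the 0x004390cb pre-call site rather than from the 0x00445ac0 entry itself.

```python
# Captured values from this document; the dict layout is illustrative only.
MANUAL = {
    "ret": 0x004390D0,
    "arg1": 0x01DB4739,
    "arg2": 4,
    "arg3": 0x0022FB50,
    "[0x006cec74]": 0x01D81230,
    "[0x006cec78]": 0x02AF5840,  # read at the 0x004390cb pre-call site
}
AUTO = {
    "ret": 0x7650505C,           # inside dinput8
    "arg1": 0x01DB4739,
    "arg2": 4,
    "arg3": 0x0022FCF8,
    "[0x006cec74]": 0x01D81230,
    "[0x006cec78]": 0x026D7B88,
}


def diff_snapshots(a: dict, b: dict) -> dict:
    """Return {field: (a_value, b_value)} for shared fields whose values differ."""
    return {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}


if __name__ == "__main__":
    for field, (manual, auto) in diff_snapshots(MANUAL, AUTO).items():
        print(f"{field}: manual={manual:#010x} auto={auto:#010x}")
```

On these captures the differing fields are ret, arg3, and [0x006cec78]. The arg3 difference is expected, since it is a stack address for out_success. The [0x006cec78] value is non-null in both runs but not identical, and these notes do not establish whether that matters.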

Latest Live Crash

The latest non-debugger auto-load run now reaches:

  • rrt-hook: auto load ready gate passed
  • rrt-hook: auto load restore calling

and then crashes at:

  • 0x0053fea6

The local disassembly around 0x0053fe90 shows a shell-side list traversal over [this+0x74] that walks linked entries and calls a virtual method on each. The crash instruction at 0x0053fea6 dereferences one traversed entry:

  • mov eax, DWORD PTR [esi]

That strongly suggests the current hook is invoking the restore from the right call shape but on the wrong shell-pump turn. The active hypothesis is now timing or re-entrancy:

  • the hook detects readiness and fires restore on the same shell-pump turn
  • RT3 later re-enters shell object traversal in a phase where one list entry is still invalid

So the next experiment is to defer the actual restore by additional ready shell-pump turns instead of firing on the first ready turn.
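
One way to express that deferral is a small gate that counts ready turns and fires once. This is only a sketch of the idea, not the rrt-hook implementation: the class and method names are hypothetical, and requiring the ready turns to be consecutive is an assumption that the notes above do not state.

```python
class DeferredRestoreGate:
    """Fire the restore once, after `required_turns` consecutive ready pump turns.

    Hypothetical helper; "consecutive" is an assumption, not a documented rule.
    """

    def __init__(self, required_turns: int) -> None:
        if required_turns < 1:
            raise ValueError("required_turns must be at least 1")
        self.required_turns = required_turns
        self.ready_streak = 0
        self.fired = False

    def on_pump_turn(self, ready: bool) -> bool:
        """Return True exactly once, on the turn where the restore should fire."""
        if self.fired:
            return False
        # A non-ready turn resets the streak so the restore never fires
        # right after the shell drops out of the ready state.
        self.ready_streak = self.ready_streak + 1 if ready else 0
        if self.ready_streak >= self.required_turns:
            self.fired = True
            return True
        return False


if __name__ == "__main__":
    gate = DeferredRestoreGate(required_turns=3)
    turns = [True, True, False, True, True, True, True]
    print([gate.on_pump_turn(ready) for ready in turns])
```

With `required_turns=1` the gate reproduces the current behaviour (fire on the first ready turn), which makes it easy to compare deferral counts without changing the rest of the hook.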

Manual Owner Tail

The branch at 0x004390b0..0x004390ea now has a grounded post-call tail too:

  • 0x004390cb calls 0x00445ac0
  • 0x004390d0 immediately calls 0x004834e0(0, 1) on 0x006cec74
  • if out_success != 0 or esi != 0, 0x004390ea calls 0x004384d0
  • then 0x004390ef calls 0x0053f310 on 0x00ccbb20
  • then 0x00439104 calls 0x004834e0(0, 1) again

The successful manual breakpoint at 0x004390cb shows ESI = 0 and EDI = 1, so the manual load branch only forces the 0x004384d0 post-load pipeline when out_success comes back nonzero.

That makes the current hook gap narrower still: even with the correct 0x00445ac0 arguments, returning directly into dinput8 skips RT3's own owner-tail work unless we mirror it ourselves.
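
The tail can be modelled as an ordered call list so that a hook-side mirror has something concrete to check against. This is a sketch under stated assumptions: the function name and tuple layout are hypothetical, and it assumes the 0x0053f310 call and the second 0x004834e0 call are unconditional, which the listing above suggests but does not prove.

```python
from typing import List, Tuple

# (call site, callee, note); addresses come from the listing above.
Call = Tuple[int, int, str]


def owner_tail_calls(out_success: int, esi: int) -> List[Call]:
    """Model the calls RT3 makes after 0x00445ac0 returns to 0x004390d0."""
    calls: List[Call] = [(0x004390D0, 0x004834E0, "(0, 1) on 0x006cec74")]
    if out_success != 0 or esi != 0:
        calls.append((0x004390EA, 0x004384D0, "post-load pipeline"))
    # Assumed unconditional: the listing gives no guard for these two calls.
    calls.append((0x004390EF, 0x0053F310, "on 0x00ccbb20"))
    calls.append((0x00439104, 0x004834E0, "(0, 1) again"))
    return calls


if __name__ == "__main__":
    # Manual capture had ESI = 0, so out_success alone decides the pipeline call.
    for site, callee, note in owner_tail_calls(out_success=1, esi=0):
        print(f"{site:#010x} -> {callee:#010x} {note}")
```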

Owner Xrefs Above 0x00438890

The containing owner at 0x00438890 is now grounded as a larger thiscall shell owner with two stack arguments. Current xrefs found in local disassembly are:

  • 0x00443b57
  • 0x00446d7f
  • 0x0046b8bc
  • 0x004830ca

The strongest caller so far is 0x004830ca:

  • it publishes 0x006cec78 = eax
  • then calls 0x00438890 as thiscall(active_mode, 1, 0)
  • it sits inside shell_transition_mode
  • it is the branch that constructs LoadScreen.win through 0x004ea620
  • and it continues through shell-window follow-up on 0x006d401c after the 0x00438890 call

The surrounding mode map is tighter now too:

  • mode 1 = Game.win
  • mode 2 = Setup.win
  • mode 3 = Video.win
  • mode 4 = LoadScreen.win
  • mode 5 = Multiplayer.win
  • mode 6 = Credits.win
  • mode 7 = Campaign.win

That makes 0x00438890(active_mode, 1, 0) the strongest current RT3-native entry candidate for reproducing the successful manual load branch, because it owns the internal dispatch that later reaches 0x004390cb.
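
For hook-side logging it can help to carry the mode map as a small lookup so that logged mode ids print as window names. The enum and dictionary names below are hypothetical; only the id-to-window pairs come from the map above.

```python
from enum import IntEnum


class ShellMode(IntEnum):
    """Shell mode ids as listed in the mode map above (names are illustrative)."""
    GAME = 1
    SETUP = 2
    VIDEO = 3
    LOAD_SCREEN = 4
    MULTIPLAYER = 5
    CREDITS = 6
    CAMPAIGN = 7


MODE_WINDOW = {
    ShellMode.GAME: "Game.win",
    ShellMode.SETUP: "Setup.win",
    ShellMode.VIDEO: "Video.win",
    ShellMode.LOAD_SCREEN: "LoadScreen.win",
    ShellMode.MULTIPLAYER: "Multiplayer.win",
    ShellMode.CREDITS: "Credits.win",
    ShellMode.CAMPAIGN: "Campaign.win",
}

if __name__ == "__main__":
    mode = ShellMode(4)
    print(f"mode {int(mode)} = {MODE_WINDOW[mode]}")
```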

Current static xrefs also tighten the broader ownership split:

  • 0x00443b57 calls 0x00438890 from the world-entry side, but with (0, 0) after dismissing the current shell detail panel and servicing 0x004834e0(0, 0)
  • 0x00446d7f calls it from the saved-runtime restore side with the same (0, 0) shape before immediately building .smp bundle payloads through 0x00530c80/0x00531150/0x00531360
  • 0x0046b8bc calls it from the multiplayer preview family before a later 0x00445ac0 call
  • 0x004830ca calls it from the shell-side active-mode branch with the clearest (1, 0) setup

So the function is no longer just a guessed hook target. It is now a real shared owner above world-entry, saved-runtime restore, multiplayer preview, and shell-side active-mode startup branches.

The internal selector split inside 0x00438890 is tighter now too:

  • [0x006cec7c+0x01] is a startup-profile selector, not the shell mode id
  • selector values 1 and 7 share the tutorial lane at 0x00438f67, which writes [0x006cec74+0x6c] = 2 and loads Tutorial_2.gmp or Tutorial_1.gmp
  • selector 2 is a world-root initialization lane at 0x00438fbe that allocates 0x0062c120 when needed, runs 0x0044faf0, and then forces the selector to 3
  • selector 4 is a setup-side world reset or regeneration lane at 0x00439038 that rebuilds 0x0062c120 from setup globals 0x006d14cc/0x006d14d0, then runs 0x00535100 and 0x0040b830
  • selector values 3, 5, and 6 collapse into the same profile-seeded file-load lane at 0x004390b0..0x004390ea
  • selector 6 is the one variant that explicitly writes [0x006cec74+0x6c] = 1 before the shared file-load call

Current grounded writers now tighten those values too:

  • Campaign.win writes selector 6 at 0x004b8a2f
  • Multiplayer.win writes selector 3 on one pending-status branch at 0x004f041e
  • the larger Setup.win dispatcher around 0x005033d0..0x00503b7b writes selectors 2, 3, 4, and 5 on several validated launch branches
  • so the shared file-load lane is now best read as one reused profile-file startup family rather than one owner-specific manual-load path

That means the successful manual-load branch is not the whole function. It is one three-selector subfamily inside a broader startup dispatcher that also owns tutorial and fresh-world setup lanes.
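
The selector split can be summarized as a lookup table, which makes the three-selector file-load subfamily easy to see. This is a descriptive sketch only: the table and helper names are hypothetical, and the lane labels paraphrase the findings above rather than naming real RT3 symbols.

```python
from typing import List, Optional, Tuple

# selector value at [0x006cec7c+0x01] -> (lane entry address, description)
SELECTOR_LANES = {
    1: (0x00438F67, "tutorial"),
    7: (0x00438F67, "tutorial"),
    2: (0x00438FBE, "world-root initialization, then forces selector 3"),
    4: (0x00439038, "setup-side world reset or regeneration"),
    3: (0x004390B0, "profile-seeded file load"),
    5: (0x004390B0, "profile-seeded file load"),
    6: (0x004390B0, "profile-seeded file load, writes [0x006cec74+0x6c] = 1"),
}


def lane_for_selector(selector: int) -> Optional[Tuple[int, str]]:
    """Return (entry, description) for a selector, or None if it is not mapped."""
    return SELECTOR_LANES.get(selector)


def selectors_sharing_lane(entry: int) -> List[int]:
    """List every selector value that dispatches to the given lane entry."""
    return sorted(s for s, (addr, _) in SELECTOR_LANES.items() if addr == entry)


if __name__ == "__main__":
    print(selectors_sharing_lane(0x004390B0))
```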

The multiplayer preview side is also tighter now:

  • 0x0046b8bc publishes 0x006cec78
  • calls 0x00438890 as thiscall(active_mode, 0, 0)
  • clears [0x006cec74+0x6c]
  • and only then calls 0x00445ac0(0x006ce630, [0x006ce9c0], 0)

That makes the preview relaunch path clearly different from the manual load branch, not just a differently staged copy of it.

Latest Headless Debugger Result

The scripted auto-load debugger run is now useful without manual interaction:

  • all breakpoints were set successfully:
    • 0x00438890
    • 0x004390cb
    • 0x00445ac0
    • 0x0053fea6
  • but only 0x0053fea6 actually fired in the captured run

So the current non-interactive path is good enough to gather repeatable crash-side state. It also shows that, in this captured run, the auto-load code path did not hit any of the larger-owner breakpoints (0x00438890, 0x004390cb, 0x00445ac0) under winedbg. The next step is therefore more hook-side logging around the 0x00438890 call itself rather than more manual debugger work.

The latest static pivot also means the next reverse-engineering step does not require a live run:

  • compare the mode-4 LoadScreen.win owner path at 0x004830ca against the world-entry and saved-runtime callers of 0x00438890
  • compare how the (1, 0) LoadScreen.win lane diverges from the (0, 0) world-entry and saved-runtime lanes before control reaches the shared 0x004390b0 manual-load branch
  • only then return to hook experiments

Launchers

Manual debugger run:

tools/run_rt3_winedbg.sh

Auto-load debugger run:

tools/run_hook_auto_load_winedbg.sh hh

Both scripts use /opt/wine-stable/bin/winedbg explicitly, so they do not depend on winedbg being on PATH. They also default to:

To save the full interactive debugger session to a file, set RRT_WINEDBG_LOG:

RRT_WINEDBG_LOG=/tmp/rt3-manual-load-winedbg.log tools/run_rt3_winedbg.sh

or:

RRT_WINEDBG_LOG=/tmp/rt3-auto-load-winedbg.log tools/run_hook_auto_load_winedbg.sh hh

Those wrappers use script, so both the commands you type and the debugger output are captured.

winedbg under /opt/wine-stable also supports command files directly:

tools/run_rt3_winedbg.sh

and:

tools/run_hook_auto_load_winedbg.sh hh

Override either default if needed:

RRT_WINEDBG_LOG=/tmp/rt3-manual-load-winedbg.log tools/run_rt3_winedbg.sh

Ready-made debugger command files are also provided:

If you do not use RRT_WINEDBG_CMD_FILE, you can still open those files and paste their contents into the debugger manually.

Both scripts rebuild rrt-hook, copy dinput8.dll into the Wine RT3 directory, and launch RT3 under winedbg.

Successful Manual Load

  1. Launch:
tools/run_rt3_winedbg.sh
  2. The default command file now breaks on both:

    • 0x004390cb first
    • 0x00445ac0 second
  3. In RT3, load save hh manually.

  4. The command file will dump:

    • registers
    • top-of-stack dwords
    • 0x006cec74
    • 0x006cec7c
    • 0x006cec78
    • 0x006ce9b8..0x006ce9c4
    • 0x006d1270..0x006d127c
    • backtrace

Focus on:

  • whether the first hit is 0x004390cb or 0x00445ac0
  • caller address
  • ecx
  • the three stack arguments
  • 0x006cec74
  • 0x006cec7c
  • 0x006cec78
  • 0x006ce9b8..0x006ce9c4
  • 0x006d1270..

Failing Auto-Load Run

  1. Launch:
tools/run_hook_auto_load_winedbg.sh hh
  2. The default command file now scripts a fuller non-interactive capture sequence:

    • 0x00438890
    • 0x004390cb
    • 0x00445ac0
    • 0x0053fea6
  3. Let the hook run.

  4. The command file will dump the same register, stack, global, and backtrace state at the first hit.

  5. Compare that output directly against the successful manual run.

So the current auto debugger path is now mostly headless: manual typing is no longer required for the main auto-load comparison path unless we need an additional ad hoc breakpoint.

If the run still crashes and you need even earlier crash-side inspection after that, add one temporary extra breakpoint manually for:

  • 0x00517cf0

Optional Host-Side GDB Fallback

If winedbg is too clumsy for repeated crashes, attach host gdb to the crashing Wine process after RT3 starts:

pgrep -af 'wine.*RT3.exe'
gdb -p <pid>

Useful commands in gdb:

set pagination off
handle SIGSEGV stop print
continue
bt
info registers
x/16wx $esp

This is mainly for cleaner backtraces after the fault PC is already known from winedbg.