Debug Load Workflow

Use this when comparing:

  • one successful manual load of hh
  • one failing hook-driven auto-load attempt

The goal is to compare the real successful owner path above 0x00445ac0 against the failing hook-driven path.

Current Findings

From the current logs:

  • successful manual load now has a grounded pre-call site at 0x004390cb with:
    • ECX = 0x02af5840
    • [0x006cec78] = 0x02af5840
    • [0x006cec74] = 0x01d81230
    • top-of-stack dwords:
      • arg1 = 0x01db4739
      • arg2 = 4
      • arg3 = 0x0022fb50
      • next dword = 0x026d7b88
  • the subsequent successful 0x00445ac0 entry still has:
    • ret = 0x004390d0
    • arg1 = 0x01db4739
    • arg2 = 4
    • arg3 = 0x0022fb50
  • older failing auto-load attempts never reached 0x00445ac0
  • the earlier failing breakpoint was 0x00517cf0 with:
    • [0x006cec78] = 0
    • [0x006cec74] = 0x01d81230
  • the staged request globals at 0x006ce9b8..0x006ce9c4 and 0x006d1270..0x006d127c are zero on the successful manual path

That older 0x00517cf0 result is no longer the current blocker. The hook now reaches the real coordinator entry, so the remaining gap is later shell timing or re-entrancy, not request-latch shape.

The disassembly at 0x004390b0..0x004390cb is now the strongest grounded manual-load branch:

  • it writes [0x006cec74+0x6c] = 1
  • it computes arg1 from ([0x006cec7c] + 0x11)
  • it pushes arg2 = 4
  • it passes arg3 = &out_success
  • and then calls 0x00445ac0

So any hook experiment that does not reproduce that exact shape is no longer a plausible match for the successful manual path.
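As a quick sanity check on that shape, the recorded values can be plugged into a small model. This is only a sketch: the helper names are hypothetical, and the profile base 0x01db4728 is the [0x006cec7c] value recorded on the auto-load run, used here on the assumption that the manual run derived arg1 from the same base. The [0x006cec74+0x6c] = 1 write is not modeled.

```python
PROFILE_OFFSET = 0x11  # arg1 = [0x006cec7c] + 0x11, per the 0x004390b0 disassembly
FILE_LOAD_ARG2 = 4     # constant pushed by the manual-load branch


def expected_call_shape(profile_base, out_success_addr):
    """Build the 0x00445ac0 argument shape the manual-load branch produces."""
    return {
        "arg1": (profile_base + PROFILE_OFFSET) & 0xFFFFFFFF,
        "arg2": FILE_LOAD_ARG2,
        "arg3": out_success_addr,
    }


def matches_manual_shape(observed, profile_base):
    """arg3 is a stack address and varies per run, so only arg1/arg2 are compared."""
    want = expected_call_shape(profile_base, observed["arg3"])
    return observed["arg1"] == want["arg1"] and observed["arg2"] == want["arg2"]


# Stack arguments recorded at the successful manual 0x004390cb breakpoint.
manual = {"arg1": 0x01DB4739, "arg2": 4, "arg3": 0x0022FB50}
print(matches_manual_shape(manual, profile_base=0x01DB4728))  # True
```

A hook experiment that fails this check can be discarded before it is ever run under the debugger.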

Latest Auto-Load Comparison

The newest hook-driven debugger run now reaches 0x00445ac0 directly.

At the auto-load 0x00445ac0 breakpoint:

  • stack:
    • ret = 0x7650505c inside dinput8
    • arg1 = 0x01db4739
    • arg2 = 4
    • arg3 = 0x0022fcf8
  • globals:
    • [0x006cec74] = 0x01d81230
    • [0x006cec7c] = 0x01db4728
    • [0x006cec78] = 0x026d7b88

Compared to the successful manual path:

  • arg1 matches exactly: 0x01db4739
  • arg2 matches exactly: 4
  • [0x006cec74] matches exactly: 0x01d81230
  • [0x006cec7c] still matches the same runtime-profile base used to derive arg1
  • [0x006cec78] is now non-null and published before entry

So the hook is no longer missing the coordinator entry shape. The remaining question is no longer "can we reach 0x00445ac0?" but "does the live non-debugger call return successfully and trigger the actual restore transition?"
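The comparison above can be mechanized as a plain dictionary diff. The sketch below uses only values recorded in these notes; the snapshot layout and the diff_snapshots helper are hypothetical, and [0x006cec7c] is left out because it was not recorded on the manual run.

```python
# State at the 0x00445ac0 entry (manual globals taken from the 0x004390cb pre-call site).
MANUAL = {
    "ret": 0x004390D0,
    "arg1": 0x01DB4739,
    "arg2": 4,
    "arg3": 0x0022FB50,
    "[0x006cec74]": 0x01D81230,
    "[0x006cec78]": 0x02AF5840,
}
AUTO = {
    "ret": 0x7650505C,
    "arg1": 0x01DB4739,
    "arg2": 4,
    "arg3": 0x0022FCF8,
    "[0x006cec74]": 0x01D81230,
    "[0x006cec78]": 0x026D7B88,
}


def diff_snapshots(a, b):
    """Return {key: (a_value, b_value)} for keys present in both that differ."""
    return {k: (a[k], b[k]) for k in sorted(a.keys() & b.keys()) if a[k] != b[k]}


for key, (m, h) in diff_snapshots(MANUAL, AUTO).items():
    print(f"{key}: manual={m:#010x} auto={h:#010x}")
```

On the recorded values this reports only ret, arg3, and [0x006cec78]. arg3 is a stack address and [0x006cec78] is a heap object pointer, so both can legitimately vary between runs; the return address is the structurally meaningful difference.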

Latest Plain-Run Narrowing

The current non-debugger auto-load path no longer looks like the original shell-side crash at 0x0053fea6.

The hook-side state machine is now stable up to the handoff into shell_transition_mode:

  • rrt-hook: auto load shell transition entering
  • rrt-hook: auto load shell unpublish entering
  • rrt-hook: auto load shell unpublish entry this=0x029b3a08 object=0x026d7b88

So the old hook-side gating and bad-call-shape problems are no longer the blocker.

The current runtime probes now push the remaining stall much later than the original old-mode teardown inside shell_transition_mode:

  • shell_transition_mode enters
  • old shell-window unpublish at 0x005389c0 enters with:
    • shell bundle this = 0x029b3a08
    • old object object = 0x026d7b88
  • the inner wrapper 0x005400c0(object) returns
  • the full 0x53fe00 -> 0x53f860 remove-node sweep over [object+0x74] returns and clears [object+0x70/+0x74]
  • shell_unpublish itself then returns cleanly
  • the nearby mode-2 teardown helper 0x00502720 returns
  • shell_load_screen_window_construct 0x004ea620 returns
  • the immediate shell publish through 0x00538e50 returns
  • shell_transition_mode itself returns cleanly

At the same time, one later load-side probe still does not fire:

  • no shell_active_mode_run_profile_startup_and_load_dispatch 0x00438890 entry

So the current live stall is now best read as:

  • after the old-object unpublish path at 0x005389c0
  • after the inner 0x5400c0 -> 0x53fe00 -> 0x53f860 teardown sweep
  • after the nearby mode-2 teardown helper 0x00502720
  • after the mode-4 LoadScreen.win constructor and immediate shell publish
  • but still before any trusted runtime evidence that 0x00438890 has entered

The richer plain-run snapshots now tighten the old-object state too:

  • the old object is still the expected Setup.win instance with vtable 0x005d1664
  • the shell bundle head and tail both point to that same object
  • [object+0x54] and [object+0x58] are both null, so the outer unlink state is consistent
  • [object+0x74] is non-null and the first two linked nodes recovered from +0x8a also look structurally sane:
    • first node 0x02a74470: vtable 0x005dd870, type 0xea72, owner-ish field 0x02a067b8, next 0x02a04b38
    • second node 0x02a04b38: vtable 0x005dd870, type 0xea71, owner-ish field 0x02a067b8, next 0x02a03e38

So the remaining leading hypothesis is no longer "the list head is already garbage." The later shared node vcall target 0x540910 is healthy in general and does not fire on the failing transition path. The newer direct probes narrow it even further: the failing transition still does not reach 0x53fe00 or 0x53f860. That pushes the current boundary into the tiny wrapper layer between shell_unpublish entry and the 0x53fe00 call, with 0x5400c0(object) now the next useful direct probe.

The latest plain Wine log also ends with a matching crash:

  • wine: Unhandled page fault on read access to 02E11000 at address 02E11000
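For triage across many logs, that line can be parsed mechanically. The sketch below is an assumption-laden helper, not part of the existing tooling: the regular expression only targets the message shape quoted above, and parse_fault is a hypothetical name. When the faulting target equals the faulting address, the fault is usually an instruction fetch, which would be consistent with control being transferred through a stale pointer rather than a data read going wrong.

```python
import re

FAULT_RE = re.compile(
    r"Unhandled page fault on (?P<kind>\w+) access to (?P<target>[0-9A-Fa-f]+)"
    r" at address (?P<pc>[0-9A-Fa-f]+)"
)


def parse_fault(line):
    """Extract access kind, faulting target, and faulting PC from a Wine log line."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    target = int(m["target"], 16)
    pc = int(m["pc"], 16)
    return {"kind": m["kind"], "target": target, "pc": pc, "pc_is_target": target == pc}


line = "wine: Unhandled page fault on read access to 02E11000 at address 02E11000"
print(parse_fault(line))
```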

Static disassembly sharpened the remaining boundary one step further, but the newer jump-table decode changes the interpretation materially. The startup-runtime slice

  • 0x004ea710
  • 0x0053b070(0x46c40)
  • 0x004336d0
  • 0x00438890

is not owned by mode 4. It is owned by jump-table entry 1 at 0x483012. Jump-table entry 4 lands at 0x4832e5 instead and only constructs and publishes a plain LoadScreen.win object through 0x004ea620 and 0x00538e50.

So the next useful probe is no longer the mode-4 branch's pre-dispatch runtime-object helper, because mode 4 does not own that startup-runtime path at all. The next useful test is the real startup-dispatch entrypoint: shell_transition_mode(1, 0).

The latest plain runs tightened that correction one more step:

  • the direct 0x004336d0 runtime-reset probe still does not fire
  • the direct 0x00438890 startup-dispatch probe still does not fire
  • but shell_transition_mode, LoadScreen.win construction, and the immediate shell publish all still return cleanly

That no longer means the post-construct startup slice is mysteriously skipped inside mode 4. Instead, it matches the corrected static decode exactly: the hook has been entering the plain load-screen branch rather than the startup-runtime branch.

The next best runtime target is therefore no longer another allocator cut under mode 4. It is a direct test of shell_transition_mode(1, 0), which is the jump-table arm that statically owns the startup-runtime allocation and 0x00438890 dispatch.
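The corrected ownership can be kept as a small lookup so that a planned probe is checked against the decode before it is wired into the hook. This is a sketch: only the two jump-table entries decoded above are modeled, the call lists contain just the calls these notes attribute to each arm (they are not exhaustive), and arm_owns is a hypothetical helper name.

```python
# Jump-table entries decoded so far: mode id -> arm address and attributed calls.
JUMP_TABLE_ARMS = {
    1: {"arm": 0x00483012,
        "calls": [0x004EA710, 0x0053B070, 0x004336D0, 0x00438890]},
    4: {"arm": 0x004832E5,
        "calls": [0x004EA620, 0x00538E50]},
}


def arm_owns(mode, target):
    """True if the decoded arm for `mode` is known to call `target`."""
    entry = JUMP_TABLE_ARMS.get(mode)
    return entry is not None and target in entry["calls"]


print(arm_owns(4, 0x00438890))  # False
print(arm_owns(1, 0x00438890))  # True
```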

Current Pause Point

Current recorded stop point:

  • the old hook-side crash and teardown corruption are resolved
  • the static jump-table decode at 0x48342c shows the hook had been entering the wrong arm
  • shell_transition_mode(4, 0) is only the plain LoadScreen.win branch
  • shell_transition_mode(1, 0) is the startup-dispatch branch that owns:
    • 0x004ea710
    • 0x0053b070(0x46c40)
    • 0x004336d0
    • 0x00438890

So the next live experiment, when this work resumes, should start from the corrected mode-1 transition path rather than adding more probes under mode 4.

Two corrective notes from the allocator probe passes:

  • the first allocator experiment at 0x005a125d was not trustworthy, because that shared cdecl body sits behind the 0x0053b070 thunk and the initial hook used the wrong entry shape and split its first internal call
  • the first direct thunk hook on 0x0053b070 was also not trustworthy as implemented, because a copied relative-jmp thunk cannot be replayed through an ordinary trampoline

The next trustworthy allocator boundary is still the exact mode-4-branch thunk at 0x0053b070, but only with a detour that calls the original target 0x005a125d directly instead of executing the copied thunk bytes.
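The reason a copied relative-jmp thunk cannot be replayed is that its displacement is relative to the address it executes from. The sketch below assumes the thunk at 0x0053b070 is a single 5-byte E9 rel32 jump to 0x005a125d (the exact bytes were not recorded here), and the trampoline address is purely hypothetical.

```python
import struct


def encode_jmp_rel32(address, target):
    """Encode `jmp target` as E9 rel32 for an instruction located at `address`."""
    rel = (target - (address + 5)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", rel)


def decode_jmp_rel32(code, address):
    """Return the jump target of an E9 rel32 instruction executing at `address`."""
    if len(code) < 5 or code[0] != 0xE9:
        raise ValueError("not a 5-byte jmp rel32")
    (rel,) = struct.unpack_from("<i", code, 1)
    return (address + 5 + rel) & 0xFFFFFFFF


THUNK = 0x0053B070
BODY = 0x005A125D
TRAMPOLINE = 0x10000000  # hypothetical trampoline location, for illustration only

thunk_bytes = encode_jmp_rel32(THUNK, BODY)  # assumed thunk encoding
print(hex(decode_jmp_rel32(thunk_bytes, THUNK)))       # 0x5a125d
print(hex(decode_jmp_rel32(thunk_bytes, TRAMPOLINE)))  # no longer 0x5a125d
```

The same five bytes land somewhere else once they run from the trampoline, which is why the detour has to call 0x005a125d by absolute address.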

The latest filtered run exposed a more basic gating issue too: the log only reached one gate mask 0x7 line with mode_id = 2, and it never advanced into ready gate passed, staging, or transition. So that run did not actually exercise the load-screen startup subchain; it mostly recorded ordinary shell-node activity plus one late ready-state observation. The old default gate of 30 ready polls plus 5 deferred polls was therefore too conservative for this workflow. The next run now lowers those defaults to 1 and 0, and adds an explicit ready-count log so the trace should either stage immediately or show exactly how far the gate gets.
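The effect of those defaults can be illustrated with a toy gate. This is not the rrt-hook implementation: the class, its fields, and the assumption that the gate needs consecutive ready polls followed by deferred polls before it stages are all hypothetical.

```python
class AutoLoadGate:
    """Toy model: stage after N consecutive ready polls plus M deferred polls."""

    def __init__(self, ready_polls=1, deferred_polls=0):
        self.ready_needed = ready_polls
        self.deferred_needed = deferred_polls
        self.ready_count = 0
        self.deferred_count = 0
        self.staged = False

    def poll(self, ready):
        """Return True on the single service tick where staging should begin."""
        if self.staged:
            return False
        if not ready:
            # Any not-ready tick restarts the count (assumed behavior).
            self.ready_count = self.deferred_count = 0
            return False
        if self.ready_count < self.ready_needed:
            self.ready_count += 1
        else:
            self.deferred_count += 1
        if (self.ready_count >= self.ready_needed
                and self.deferred_count >= self.deferred_needed):
            self.staged = True
            return True
        return False


def polls_until_staged(gate, limit=1000):
    """Count ready polls until the gate stages, or return None at the limit."""
    for n in range(1, limit + 1):
        if gate.poll(True):
            return n
    return None


print(polls_until_staged(AutoLoadGate(30, 5)))  # 35
print(polls_until_staged(AutoLoadGate(1, 0)))   # 1
```

Under this model a run that only sees a handful of ready ticks before the window closes can never stage with the old defaults, which matches the symptom described above.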

That gate adjustment worked on the next run: the hook now reaches ready count, stages selector 3, enters shell_transition_mode, returns from the LoadScreen.win construct and publish helpers, and reports success again. But the allocator side is still unresolved:

  • there is still no trusted 0x46c40 allocator hit from 0x0053b070
  • there is still no direct 0x004336d0 runtime-reset entry
  • there is still no direct 0x00438890 startup-dispatch entry

So the next clean post-publish boundary is the tiny scalar setter at 0x004ea710, which is the last straightforward callsite in the static mode-4 branch immediately before the 0x0053b070 allocation.

The immediate next runtime check is even more concrete than that helper hook, though: inspect the state that 0x004ea710 should leave behind. Right after shell_transition_mode returns, the hook now logs:

  • 0x006d10b0 (LoadScreen.win singleton)
  • [LoadScreen.win+0x78]
  • 0x006cec78
  • [0x006cec74+0x0c]
  • [0x006cec7c+0x01]

If 0x004ea710 really ran on the mode-4 branch, [LoadScreen.win+0x78] should no longer be zero after transition return.

The latest run answered that question directly:

  • shell_transition_mode still returns cleanly
  • field_active_mode_object is still the LoadScreen.win singleton
  • 0x006cec78 is still null
  • [LoadScreen.win+0x78] is still 0
  • startup selector remains 3

So the strongest current read is no longer "the helper hooks might be missing a straight-line call." At transition return, RT3 still looks like it is parked in the plain LoadScreen.win state rather than having entered the separate runtime-object path at all. The next useful runtime cut is therefore not deeper inside shell_transition_mode, but on the later active-mode service cadence: does a subsequent service tick on the LoadScreen.win object populate [+0x78] or promote 0x006cec78 into the startup-dispatch object on a later frame?

The next run now logs the first few shell-state service ticks after auto-load is attempted with the same state tuple:

  • 0x006cec78
  • [0x006cec74+0x0c]
  • 0x006d10b0
  • [LoadScreen.win+0x78]
  • startup selector

So the next question is very narrow: does that tuple stay frozen in the plain LoadScreen.win shape, or does one later service tick finally promote it into the startup-runtime object path?
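That question can be phrased as a predicate over the logged tuple. The sketch below is illustrative only: the field names are hypothetical labels for the five logged values, and the singleton pointer used in the example is a placeholder, not a recorded address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceTickState:
    active_mode_global: int    # value at 0x006cec78
    active_mode_object: int    # [0x006cec74+0x0c]
    loadscreen_singleton: int  # value at 0x006d10b0
    loadscreen_field_78: int   # [LoadScreen.win+0x78]
    startup_selector: int      # [0x006cec7c+0x01]

    def is_plain_loadscreen(self):
        """True while the tuple still has the frozen plain LoadScreen.win shape."""
        return (
            self.active_mode_global == 0
            and self.loadscreen_field_78 == 0
            and self.loadscreen_singleton != 0
            and self.active_mode_object == self.loadscreen_singleton
        )


def first_promoted_tick(ticks):
    """Return the 1-based index of the first tick that left the plain shape."""
    for index, state in enumerate(ticks, start=1):
        if not state.is_plain_loadscreen():
            return index
    return None


SINGLETON = 0x02B00000  # placeholder pointer, not a recorded value
frozen = ServiceTickState(0, SINGLETON, SINGLETON, 0, 3)
print(first_promoted_tick([frozen] * 8))  # None
```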

The latest service-tick run makes that boundary stronger still:

  • the first later shell-state service ticks count=2..8 all keep the same frozen state
  • 0x006cec78 stays 0
  • [shell_state+0x0c] stays the LoadScreen.win singleton
  • [LoadScreen.win+0x78] stays 0

So the active-mode service pass itself is not promoting the plain load screen into the startup runtime object during those first later frames. The next best runtime boundary is now the LoadScreen.win message owner 0x004e3a80, because that is the remaining live owner most likely to receive the trigger that seeds page id [this+0x78], allocates the 0x46c40 startup runtime, and later publishes 0x006cec78.

One later run did not reach that boundary at all:

  • the new 0x004e3a80 hook installed successfully
  • but there were no ready count, staging, transition, post-transition, or load-screen-message lines anywhere in the log
  • the trace only showed ordinary shell node-vcall traffic before the window was closed

So that run is best treated as "auto-load path not exercised", not as evidence that the LoadScreen.win message owner stayed silent after a successful transition. The next useful runtime check is therefore one step earlier again: add a small first-few-calls trace on shell_state_service_active_mode_frame itself so we can confirm whether that detour is firing on the run at all and what mode id and gate mask it sees before the auto-load gate would stage.

That newer service-entry trace now confirms the full cadence:

  • the service detour is firing
  • the gate does stage and transition on counts 1 -> 2
  • the transition returns cleanly
  • later service ticks run with mode_id = 4

At the same time, the next two probes are now bounded as negative results on that successful path:

  • the LoadScreen.win message hook at 0x004e3a80 stayed completely silent
  • the plain post-transition state still stays frozen with:
    • 0x006cec78 = 0
    • field_active_mode_object = LoadScreen.win
    • [LoadScreen.win+0x78] = 0

So the next best boundary is no longer the message owner itself. It is the shell-runtime prime call at 0x00538b60, because 0x00482160 still takes that branch on the null-0x006cec78 service path before the later frame-cycle owner 0x00520620.

The first 0x00538b60 probe run is not trustworthy yet, though:

  • the hook installed
  • but the log stopped immediately after the first shell-state service entry count=1 ... gate_mask=0x7 mode_id=2 ...
  • there were no ready-count lines, no transition lines, and no runtime-prime entry lines

So that result currently reads as "the new runtime-prime instrumentation likely interrupted the first service pass" rather than as a real RT3 boundary shift. The next corrective step is to log the matching shell-state service return and to trace the first few 0x00538b60 calls even before AUTO_LOAD_ATTEMPTED becomes true. That will tell us whether the first service pass actually returns and whether the runtime-prime hook is firing at all.

The static branch under 0x00482160 also adds one more caution: 0x00538b60 is conditional, not unconditional. The service pass only enters it when the shell runtime at 0x006d401c is live and [shell_state+0xa0] == 0. So a silent 0x00538b60 probe does not yet prove the shell is frozen before the runtime-prime call; it may simply mean the +0xa0 gate stayed nonzero on that service tick. The next service-entry logs therefore need to include [shell_state+0xa0] before we treat runtime-prime silence as meaningful.
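The caution above amounts to a three-way reading of each probe result. A minimal sketch, assuming the two conditions from the static branch are the only gates (the function names and result strings are hypothetical):

```python
def runtime_prime_eligible(shell_runtime_ptr, shell_state_a0):
    """0x00538b60 is only reachable when 0x006d401c is live and [shell_state+0xa0] == 0."""
    return shell_runtime_ptr != 0 and shell_state_a0 == 0


def interpret_probe(fired, shell_runtime_ptr, shell_state_a0):
    """Classify one service tick's 0x00538b60 probe result."""
    if fired:
        return "entered"
    if not runtime_prime_eligible(shell_runtime_ptr, shell_state_a0):
        return "gated off (silence is not meaningful)"
    return "eligible but silent (real boundary)"


# Placeholder pointer value; only zero versus nonzero matters here.
print(interpret_probe(False, 0x01000000, 1))
print(interpret_probe(False, 0x01000000, 0))
```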

The newer run closes that conditional question:

  • [shell_state+0xa0] is 0 on the first traced service call
  • 0x00538b60 is therefore eligible
  • the runtime-prime probe now shows it entering and returning cleanly on that same service tick

The later run closes the next owner too:

  • 0x00520620 shell_service_frame_cycle also enters and returns cleanly on the same frozen mode-4 path
  • the logged state matches the generic frame-service branch:
    • [+0x1c] = 0
    • [+0x28] = 0
    • flag_56 = 0
    • [+0x58] is pulsed and then cleared back to 0
    • 0x006cec78 stays 0

The newer run closes that owner too:

  • 0x0053fda0 enters and returns cleanly on the frozen mode-4 path
  • it is actively servicing the LoadScreen.win object itself
  • the serviced object keeps field_1d = 1, field_5c = 1, and a stable child list
  • the first child vcall target at +0x18 stays 0x005595d0
  • 0x006cec78 still stays 0

So the next live boundary is now the child-service target itself at 0x005595d0, not the higher object walker.

The child-service run narrows that again. The first sixteen 0x005595d0 calls under the serviced LoadScreen.win object are stable, presentation-heavy child lanes:

  • every child points back to the same parent through [child+0x86] = LoadScreen.win
  • the early children have flag_68 = 0x03, flag_6a = 0x03, and return 4
  • the later siblings have flag_68 = 0x00, flag_6a = 0x03, and return 0
  • field_b0 stays 0
  • 0x006cec78 still stays 0

Static disassembly matches that read: 0x005595d0 is gated by 0x00558670 and then spends most of its body in draw or overlay helpers like 0x54f710, 0x54f9f0, 0x54fdd0, 0x53de00, and 0x552560. So this is a presentation-side child service path, not the missing startup-runtime promotion.

That moved the next useful runtime target back to the transition-time allocator lane, but the later jump-table decode changes what that means. The widened 0x0053b070 window below is now best read as evidence for the plain mode-4 LoadScreen.win arm, not as evidence for the startup-runtime arm.

The next widened allocator run immediately paid off, but in a narrower way than expected:

  • the first traced transition-window allocation is 0x7c, which matches the static pre-construct 0x48302a -> 0x53b070 call exactly
  • the following 0x111, 0x84, 0x3a, and repeated 0x25 allocations all happen before LoadScreen.win construct returns, so they now read as constructor-side child or control setup
  • that means the allocator probe was not disproving the 0x46c40 startup-runtime slice yet; it was simply exhausting its 16-entry log budget inside the constructor before the later post-construct block

The corrected follow-up run with that reset is now the decisive one: after LoadScreen.win construct returns, there are still no further allocator hits before publish and transition return. That matches the corrected jump-table decode cleanly, because mode 4 does not own the 0x46c40 -> 0x4336d0 -> 0x438890 path at all.
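The log-budget problem is easy to reproduce in miniature. The sketch below is a toy model, not the hook's logging code: the BudgetedTrace class is hypothetical, the count of twelve repeated 0x25 allocations is chosen only to fill a 16-entry budget, and the late 0x46c40 event is a hypothetical input used to show what an un-reset budget would hide. The real corrected run saw no such late hit.

```python
class BudgetedTrace:
    """Records at most `budget` events per window; reset() opens a new window."""

    def __init__(self, budget=16):
        self.budget = budget
        self.window = "construct"
        self.used = 0
        self.entries = []

    def record(self, size):
        if self.used >= self.budget:
            return False  # event silently dropped
        self.used += 1
        self.entries.append((self.window, size))
        return True

    def reset(self, window):
        self.window = window
        self.used = 0


constructor_allocs = [0x7C, 0x111, 0x84, 0x3A] + [0x25] * 12  # fills the budget

no_reset = BudgetedTrace()
for size in constructor_allocs:
    no_reset.record(size)
print(no_reset.record(0x46C40))  # False: a late hit would have been invisible

with_reset = BudgetedTrace()
for size in constructor_allocs:
    with_reset.record(size)
with_reset.reset("post-construct")
print(with_reset.record(0x46C40))  # True: a late hit would now be logged
```

Only with the reset in place does post-construct silence count as evidence.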

The first corrected thunk run also showed one practical problem: the probe became too noisy to be useful as a boundary marker, because 0x0053b070 is used widely outside the load-screen path. That still mattered, because it showed the hook-driven transition was taking the same 0x7c constructor-side allocation as the plain mode-4 branch rather than the startup-runtime allocation size 0x46c40.

Manual Owner Tail

The branch at 0x004390b0..0x004390ea now has a grounded post-call tail too:

  • 0x004390cb calls 0x00445ac0
  • 0x004390d0 immediately calls 0x004834e0(0, 1) on 0x006cec74
  • if out_success != 0 or esi != 0, 0x004390ea calls 0x004384d0
  • then 0x004390ef calls 0x0053f310 on 0x00ccbb20
  • then 0x00439104 calls 0x004834e0(0, 1) again

The successful manual breakpoint at 0x004390cb shows ESI = 0 and EDI = 1, so the manual load branch only forces the 0x004384d0 post-load pipeline when out_success comes back nonzero.

That makes the current hook gap narrower still: even with the correct 0x00445ac0 arguments, returning directly into dinput8 skips RT3's own owner-tail work unless we mirror it ourselves.
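The tail a hook would have to mirror can be written out as an ordered call list. This is a model of the steps listed above, not executable hook code; it assumes the 0x0053f310 and second 0x004834e0 calls run unconditionally, which the list order suggests but the notes do not state outright.

```python
def manual_owner_tail(out_success, esi):
    """Return the post-0x00445ac0 call sequence for the manual-load branch."""
    steps = ["0x004390d0: call 0x004834e0(0, 1) on 0x006cec74"]
    if out_success != 0 or esi != 0:
        steps.append("0x004390ea: call 0x004384d0")
    steps.append("0x004390ef: call 0x0053f310 on 0x00ccbb20")
    steps.append("0x00439104: call 0x004834e0(0, 1)")
    return steps


# The manual breakpoint recorded ESI = 0, so out_success alone decides the branch.
for step in manual_owner_tail(out_success=1, esi=0):
    print(step)
```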

Owner Xrefs Above 0x438890

The containing owner at 0x00438890 is now grounded as a larger thiscall shell owner with two stack arguments. Current xrefs found in local disassembly are:

  • 0x00443b57
  • 0x00446d7f
  • 0x0046b8bc
  • 0x004830ca

The strongest caller so far is 0x004830ca:

  • it publishes 0x006cec78 = eax
  • then calls 0x00438890 as thiscall(active_mode, 1, 0)
  • it sits inside shell_transition_mode
  • it is the branch that constructs LoadScreen.win through 0x004ea620
  • and it continues through shell-window follow-up on 0x006d401c after the 0x00438890 call

The surrounding mode map is tighter now too:

  • mode 1 = Game.win
  • mode 2 = Setup.win
  • mode 3 = Video.win
  • mode 4 = LoadScreen.win
  • mode 5 = Multiplayer.win
  • mode 6 = Credits.win
  • mode 7 = Campaign.win

That makes 0x00438890(active_mode, 1, 0) the strongest current RT3-native entry candidate for reproducing the successful manual load branch, because it owns the internal dispatch that later reaches 0x004390cb.
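For log decoding, the mode map can be kept as an enumeration so that mode_id values in hook traces print by name. The enum and helper names below are hypothetical; the values are exactly the seven listed above.

```python
from enum import IntEnum


class ShellMode(IntEnum):
    GAME = 1
    SETUP = 2
    VIDEO = 3
    LOAD_SCREEN = 4
    MULTIPLAYER = 5
    CREDITS = 6
    CAMPAIGN = 7


WINDOW_NAMES = {
    ShellMode.GAME: "Game.win",
    ShellMode.SETUP: "Setup.win",
    ShellMode.VIDEO: "Video.win",
    ShellMode.LOAD_SCREEN: "LoadScreen.win",
    ShellMode.MULTIPLAYER: "Multiplayer.win",
    ShellMode.CREDITS: "Credits.win",
    ShellMode.CAMPAIGN: "Campaign.win",
}


def window_for_mode(mode_id):
    """Map a logged mode_id to its window name, or None if it is not in the map."""
    try:
        return WINDOW_NAMES[ShellMode(mode_id)]
    except ValueError:
        return None


print(window_for_mode(2))  # Setup.win
print(window_for_mode(4))  # LoadScreen.win
```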

The containing shell-mode switcher ABI is tighter now too:

  • 0x00482ec0 is not a one-arg mode switch
  • it is a thiscall with two stack arguments
  • the grounded world-entry load-screen call shape at 0x443adf..0x443ae3 is (4, 0)
  • the function confirms that shape itself by reading the requested mode from [esp+0x0c] and returning with ret 8
  • the second stack argument is now best read as an old-active-mode teardown flag, because the 0x482fc6..0x482fff branch only runs when it is nonzero and then releases the old active-mode object through 0x00434300, 0x00433730, 0x0053b080, and finally clears 0x006cec78

Current static xrefs also tighten the broader ownership split:

  • 0x00443b57 calls 0x00438890 from the world-entry side, but with (0, 0) after dismissing the current shell detail panel and servicing 0x4834e0(0, 0)
  • 0x00446d7f calls it from the saved-runtime restore side with the same (0, 0) shape before immediately building .smp bundle payloads through 0x530c80/0x531150/0x531360
  • 0x0046b8bc calls it from the multiplayer preview family before a later 0x00445ac0 call
  • 0x004830ca calls it from the shell-side active-mode branch with the clearest (1, 0) setup

So the function is no longer just a guessed hook target. It is now a real shared owner above world-entry, saved-runtime restore, multiplayer preview, and shell-side active-mode startup branches.

The internal selector split inside 0x00438890 is tighter now too:

  • [0x006cec7c+0x01] is a startup-profile selector, not the shell mode id
  • selector values 1 and 7 share the tutorial lane at 0x00438f67, which writes [0x006cec74+0x6c] = 2 and loads Tutorial_2.gmp or Tutorial_1.gmp
  • selector 2 is a world-root initialization lane at 0x00438fbe that allocates 0x0062c120 when needed, runs 0x0044faf0, and then forces the selector to 3
  • selector 4 is a setup-side world reset or regeneration lane at 0x00439038 that rebuilds 0x0062c120 from setup globals 0x006d14cc/0x006d14d0, then runs 0x00535100 and 0x0040b830
  • selector values 3, 5, and 6 collapse into the same profile-seeded file-load lane at 0x004390b0..0x004390ea
  • selector 6 is the one variant that explicitly writes [0x006cec74+0x6c] = 1 before the shared file-load call

Current grounded writers now tighten those values too:

  • Campaign.win writes selector 6 at 0x004b8a2f
  • Multiplayer.win writes selector 3 on one pending-status branch at 0x004f041e
  • the larger Setup.win dispatcher around 0x005033d0..0x00503b7b writes selectors 2, 3, 4, and 5 on several validated launch branches
  • so the shared file-load lane is now best read as one reused profile-file startup family rather than one owner-specific manual-load path

That means the successful manual-load branch is not the whole function. It is one three-selector subfamily inside a broader startup dispatcher that also owns tutorial and fresh-world setup lanes.
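The selector split can be summarized as a lookup from selector value to lane entry address. The sketch uses only the addresses listed above; the helper names are hypothetical, and selector values outside 1..7 are treated as unknown rather than guessed.

```python
TUTORIAL_LANE = 0x00438F67
WORLD_ROOT_LANE = 0x00438FBE
SETUP_RESET_LANE = 0x00439038
FILE_LOAD_LANE = 0x004390B0

# Startup-profile selector ([0x006cec7c+0x01]) -> lane entry inside 0x00438890.
SELECTOR_LANES = {
    1: TUTORIAL_LANE,
    7: TUTORIAL_LANE,
    2: WORLD_ROOT_LANE,
    4: SETUP_RESET_LANE,
    3: FILE_LOAD_LANE,
    5: FILE_LOAD_LANE,
    6: FILE_LOAD_LANE,
}


def lane_for_selector(selector):
    """Return the lane entry address, or None for an undecoded selector."""
    return SELECTOR_LANES.get(selector)


def uses_file_load_lane(selector):
    """True for the three selectors that reach the shared 0x004390b0 lane."""
    return lane_for_selector(selector) == FILE_LOAD_LANE


print([s for s in sorted(SELECTOR_LANES) if uses_file_load_lane(s)])  # [3, 5, 6]
```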

The multiplayer preview side is also tighter now:

  • 0x0046b8bc publishes 0x006cec78
  • calls 0x00438890 as thiscall(active_mode, 0, 0)
  • clears [0x006cec74+0x6c]
  • and only then calls 0x00445ac0(0x006ce630, [0x006ce9c0], 0)

That makes the preview relaunch path clearly different from the manual load branch, not just a differently staged copy of it.

Latest Headless Debugger Result

The scripted auto-load debugger run is now useful without manual interaction:

  • all breakpoints were set successfully:
    • 0x00438890
    • 0x004390cb
    • 0x00445ac0
  • older runs that also broke on 0x0053fea6 stopped too early on that shell-side crash site
  • the default scripted compare flow now keeps only the owner-chain breakpoints above the real load lane

So the current non-interactive path is still good enough to gather repeatable crash-side state, but on this display setup the owner-chain compare flow is also vulnerable to early X11 death:

  • XF86VidModeClientNotLocal
  • process termination before the RT3 owner breakpoints fire

That means the current plain-run hook probes are more reliable than winedbg for narrowing the live stall inside shell_transition_mode.

The latest static pivot also means the next reverse-engineering step does not require a live run:

  • compare the mode-4 LoadScreen.win owner path at 0x004830ca against the world-entry and saved-runtime callers of 0x00438890
  • compare how the (1, 0) LoadScreen.win lane diverges from the (0, 0) world-entry and saved-runtime lanes before control reaches the shared 0x004390b0 manual-load branch
  • only then return to hook experiments

Launchers

Manual debugger run:

tools/run_rt3_winedbg.sh

Auto-load debugger run:

tools/run_hook_auto_load_winedbg.sh hh

Both scripts use /opt/wine-stable/bin/winedbg explicitly, so they do not depend on winedbg being on PATH. They also default to:

To save the full interactive debugger session to a file, set RRT_WINEDBG_LOG:

RRT_WINEDBG_LOG=/tmp/rt3-manual-load-winedbg.log tools/run_rt3_winedbg.sh

or:

RRT_WINEDBG_LOG=/tmp/rt3-auto-load-winedbg.log tools/run_hook_auto_load_winedbg.sh hh

Those wrappers use script, so both the commands you type and the debugger output are captured.

winedbg under /opt/wine-stable also supports command files directly:

tools/run_rt3_winedbg.sh

and:

tools/run_hook_auto_load_winedbg.sh hh

Override either default if needed:

RRT_WINEDBG_LOG=/tmp/rt3-manual-load-winedbg.log tools/run_rt3_winedbg.sh

Ready-made debugger command files are also provided:

The default auto-load debugger run is now crash-first. It does not set RT3 owner breakpoints. Instead, it:

  • continues immediately
  • lets winedbg stop on the first exception
  • dumps registers
  • dumps the top four stack dwords
  • prints a backtrace

Use that default when the hook is already known to stage and return from shell_transition_mode, and the current question is the downstream crash site.

If you specifically want the earlier owner-chain compare flow, override the command file:

RRT_WINEDBG_CMD_FILE=/home/jan/projects/rrt/tools/winedbg_auto_load_compare.cmd \
tools/run_hook_auto_load_winedbg.sh hh

Or use the shorter wrapper:

tools/run_hook_auto_load_winedbg_compare.sh hh

If you do not use RRT_WINEDBG_CMD_FILE, you can still open those files and paste their contents into the debugger manually.

Both scripts rebuild rrt-hook, copy dinput8.dll into the Wine RT3 directory, and launch RT3 under winedbg.

Successful Manual Load

  1. Launch:
tools/run_rt3_winedbg.sh
  2. The default command file now breaks on both:

    • 0x004390cb first
    • 0x00445ac0 second
  3. In RT3, load save hh manually.

  4. The command file will dump:

    • registers
    • top-of-stack dwords
    • 0x006cec74
    • 0x006cec7c
    • 0x006cec78
    • 0x006ce9b8..0x006ce9c4
    • 0x006d1270..0x006d127c
    • backtrace

Focus on:

  • whether the first hit is 0x004390cb or 0x00445ac0
  • caller address
  • ecx
  • the three stack arguments
  • 0x006cec74
  • 0x006cec7c
  • 0x006cec78
  • 0x006ce9b8..0x006ce9c4
  • 0x006d1270..

Failing Auto-Load Run

  1. Launch:
tools/run_hook_auto_load_winedbg.sh hh
  2. The default command file now scripts a fuller non-interactive capture sequence:

    • 0x00438890
    • 0x004390cb
    • 0x00445ac0
    • 0x0053fea6
  3. Let the hook run.

  4. The command file will dump the same register, stack, global, and backtrace state at the first hit.

  5. Compare that output directly against the successful manual run.

So the current auto debugger path is now mostly headless: manual typing is no longer required for the main auto-load comparison path unless we need an additional ad hoc breakpoint.

If the run still crashes and you need even earlier crash-side inspection after that, add one temporary extra breakpoint manually for:

  • 0x00517cf0

Optional Host-Side GDB Fallback

If winedbg is too clumsy for repeated crashes, attach host gdb to the crashing Wine process after RT3 starts:

pgrep -af 'wine.*RT3.exe'
gdb -p <pid>

Useful commands in gdb:

set pagination off
handle SIGSEGV stop print
continue
bt
info registers
x/16wx $esp

This is mainly for cleaner backtraces after the fault PC is already known from winedbg.