# Debug Load Workflow

Use this when comparing:

- one successful manual load of `hh`
- one failing hook-driven auto-load attempt

The goal is to compare the real successful owner path above `0x00445ac0` against the failing hook-driven path.

## Current Findings

From the current logs:

- successful manual load now has a grounded pre-call site at `0x004390cb` with:
  - `ECX = 0x02af5840`
  - `[0x006cec78] = 0x02af5840`
  - `[0x006cec74] = 0x01d81230`
  - top-of-stack dwords:
    - `arg1 = 0x01db4739`
    - `arg2 = 4`
    - `arg3 = 0x0022fb50`
    - next dword = `0x026d7b88`
- the subsequent successful `0x00445ac0` entry still has:
  - `ret = 0x004390d0`
  - `arg1 = 0x01db4739`
  - `arg2 = 4`
  - `arg3 = 0x0022fb50`
- older failing auto-load attempts never reached `0x00445ac0`
- the earlier failing breakpoint was `0x00517cf0` with:
  - `[0x006cec78] = 0`
  - `[0x006cec74] = 0x01d81230`
- the staged request globals at `0x006ce9b8..0x006ce9c4` and `0x006d1270..0x006d127c` are zero on the successful manual path

That older `0x00517cf0` result is no longer the current blocker. The hook now reaches the real coordinator entry, so the remaining gap is later shell timing or re-entrancy, not request-latch shape.

The disassembly at `0x004390b0..0x004390cb` is now the strongest grounded manual-load branch:

- it writes `[0x006cec74+0x6c] = 1`
- it computes `arg1` from `([0x006cec7c] + 0x11)`
- it pushes `arg2 = 4`
- it passes `arg3 = &out_success`
- and then calls `0x00445ac0`

So any hook experiment that does not reproduce that exact shape is no longer a plausible match for the successful manual path.

## Latest Auto-Load Comparison

The newest hook-driven debugger run now reaches `0x00445ac0` directly.

At the auto-load `0x00445ac0` breakpoint:

- stack:
  - `ret = 0x7650505c` inside `dinput8`
  - `arg1 = 0x01db4739`
  - `arg2 = 4`
  - `arg3 = 0x0022fcf8`
- globals:
  - `[0x006cec74] = 0x01d81230`
  - `[0x006cec7c] = 0x01db4728`
  - `[0x006cec78] = 0x026d7b88`

Compared to the successful manual path:

- `arg1` matches exactly: `0x01db4739`
- `arg2` matches exactly: `4`
- `[0x006cec74]` matches exactly: `0x01d81230`
- `[0x006cec7c]` still matches the same runtime-profile base used to derive `arg1`
- `[0x006cec78]` is now non-null and published before entry

So the hook is no longer missing the coordinator entry shape. The remaining question is no longer "can we reach `0x00445ac0`?" but "does the live non-debugger call return successfully and trigger the actual restore transition?"

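The comparison above can be mechanized as a field-by-field diff. The following is a minimal Python sketch, not part of `rrt-hook` or the repo tools: the dictionary keys and the `diff_snapshots` helper are hypothetical names, and the values are copied from the captures recorded in this document (the manual globals come from the `0x004390cb` pre-call site). It also rechecks that `arg1` equals `[0x006cec7c] + 0x11` on the auto-load capture.

```python
# Values copied from the captures recorded in this document.
# Key names are illustrative; they are not symbols from RT3 or rrt-hook.
MANUAL = {
    "ret": 0x004390D0,
    "arg1": 0x01DB4739,
    "arg2": 4,
    "arg3": 0x0022FB50,
    "[0x006cec74]": 0x01D81230,
    "[0x006cec78]": 0x02AF5840,  # read at the 0x004390cb pre-call site
}
AUTO = {
    "ret": 0x7650505C,  # inside dinput8
    "arg1": 0x01DB4739,
    "arg2": 4,
    "arg3": 0x0022FCF8,
    "[0x006cec74]": 0x01D81230,
    "[0x006cec78]": 0x026D7B88,
}
AUTO_PROFILE_BASE = 0x01DB4728  # [0x006cec7c] on the auto-load run


def diff_snapshots(a, b):
    """Return {field: (a_value, b_value)} for fields whose values differ."""
    return {k: (a[k], b[k]) for k in a if a[k] != b.get(k)}


if __name__ == "__main__":
    for field, (manual, auto) in diff_snapshots(MANUAL, AUTO).items():
        print(f"{field}: manual={manual:#010x} auto={auto:#010x}")
    # arg1 is derived from the runtime-profile base plus 0x11.
    print(f"derived arg1: {AUTO_PROFILE_BASE + 0x11:#010x}")
```

The fields that differ are `ret`, `arg3`, and `[0x006cec78]`. Since `arg3` is the address of a stack local, a different value there is plausibly harmless; the values alone do not show whether the other two differences matter.
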
## Latest Plain-Run Narrowing

The current non-debugger auto-load path no longer looks like the original shell-side crash at
`0x0053fea6`.

The hook-side state machine is now stable up to the handoff into `shell_transition_mode`:

- `rrt-hook: auto load shell transition entering`
- `rrt-hook: auto load shell unpublish entering`
- `rrt-hook: auto load shell unpublish entry this=0x029b3a08 object=0x026d7b88`

So the old hook-side gating and bad-call-shape problems are no longer the blocker.

The current runtime probes now push the remaining stall much later than the original old-mode
teardown inside `shell_transition_mode`:

- `shell_transition_mode` enters
- old shell-window unpublish at `0x005389c0` enters with:
  - shell bundle `this = 0x029b3a08`
  - old object `object = 0x026d7b88`
- the inner wrapper `0x005400c0(object)` returns
- the full `0x53fe00 -> 0x53f860` remove-node sweep over `[object+0x74]` returns and clears
  `[object+0x70/+0x74]`
- `shell_unpublish` itself then returns cleanly
- the nearby mode-`2` teardown helper `0x00502720` returns
- `shell_load_screen_window_construct` `0x004ea620` returns
- the immediate shell publish through `0x00538e50` returns
- `shell_transition_mode` itself returns cleanly

At the same time, one later load-side probe still does **not** fire:

- no `shell_active_mode_run_profile_startup_and_load_dispatch` `0x00438890` entry

So the current live stall is now best read as:

- after the old-object unpublish path at `0x005389c0`
- after the inner `0x5400c0 -> 0x53fe00 -> 0x53f860` teardown sweep
- after the nearby mode-`2` teardown helper `0x00502720`
- after the mode-`4` `LoadScreen.win` constructor and immediate shell publish
- but still before any trusted runtime evidence that `0x00438890` has entered

The richer plain-run snapshots now tighten the old-object state too:

- the old object is still the expected `Setup.win` instance with vtable `0x005d1664`
- the shell bundle head and tail both point to that same object
- `[object+0x54]` and `[object+0x58]` are both null, so the outer unlink state is consistent
- `[object+0x74]` is non-null and the first two linked nodes recovered from `+0x8a` also look
  structurally sane:
  - first node `0x02a74470`: vtable `0x005dd870`, type `0xea72`, owner-ish field `0x02a067b8`,
    next `0x02a04b38`
  - second node `0x02a04b38`: vtable `0x005dd870`, type `0xea71`, owner-ish field `0x02a067b8`,
    next `0x02a03e38`

So the remaining leading hypothesis is no longer "the list head is already garbage." The later
shared node vcall target `0x540910` is healthy in general and does not fire on the failing
transition path. The newer direct probes narrow it even further: the failing transition still does
not reach `0x53fe00` or `0x53f860`. That pushes the current boundary into the tiny wrapper layer
between `shell_unpublish` entry and the `0x53fe00` call, with `0x5400c0(object)` now the next
useful direct probe.

The latest plain Wine log also ends with a matching crash:

- `wine: Unhandled page fault on read access to 02E11000 at address 02E11000`

Static disassembly sharpened the remaining boundary one step further, but the newer jump-table
decode changes the interpretation materially. The startup-runtime slice

- `0x004ea710`
- `0x0053b070(0x46c40)`
- `0x004336d0`
- `0x00438890`

is not owned by mode `4`. It is owned by jump-table entry `1` at `0x483012`. Jump-table entry `4`
lands at `0x4832e5` instead and only constructs and publishes a plain `LoadScreen.win` object
through `0x004ea620` and `0x00538e50`.

So the next useful probe is no longer the mode-`4` branch’s pre-dispatch runtime-object helper,
because mode `4` does not own that startup-runtime path at all. The next useful test is the real
startup-dispatch entrypoint: `shell_transition_mode(1, 0)`.

The latest plain runs tightened that correction one more step:

- the direct `0x004336d0` runtime-reset probe still does **not** fire
- the direct `0x00438890` startup-dispatch probe still does **not** fire
- but `shell_transition_mode`, `LoadScreen.win` construction, and the immediate shell publish all
  still return cleanly

That no longer means the post-construct startup slice is mysteriously skipped inside mode `4`.
Instead, it matches the corrected static decode exactly: the hook has been entering the plain
load-screen branch rather than the startup-runtime branch.

The next best runtime target is therefore no longer another allocator cut under mode `4`. It is a
direct test of `shell_transition_mode(1, 0)`, which is the jump-table arm that statically owns the
startup-runtime allocation and `0x00438890` dispatch.

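The corrected decode can be kept as a small lookup so that a planned probe is checked against the arm that statically owns its target. This is a minimal sketch that records only the two arms and the callees listed above; the `ARMS` layout and the `modes_reaching` helper are hypothetical names, not repo code.

```python
# Jump-table arms of shell_transition_mode as listed in the corrected static
# decode. Only modes 1 and 4 are recorded; the layout is illustrative.
ARMS = {
    1: {
        "target": 0x00483012,
        "calls": (0x004EA710, 0x0053B070, 0x004336D0, 0x00438890),
    },
    4: {
        "target": 0x004832E5,
        "calls": (0x004EA620, 0x00538E50),
    },
}

STARTUP_DISPATCH = 0x00438890


def modes_reaching(callee):
    """List the recorded modes whose arm is noted as calling `callee`."""
    return [mode for mode, arm in ARMS.items() if callee in arm["calls"]]


if __name__ == "__main__":
    for callee in (STARTUP_DISPATCH, 0x004EA620):
        print(f"{callee:#010x}: modes {modes_reaching(callee)}")
```

A probe on `0x00438890` only makes sense for a transition whose mode appears in `modes_reaching(STARTUP_DISPATCH)`, which in this table is mode `1` alone.
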
## Current Pause Point

Current recorded stop point:

- the old hook-side crash and teardown corruption are resolved
- the static jump-table decode at `0x48342c` shows the hook had been entering the wrong arm
- `shell_transition_mode(4, 0)` is only the plain `LoadScreen.win` branch
- `shell_transition_mode(1, 0)` is the startup-dispatch branch that owns:
  - `0x004ea710`
  - `0x0053b070(0x46c40)`
  - `0x004336d0`
  - `0x00438890`

So the next live experiment, when this work resumes, should start from the corrected mode-`1`
transition path rather than adding more probes under mode `4`.

Two corrective notes from the allocator probe passes:

- the first allocator experiment at `0x005a125d` was not trustworthy, because that shared cdecl
  body sits behind the `0x0053b070` thunk and the initial hook used the wrong entry shape and
  split its first internal `call`
- the first direct thunk hook on `0x0053b070` was also not trustworthy as implemented, because a
  copied relative-`jmp` thunk cannot be replayed through an ordinary trampoline

The next trustworthy allocator boundary is still the exact mode-`4`-branch thunk at `0x0053b070`,
but only with a detour that calls the original target `0x005a125d` directly instead of executing
the copied thunk bytes.

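The reason copied thunk bytes misbehave is that a relative `jmp` encodes its target as a displacement from its own address. The sketch below illustrates this under two labeled assumptions: that the thunk is a 5-byte `E9 rel32` jump (the notes only say "relative-`jmp`"), and that the copied bytes land at a made-up trampoline address. Neither the helpers nor `TRAMPOLINE` come from the hook source.

```python
import struct

THUNK = 0x0053B070       # thunk address from the notes
BODY = 0x005A125D        # shared cdecl body behind the thunk
TRAMPOLINE = 0x10001000  # hypothetical address of the copied bytes


def encode_jmp_rel32(site, target):
    """Encode `jmp target` as E9 rel32 for an instruction placed at `site`."""
    rel = (target - (site + 5)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", rel)


def decode_jmp_rel32(site, code):
    """Return the absolute target of an E9 rel32 jmp located at `site`."""
    if code[0] != 0xE9:
        raise ValueError("not an E9 rel32 jmp")
    (rel,) = struct.unpack("<i", code[1:5])
    return (site + 5 + rel) & 0xFFFFFFFF


# Assumed encoding of the thunk, rebuilt from the two addresses in the notes.
thunk_bytes = encode_jmp_rel32(THUNK, BODY)

if __name__ == "__main__":
    print(f"in place: {decode_jmp_rel32(THUNK, thunk_bytes):#010x}")
    print(f"copied:   {decode_jmp_rel32(TRAMPOLINE, thunk_bytes):#010x}")
```

Decoded in place, the bytes resolve to `0x005a125d`. Decoded at the copy's address, the same bytes resolve to a target shifted by `TRAMPOLINE - THUNK`, which is why calling `0x005a125d` directly from the detour sidesteps the problem.
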
The latest filtered run exposed a more basic gating issue too: the log only reached one
`gate mask 0x7` line with `mode_id = 2`, and it never advanced into `ready gate passed`, staging,
or transition. So that run did not actually exercise the load-screen startup subchain; it mostly
recorded ordinary shell-node activity plus one late ready-state observation. The old default gate
of `30` ready polls plus `5` deferred polls was therefore too conservative for this workflow. The
next run now lowers those defaults to `1` and `0`, and adds an explicit ready-count log so the
trace should either stage immediately or show exactly how far the gate gets.

That gate adjustment worked on the next run: the hook now reaches `ready count`, stages selector
`3`, enters `shell_transition_mode`, returns from the `LoadScreen.win` construct and publish
helpers, and reports success again. But the allocator side is still unresolved:

- there is still no trusted `0x46c40` allocator hit from `0x0053b070`
- there is still no direct `0x004336d0` runtime-reset entry
- there is still no direct `0x00438890` startup-dispatch entry

So the next clean post-publish boundary is the tiny scalar setter at `0x004ea710`, which is the
last straightforward callsite in the static mode-`4` branch immediately before the `0x0053b070`
allocation.

The immediate next runtime check is even more concrete than that helper hook, though: inspect the
state that `0x004ea710` should leave behind. Right after `shell_transition_mode` returns, the hook
now logs:

- `0x006d10b0` (`LoadScreen.win` singleton)
- `[LoadScreen.win+0x78]`
- `0x006cec78`
- `[0x006cec74+0x0c]`
- `[0x006cec7c+0x01]`

If `0x004ea710` really ran on the mode-`4` branch, `[LoadScreen.win+0x78]` should no longer be
zero after transition return.

The latest run answered that question directly:

- `shell_transition_mode` still returns cleanly
- `field_active_mode_object` is still the `LoadScreen.win` singleton
- `0x006cec78` is still null
- `[LoadScreen.win+0x78]` is still `0`
- startup selector remains `3`

So the strongest current read is no longer “the helper hooks might be missing a straight-line call.”
At transition return, RT3 still looks like it is parked in the plain `LoadScreen.win` state rather
than having entered the separate runtime-object path at all. The next useful runtime cut is
therefore not deeper inside `shell_transition_mode`, but on the later active-mode service cadence:
does a subsequent service tick on the `LoadScreen.win` object populate `[+0x78]` or promote
`0x006cec78` into the startup-dispatch object on a later frame?

The next run now logs the first few shell-state service ticks after auto-load is attempted with the
same state tuple:

- `0x006cec78`
- `[0x006cec74+0x0c]`
- `0x006d10b0`
- `[LoadScreen.win+0x78]`
- startup selector

So the next question is very narrow: does that tuple stay frozen in the plain `LoadScreen.win`
shape, or does one later service tick finally promote it into the startup-runtime object path?

The latest service-tick run makes that boundary stronger still:

- the first later shell-state service ticks `count=2..8` all keep the same frozen state
- `0x006cec78` stays `0`
- `[shell_state+0x0c]` stays the `LoadScreen.win` singleton
- `[LoadScreen.win+0x78]` stays `0`

So the active-mode service pass itself is not promoting the plain load screen into the startup
runtime object during those first later frames. The next best runtime boundary is now the
`LoadScreen.win` message owner `0x004e3a80`, because that is the remaining live owner most likely
to receive the trigger that seeds page id `[this+0x78]`, allocates the `0x46c40` startup runtime,
and later publishes `0x006cec78`.

One later run did not reach that boundary at all:

- the new `0x004e3a80` hook installed successfully
- but there were no `ready count`, staging, transition, post-transition, or load-screen-message
  lines anywhere in the log
- the trace only showed ordinary shell node-vcall traffic before the window was closed

So that run is best treated as "auto-load path not exercised", not as evidence that the
`LoadScreen.win` message owner stayed silent after a successful transition. The next useful runtime
check is therefore one step earlier again: add a small first-few-calls trace on
`shell_state_service_active_mode_frame` itself so we can confirm whether that detour is firing on
the run at all and what mode id and gate mask it sees before the auto-load gate would stage.

That newer service-entry trace now confirms the full cadence:

- the service detour is firing
- the gate does stage and transition on counts `1 -> 2`
- the transition returns cleanly
- later service ticks run with `mode_id = 4`

At the same time, the next two probes are now bounded as negative results on that successful path:

- the `LoadScreen.win` message hook at `0x004e3a80` stayed completely silent
- the plain post-transition state still stays frozen with:
  - `0x006cec78 = 0`
  - `field_active_mode_object = LoadScreen.win`
  - `[LoadScreen.win+0x78] = 0`

So the next best boundary is no longer the message owner itself. It is the shell-runtime prime call
at `0x00538b60`, because `0x00482160` still takes that branch on the null-`0x006cec78` service
path before the later frame-cycle owner `0x00520620`.

The first `0x00538b60` probe run is not trustworthy yet, though:

- the hook installed
- but the log stopped immediately after the first
  `shell-state service entry count=1 ... gate_mask=0x7 mode_id=2 ...`
- there were no ready-count lines, no transition lines, and no runtime-prime entry lines

So that result currently reads as "the new runtime-prime instrumentation likely interrupted the
first service pass" rather than as a real RT3 boundary shift. The next corrective step is to log
the matching shell-state service return and to trace the first few `0x00538b60` calls even before
`AUTO_LOAD_ATTEMPTED` becomes true. That will tell us whether the first service pass actually
returns and whether the runtime-prime hook is firing at all.

The static branch under `0x00482160` also adds one more caution: `0x00538b60` is conditional, not
unconditional. The service pass only enters it when the shell runtime at `0x006d401c` is live and
`[shell_state+0xa0] == 0`. So a silent `0x00538b60` probe does not yet prove the shell is frozen
before the runtime-prime call; it may simply mean the `+0xa0` gate stayed nonzero on that service
tick. The next service-entry logs therefore need to include `[shell_state+0xa0]` before we treat
runtime-prime silence as meaningful.

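The caution above amounts to a small decision rule for reading a silent probe. The sketch below encodes it in Python under one labeled assumption: "live" is taken to mean that the pointer stored at `0x006d401c` is non-null, which the notes do not state explicitly. The function names are hypothetical.

```python
def runtime_prime_eligible(shell_runtime_ptr, gate_a0):
    """Mirror the static condition guarding the 0x00538b60 call.

    shell_runtime_ptr: value read from 0x006d401c (assumed: non-null == live)
    gate_a0:           value read from [shell_state+0xa0]
    """
    return shell_runtime_ptr != 0 and gate_a0 == 0


def interpret_probe(shell_runtime_ptr, gate_a0, probe_fired):
    """Classify one service tick's runtime-prime probe observation."""
    if probe_fired:
        return "entered"
    if not runtime_prime_eligible(shell_runtime_ptr, gate_a0):
        return "silent, not eligible: uninformative"
    return "silent while eligible: meaningful"


if __name__ == "__main__":
    # Pointer values here are placeholders, not captured data.
    print(interpret_probe(0x1, 1, False))
    print(interpret_probe(0x1, 0, False))
    print(interpret_probe(0x1, 0, True))
```

Only the "silent while eligible" case would support reading probe silence as a real boundary, which is why the service-entry log needs the `+0xa0` value alongside the probe result.
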
The newer run closes that conditional question:

- `[shell_state+0xa0]` is `0` on the first traced service call
- `0x00538b60` is therefore eligible
- the runtime-prime probe now shows it entering and returning cleanly on that same service tick

The later run closes the next owner too:

- `0x00520620` `shell_service_frame_cycle` also enters and returns cleanly on the same frozen
  mode-`4` path
- the logged state matches the generic frame-service branch:
  - `[+0x1c] = 0`
  - `[+0x28] = 0`
  - `flag_56 = 0`
  - `[+0x58]` is pulsed and then cleared back to `0`
- `0x006cec78` stays `0`

The newer run closes that owner too:

- `0x0053fda0` enters and returns cleanly on the frozen mode-`4` path
- it is actively servicing the `LoadScreen.win` object itself
- the serviced object keeps `field_1d = 1`, `field_5c = 1`, and a stable child list
- the first child vcall target at `+0x18` stays `0x005595d0`
- `0x006cec78` still stays `0`

So the next live boundary is now the child-service target itself at `0x005595d0`, not the higher
object walker.

The child-service run narrows that again. The first sixteen `0x005595d0` calls under the serviced
`LoadScreen.win` object are stable, presentation-heavy child lanes:

- every child points back to the same parent through `[child+0x86] = LoadScreen.win`
- the early children have `flag_68 = 0x03`, `flag_6a = 0x03`, and return `4`
- the later siblings have `flag_68 = 0x00`, `flag_6a = 0x03`, and return `0`
- `field_b0` stays `0`
- `0x006cec78` still stays `0`

Static disassembly matches that read: `0x005595d0` is gated by `0x00558670` and then spends most
of its body in draw or overlay helpers like `0x54f710`, `0x54f9f0`, `0x54fdd0`, `0x53de00`, and
`0x552560`. So this is a presentation-side child service path, not the missing startup-runtime
promotion.

That moved the next useful runtime target back to the transition-time allocator lane, but the
later jump-table decode changes what that means. The widened `0x0053b070` window below is now
best read as evidence for the plain mode-`4` `LoadScreen.win` arm, not as evidence for the
startup-runtime arm.

The next widened allocator run immediately paid off, but in a narrower way than expected:

- the first traced transition-window allocation is `0x7c`, which matches the static pre-construct
  `0x48302a -> 0x53b070` call exactly
- the following `0x111`, `0x84`, `0x3a`, and repeated `0x25` allocations all happen before
  `LoadScreen.win` construct returns, so they now read as constructor-side child or control setup
- that means the allocator probe was not disproving the `0x46c40` startup-runtime slice yet; it
  was simply exhausting its 16-entry log budget inside the constructor before the later
  post-construct block

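The budget problem in the last bullet is easy to reproduce in isolation. The following sketch is a synthetic model, not the hook's logger: `BudgetedProbe`, `simulate`, the count of repeated `0x25` allocations, and the presence of a post-construct `0x46c40` hit are all assumptions chosen to show the mechanism. It demonstrates how a fixed log budget can hide a later allocation unless the budget is reset at a phase boundary.

```python
class BudgetedProbe:
    """Log at most `budget` hits per phase; count the rest as dropped."""

    def __init__(self, budget=16):
        self.budget = budget
        self.remaining = budget
        self.lines = []
        self.dropped = 0

    def hit(self, size):
        if self.remaining > 0:
            self.remaining -= 1
            self.lines.append(f"alloc size={size:#x}")
        else:
            self.dropped += 1

    def reset(self, marker):
        """Refill the budget and record the phase boundary in the log."""
        self.remaining = self.budget
        self.lines.append(f"-- {marker} --")


def simulate(reset_after_construct):
    probe = BudgetedProbe(budget=16)
    # Constructor-side sizes from the notes; the repeat count is made up.
    for size in [0x7C, 0x111, 0x84, 0x3A] + [0x25] * 14:
        probe.hit(size)
    if reset_after_construct:
        probe.reset("construct returned")
    probe.hit(0x46C40)  # synthetic post-construct allocation
    return probe.lines


if __name__ == "__main__":
    print("alloc size=0x46c40" in simulate(False))
    print("alloc size=0x46c40" in simulate(True))
```

Without the reset, the synthetic late allocation is dropped silently; with it, the log shows both the phase marker and any hit that follows. An absence of hits after the marker is then a real negative result rather than an exhausted budget.
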
The corrected follow-up run with that reset is now the decisive one: after `LoadScreen.win`
construct returns, there are still no further allocator hits before publish and transition return.
That matches the corrected jump-table decode cleanly, because mode `4` does not own the
`0x46c40 -> 0x4336d0 -> 0x438890` path at all.

The first corrected thunk run also showed one practical problem: the probe became too noisy to be
useful as a boundary marker, because `0x0053b070` is used widely outside the load-screen path.
That still mattered, because it showed the hook-driven transition was taking the same `0x7c`
constructor-side allocation as the plain mode-`4` branch rather than the startup-runtime
allocation size `0x46c40`.

## Manual Owner Tail

The branch at `0x004390b0..0x004390ea` now has a grounded post-call tail too:

- `0x004390cb` calls `0x00445ac0`
- `0x004390d0` immediately calls `0x004834e0(0, 1)` on `0x006cec74`
- if `out_success != 0` or `esi != 0`, `0x004390ea` calls `0x004384d0`
- then `0x004390ef` calls `0x0053f310` on `0x00ccbb20`
- then `0x00439104` calls `0x004834e0(0, 1)` again

The successful manual breakpoint at `0x004390cb` shows `ESI = 0` and `EDI = 1`, so the manual load branch only forces the `0x004384d0` post-load pipeline when `out_success` comes back nonzero.

That makes the current hook gap narrower still: even with the correct `0x00445ac0` arguments, returning directly into `dinput8` skips RT3's own owner-tail work unless we mirror it ourselves.

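The tail described above can be written out as an ordered call trace, which is the sequence a hook would need to mirror after its own `0x00445ac0` call. This is a sketch of the control flow as listed in this section only; the `owner_tail` function and its string format are hypothetical, and it models no side effects of the callees.

```python
def owner_tail(out_success, esi=0):
    """Return the call sites the manual branch executes, in order.

    out_success: value written through arg3 by 0x00445ac0
    esi:         register value at the branch (0 on the recorded manual run)
    """
    trace = [
        "0x004390cb: call 0x00445ac0",
        "0x004390d0: call 0x004834e0(0, 1)",
    ]
    # The post-load pipeline runs only when either condition is nonzero.
    if out_success != 0 or esi != 0:
        trace.append("0x004390ea: call 0x004384d0")
    trace.append("0x004390ef: call 0x0053f310")
    trace.append("0x00439104: call 0x004834e0(0, 1)")
    return trace


if __name__ == "__main__":
    for line in owner_tail(out_success=1):
        print(line)
```

With `esi = 0`, as on the recorded manual run, `owner_tail(0)` omits the `0x004384d0` step and `owner_tail(1)` includes it, which matches the reading that the post-load pipeline depends on `out_success`.
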
## Owner Xrefs Above `0x438890`

The containing owner at `0x00438890` is now grounded as a larger `thiscall` shell owner with two stack arguments. Current xrefs found in local disassembly are:

- `0x00443b57`
- `0x00446d7f`
- `0x0046b8bc`
- `0x004830ca`

The strongest caller so far is `0x004830ca`:

- it publishes `0x006cec78 = eax`
- then calls `0x00438890` as `thiscall(active_mode, 1, 0)`
- it sits inside `shell_transition_mode`
- it is the branch that constructs `LoadScreen.win` through `0x004ea620`
- and it continues through shell-window follow-up on `0x006d401c` after the `0x00438890` call

The surrounding mode map is tighter now too:

- mode `1` = `Game.win`
- mode `2` = `Setup.win`
- mode `3` = `Video.win`
- mode `4` = `LoadScreen.win`
- mode `5` = `Multiplayer.win`
- mode `6` = `Credits.win`
- mode `7` = `Campaign.win`

That makes `0x00438890(active_mode, 1, 0)` the strongest current RT3-native entry candidate for reproducing the successful manual load branch, because it owns the internal dispatch that later reaches `0x004390cb`.

The containing shell-mode switcher ABI is tighter now too:

- `0x00482ec0` is not a one-arg mode switch
- it is a `thiscall` with two stack arguments
- the grounded world-entry load-screen call shape at `0x443adf..0x443ae3` is `(4, 0)`
- the function confirms that shape itself by reading the requested mode from `[esp+0x0c]` and
  returning with `ret 8`
- the second stack argument is now best read as an old-active-mode teardown flag, because the
  `0x482fc6..0x482fff` branch only runs when it is nonzero and then releases the old active-mode
  object through `0x00434300`, `0x00433730`, `0x0053b080`, and finally clears `0x006cec78`

Current static xrefs also tighten the broader ownership split:

- `0x00443b57` calls `0x00438890` from the world-entry side, but with `(0, 0)` after dismissing the current shell detail panel and servicing `0x4834e0(0, 0)`
- `0x00446d7f` calls it from the saved-runtime restore side with the same `(0, 0)` shape before immediately building `.smp` bundle payloads through `0x530c80/0x531150/0x531360`
- `0x0046b8bc` calls it from the multiplayer preview family before a later `0x00445ac0` call
- `0x004830ca` calls it from the shell-side active-mode branch with the clearest `(1, 0)` setup

So the function is no longer just a guessed hook target. It is now a real shared owner above world-entry, saved-runtime restore, multiplayer preview, and shell-side active-mode startup branches.

The internal selector split inside `0x00438890` is tighter now too:

- `[0x006cec7c+0x01]` is a startup-profile selector, not the shell mode id
- selector values `1` and `7` share the tutorial lane at `0x00438f67`, which writes
  `[0x006cec74+0x6c] = 2` and loads `Tutorial_2.gmp` or `Tutorial_1.gmp`
- selector `2` is a world-root initialization lane at `0x00438fbe` that allocates `0x0062c120`
  when needed, runs `0x0044faf0`, and then forces the selector to `3`
- selector `4` is a setup-side world reset or regeneration lane at `0x00439038` that rebuilds
  `0x0062c120` from setup globals `0x006d14cc/0x006d14d0`, then runs `0x00535100` and `0x0040b830`
- selector values `3`, `5`, and `6` collapse into the same profile-seeded file-load lane at
  `0x004390b0..0x004390ea`
- selector `6` is the one variant that explicitly writes `[0x006cec74+0x6c] = 1` before the
  shared file-load call

Current grounded writers now tighten those values too:

- `Campaign.win` writes selector `6` at `0x004b8a2f`
- `Multiplayer.win` writes selector `3` on one pending-status branch at `0x004f041e`
- the larger `Setup.win` dispatcher around `0x005033d0..0x00503b7b` writes selectors `2`, `3`, `4`,
  and `5` on several validated launch branches
- so the shared file-load lane is now best read as one reused profile-file startup family rather
  than one owner-specific manual-load path

That means the successful manual-load branch is not the whole function. It is one three-selector
subfamily inside a broader startup dispatcher that also owns tutorial and fresh-world setup lanes.

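The selector split can be summarized as a lookup table, which makes it easy to check which staged selector values should land in the shared file-load lane. This is a sketch built only from the lane addresses listed above; the lane labels, the `SELECTOR_LANES` table, and the helper names are hypothetical.

```python
# Startup-profile selector ([0x006cec7c+0x01]) -> (lane label, lane address).
# Labels are shorthand for the lanes described in the notes.
SELECTOR_LANES = {
    1: ("tutorial", 0x00438F67),
    7: ("tutorial", 0x00438F67),
    2: ("world-root init", 0x00438FBE),  # later forces the selector to 3
    4: ("setup-side world reset", 0x00439038),
    3: ("file load", 0x004390B0),
    5: ("file load", 0x004390B0),
    6: ("file load", 0x004390B0),  # also writes [0x006cec74+0x6c] = 1
}


def lane_for(selector):
    """Return (label, address) for a selector, or None if unrecorded."""
    return SELECTOR_LANES.get(selector)


def reaches_file_load(selector):
    """True when the selector dispatches directly to the file-load lane."""
    lane = lane_for(selector)
    return lane is not None and lane[0] == "file load"


if __name__ == "__main__":
    direct = [s for s in sorted(SELECTOR_LANES) if reaches_file_load(s)]
    print("direct file-load selectors:", direct)
```

The table treats selector `2` as its own lane. The notes say that lane forces the selector to `3`, but they do not say whether control then continues into the file-load lane, so the sketch does not assume it.
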
The multiplayer preview side is also tighter now:

- `0x0046b8bc` publishes `0x006cec78`
- calls `0x00438890` as `thiscall(active_mode, 0, 0)`
- clears `[0x006cec74+0x6c]`
- and only then calls `0x00445ac0(0x006ce630, [0x006ce9c0], 0)`

That makes the preview relaunch path clearly different from the manual load branch, not just a differently staged copy of it.

## Latest Headless Debugger Result

The scripted auto-load debugger run is now useful without manual interaction:

- all breakpoints were set successfully:
  - `0x00438890`
  - `0x004390cb`
  - `0x00445ac0`
- older runs that also broke on `0x0053fea6` stopped too early on that shell-side crash site
- the default scripted compare flow now keeps only the owner-chain breakpoints above the real load lane

So the current non-interactive path is still good enough to gather repeatable crash-side state, but
on this display setup the owner-chain compare flow is also vulnerable to early X11 death:

- `XF86VidModeClientNotLocal`
- process termination before the RT3 owner breakpoints fire

That means the current plain-run hook probes are more reliable than `winedbg` for narrowing the
live stall inside `shell_transition_mode`.

The latest static pivot also means the next reverse-engineering step does not require a live run:

- compare the mode-`4` `LoadScreen.win` owner path at `0x004830ca` against the world-entry and
  saved-runtime callers of `0x00438890`
- compare how the `(1, 0)` `LoadScreen.win` lane diverges from the `(0, 0)` world-entry and
  saved-runtime lanes before control reaches the shared `0x004390b0` manual-load branch
- only then return to hook experiments

## Launchers

Manual debugger run:

```bash
tools/run_rt3_winedbg.sh
```

Auto-load debugger run:

```bash
tools/run_hook_auto_load_winedbg.sh hh
```

Both scripts use `/opt/wine-stable/bin/winedbg` explicitly, so they do not depend on `winedbg` being on `PATH`.
They also default to:

- their matching command file in [tools/](/home/jan/projects/rrt/tools)
- a logfile in the repo root:
  - [rt3_manual_load_winedbg.log](/home/jan/projects/rrt/rt3_manual_load_winedbg.log)
  - [rt3_auto_load_winedbg.log](/home/jan/projects/rrt/rt3_auto_load_winedbg.log)

To save the full interactive debugger session to a file, set `RRT_WINEDBG_LOG`:

```bash
RRT_WINEDBG_LOG=/tmp/rt3-manual-load-winedbg.log tools/run_rt3_winedbg.sh
```

or:

```bash
RRT_WINEDBG_LOG=/tmp/rt3-auto-load-winedbg.log tools/run_hook_auto_load_winedbg.sh hh
```

Those wrappers use `script`, so both the commands you type and the debugger output are captured.

`winedbg` under `/opt/wine-stable` also supports command files directly:

```bash
tools/run_rt3_winedbg.sh
```

and:

```bash
tools/run_hook_auto_load_winedbg.sh hh
```

Override either default if needed:

```bash
RRT_WINEDBG_LOG=/tmp/rt3-manual-load-winedbg.log tools/run_rt3_winedbg.sh
```

Ready-made debugger command files are also provided:

- [winedbg_manual_load_445ac0.cmd](/home/jan/projects/rrt/tools/winedbg_manual_load_445ac0.cmd)
- [winedbg_auto_load_crash.cmd](/home/jan/projects/rrt/tools/winedbg_auto_load_crash.cmd)
- [winedbg_auto_load_compare.cmd](/home/jan/projects/rrt/tools/winedbg_auto_load_compare.cmd)

The default auto-load debugger run is now crash-first. It does not set RT3 owner breakpoints.
Instead, it:

- continues immediately
- lets `winedbg` stop on the first exception
- dumps registers
- dumps the top four stack dwords
- prints a backtrace

Use that default when the hook is already known to stage and return from `shell_transition_mode`,
and the current question is the downstream crash site.

If you specifically want the earlier owner-chain compare flow, override the command file:

```bash
RRT_WINEDBG_CMD_FILE=/home/jan/projects/rrt/tools/winedbg_auto_load_compare.cmd \
tools/run_hook_auto_load_winedbg.sh hh
```

Or use the shorter wrapper:

```bash
tools/run_hook_auto_load_winedbg_compare.sh hh
```

If you do not use `RRT_WINEDBG_CMD_FILE`, you can still open those files and paste their contents into the debugger manually.

Both scripts rebuild `rrt-hook`, copy `dinput8.dll` into the Wine RT3 directory, and launch RT3 under `winedbg`.

## Successful Manual Load

1. Launch:

   ```bash
   tools/run_rt3_winedbg.sh
   ```

2. The default command file now breaks on both:
   - `0x004390cb` first
   - `0x00445ac0` second

3. In RT3, load save `hh` manually.

4. The command file will dump:
   - registers
   - top-of-stack dwords
   - `0x006cec74`
   - `0x006cec7c`
   - `0x006cec78`
   - `0x006ce9b8..0x006ce9c4`
   - `0x006d1270..0x006d127c`
   - backtrace

Focus on:

- whether the first hit is `0x004390cb` or `0x00445ac0`
- caller address
- `ecx`
- the three stack arguments
- `0x006cec74`
- `0x006cec7c`
- `0x006cec78`
- `0x006ce9b8..0x006ce9c4`
- `0x006d1270..`

## Failing Auto-Load Run

1. Launch:

   ```bash
   tools/run_hook_auto_load_winedbg.sh hh
   ```

2. The default command file now scripts a fuller non-interactive capture sequence:
   - `0x00438890`
   - `0x004390cb`
   - `0x00445ac0`
   - `0x0053fea6`

3. Let the hook run.

4. The command file will dump the same register, stack, global, and backtrace state at the first hit.

5. Compare that output directly against the successful manual run.

So the current auto debugger path is now mostly headless:

- launch `tools/run_hook_auto_load_winedbg.sh hh`
- let the scripted breakpoints run
- inspect [rt3_auto_load_winedbg.log](/home/jan/projects/rrt/rt3_auto_load_winedbg.log)

Manual typing is no longer required for the main auto-load comparison path unless we need an additional ad hoc breakpoint.

If the run still crashes and you need even earlier crash-side inspection after that, add one temporary extra breakpoint manually for:

- `0x00517cf0`

## Optional Host-Side GDB Fallback

If `winedbg` is too clumsy for repeated crashes, attach host `gdb` to the crashing Wine process after RT3 starts:

```bash
pgrep -af 'wine.*RT3.exe'
gdb -p <pid>
```

Useful commands in `gdb`:

```gdb
set pagination off
handle SIGSEGV stop print
continue
bt
info registers
x/16wx $esp
```

This is mainly for cleaner backtraces after the fault PC is already known from `winedbg`.