Classify engine type parser families

Jan Petykiewicz 2026-04-21 23:05:10 -07:00
commit f3c3eb7262
4 changed files with 149 additions and 17 deletions

@@ -12,7 +12,8 @@ This file is the short active queue for the current runtime and reverse-engineer
- The active static parser head is now the `engine_types` semantics frontier.
The repo now has structural inspectors for `.car`, `.lco`, `.cgo`, and `.cct`, but the binary side is still only partially semantic: the checked 1.05 corpus grounds `.car` fixed strings at `0x0c / 0x48 / 0x84` plus a second fixed stem slot at `0xa2` and a side-view resource name at `0xc0`, while `.lco` carries a stable primary stem at `0x04` and only conditional companion/body slots at `0x0c` and `0x12` when the leading stem slot is padded.
The next honest static work is to keep promoting those fixed lanes into stable parser fields and decide how far `.cgo` and the remaining `EngineTypes` sidecars can be grounded without overclaiming semantics.
The checked 1.05 corpus now also splits `.car` auxiliary stems into `126` direct matches, `14` role-neutral roots, and only `5` truly distinct cases, while `.cgo` collapses into five stable scalar ladders instead of arbitrary floats.
The next honest static work is to keep promoting those fixed lanes into stable parser fields, explain the five remaining distinct auxiliary-stem cases, and decide how far the `.cgo` ladders and guarded `.lco` companion lanes can be grounded without overclaiming semantics.
Preserved checked parser detail now lives in [EngineTypes parser semantics](rehost-queue/engine-types-parser-semantics-2026-04-21.md).
Preserved checked format inventory detail now lives in [RT3 format inventory](rehost-queue/format-inventory-2026-04-21.md).

@@ -13,6 +13,16 @@ first `.car` / `.lco` / `.cgo` / `.cct` inspector pass landed.
- `0xc0`: side-view resource name such as `CarSideView_1.imb`
- The checked 1.05 corpus (`145` `.car` files) carries all five of those `.car` slots on every
file inspected so far.
- The checked 1.05 corpus now also grounds the `0xa2` relation split:
- `126` files: `auxiliary_stem == internal_stem`
- `14` files: `auxiliary_stem` equals `internal_stem` with its trailing role suffix (`L` / `T`) removed
- `5` files: truly distinct auxiliary stems
- Those five distinct auxiliary-stem cases are narrow and specific:
- `ClassA1T -> ClassA1L`
- `CramptonT -> CramptonL`
- `WhaleT -> WhaleL`
- `classqjl -> qjclassl`
- `classqjt -> qjclasst`
- `.lco` carries one always-present primary stem at `0x04`.
- `.lco` only carries meaningful secondary slots when that leading stem slot is padded:
- `0x0c`: conditional companion stem such as `VL80T` or `Zephyr`
@@ -22,6 +32,13 @@ first `.car` / `.lco` / `.cgo` / `.cct` inspector pass landed.
fixed fields unless the earlier slot is actually zero-padded.
- `.cgo` looks structurally narrow right now: the checked 1.05 corpus has `37` files, all exactly
`25` bytes long, each carrying one leading scalar lane plus an inline content stem at `0x04`.
- The `.cgo` leading scalar is no longer just a loose raw count. The checked 1.05 corpus now
collapses into five stable ladders:
- `10 -> 20 -> 40 -> 80` across `6` freight-car families
- `20 -> 40 -> 80` for `Tanker`
- `55 -> 85` for `Auto_Carrier`
- `6 -> 13 -> 27 -> 53` for `Passenger`
- `7 -> 13 -> 27 -> 53` for `Mail`
- `.cct` remains the least ambiguous sidecar: current shipped files still look like narrow one-row
text metadata.
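
The fixed `.cgo` layout noted above (exactly `25` bytes, one leading scalar lane, inline content stem at `0x04`) can be sketched as a small reader. This is only an illustration under stated assumptions: the scalar is taken to be 4 bytes little-endian, the stem is taken to be NUL-terminated ASCII, and the names `CgoRecord` and `parse_cgo` are hypothetical rather than the repo's inspector API. Both the `u32` and `f32` views are kept because the notes do not settle the scalar's type.

```python
import struct
from dataclasses import dataclass

CGO_LENGTH = 25      # every checked 1.05 .cgo file is exactly 25 bytes
STEM_OFFSET = 0x04   # inline content stem starts here


@dataclass(frozen=True)
class CgoRecord:
    scalar_u32: int    # leading lane viewed as an unsigned integer (assumed little-endian)
    scalar_f32: float  # same four bytes viewed as a float (assumed little-endian)
    content_stem: str  # stem text up to the first NUL (assumed ASCII)


def parse_cgo(data: bytes) -> CgoRecord:
    """Read the leading scalar lane and content stem from one .cgo blob."""
    if len(data) != CGO_LENGTH:
        raise ValueError(f"expected {CGO_LENGTH} bytes, got {len(data)}")
    (scalar_u32,) = struct.unpack_from("<I", data, 0)
    (scalar_f32,) = struct.unpack_from("<f", data, 0)
    raw_stem = data[STEM_OFFSET:].split(b"\x00", 1)[0]
    return CgoRecord(scalar_u32, scalar_f32, raw_stem.decode("ascii", errors="replace"))
```

Keeping both raw views avoids committing to a typed meaning for the lane before more binary/code correlation exists.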
@@ -33,6 +50,7 @@ first `.car` / `.lco` / `.cgo` / `.cct` inspector pass landed.
- internal stem
- auxiliary stem slot
- side-view resource name
- auxiliary-stem relation counts across the shipped corpus
- `.lco`
- full internal stem
- conditional companion stem slot
@@ -41,14 +59,15 @@ first `.car` / `.lco` / `.cgo` / `.cct` inspector pass landed.
- `.cgo`
- leading scalar lane
- content stem
- scalar ladder counts by shared cargo-car family
- `.cct`
- tokenized identifier/value row
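
The `.car` auxiliary-stem relation counts listed above could be produced by a classifier along these lines. This is a hedged sketch, not the repo's implementation: the function names and relation labels are hypothetical, the role suffix is assumed to be a single trailing `L` or `T`, and the suffix check is assumed to be case-insensitive because the corpus contains lowercase stems such as `classqjl`.

```python
from collections import Counter

ROLE_SUFFIXES = ("L", "T")  # trailing role suffixes named in the notes


def classify_auxiliary(internal_stem: str, auxiliary_stem: str) -> str:
    """Label how the 0xa2 auxiliary stem relates to the internal stem."""
    if auxiliary_stem == internal_stem:
        return "direct"
    # Assumption: a role-neutral root is the internal stem minus one trailing
    # role letter, compared case-insensitively on that letter only.
    if internal_stem[-1:].upper() in ROLE_SUFFIXES and auxiliary_stem == internal_stem[:-1]:
        return "role_neutral_root"
    return "distinct"


def relation_counts(pairs):
    """Count relation labels over (internal_stem, auxiliary_stem) pairs."""
    return Counter(classify_auxiliary(internal, aux) for internal, aux in pairs)
```

Run over the checked 1.05 corpus, a classifier of this shape would be expected to reproduce the `126` / `14` / `5` split; any other total would mean one of the assumptions above does not hold.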
## Remaining Static Questions
- `.car`
- what the `0xa2` auxiliary stem really represents across locomotive, tender, and freight-car
families: alias root, image key, or alternate content stem
- what the `0xa2` auxiliary stem really represents in the five remaining distinct cases:
alternate content root, paired tender/loco image root, or a narrower foreign-display alias
- whether the trailing side-view resource can be tied cleanly to `.imb` metadata without
inventing frontend semantics
- `.lco`
@@ -57,8 +76,8 @@ first `.car` / `.lco` / `.cgo` / `.cct` inspector pass landed.
- how much of the early numeric lane block can be promoted from raw `u32/f32` views into stable
typed semantics without dynamic evidence
- `.cgo`
- whether the leading scalar is enough to justify a named typed field, or whether it should stay
a conservative raw scalar until more binary/code correlation exists
- whether the leading scalar ladders are enough to justify a named typed field, or whether they
should stay conservative report-only ladders until more binary/code correlation exists
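
One way to keep the `.cgo` ladders report-only, as the last question above contemplates, is to group raw scalars by family without attaching a named meaning to them. The sketch below assumes the caller already supplies a family label for each file, since how a family is derived from the content stem is not stated here; `ladders_by_family` and `ladder_shape_counts` are hypothetical names.

```python
from collections import Counter, defaultdict


def ladders_by_family(observations):
    """Group (family, scalar) observations into one sorted ladder per family."""
    grouped = defaultdict(set)
    for family, scalar in observations:
        grouped[family].add(scalar)
    return {family: sorted(values) for family, values in grouped.items()}


def ladder_shape_counts(ladders):
    """Count how many families share each distinct ladder."""
    return Counter(tuple(values) for values in ladders.values())
```

For example, feeding in the `Tanker` scalars `40`, `20`, `80` in any order yields the ladder `[20, 40, 80]`, and families with identical ladders collapse into a single counted shape.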
## Next Static Parser Work