Six layers of silent failure behind one smoke test

2026-04-12 Sunday

Severity: high. The French ledger’s OFX import, tax-table seeding, and T2125 individual chart all failed silently, with no error surfaced to the user.

Summary

A single smoke test run for bilingual onboarding failed for the French ledger path. What appeared to be one bug turned into six distinct issues, each hiding the next. Every failure was silent — the worker returned 200, the UI showed success, and nothing in the logs pointed to a problem. The user would have seen empty tax tables and an empty transactions page with no error message.

The investigation required UAT-level testing to surface failures that should have been caught by unit tests or CI. This is a record of what broke, why it was invisible, and what was built to prevent the same class of failures from recurring.

Timeline

Event	Finding
FR smoke test fails	OFX import returns 200, UI shows “X transactions imported”, transactions page is empty
Add a `gnucash_unchanged` guard	The container was returning the original file unchanged — import silently produced no changes
Investigate why	The import batch contains English account paths (`Assets:Current assets:...`) in a FR ledger; those accounts don’t exist in the FR chart
Fix offset account paths	A constant hardcoded the English GIFI 1002 name instead of the IGRF French name
FR chart regenerated	The pre-built template was built from an old generator that emitted English fallback names — every FR ledger had English accounts
Tax tables page checked	500 error: account `Passif:...:Impôts à payer:TPS perçue` not found
Fix tax-table paths	The FR chart uses GIFI 2680’s official name `Taxes et impôts à payer`; the defaults file used `Impôts à payer` (wrong)
Check T2125 individual chart	French names in the source CSV were AI-translated, not from the official CRA T4002 guide
Fix T4002 FR names	The CSV builder was rewritten to fetch official names from CRA HTML; the individual FR chart was regenerated
Root cause of the tooling gap	No script to rebuild the chart templates from source; no tests to catch staleness or path mismatches
Build a rebuild script + a static path test	Systematic rebuild + two-layer static validation

What was broken (six issues)

1. The pre-built FR template was stale (silent import failure)

The pre-built GnuCash template for FR corporation ledgers was generated from an old version of the chart generator that fell back to English account names when French translations were missing. Every FR ledger created since bilingual onboarding shipped contained English accounts.

When OFX import sent a batch with correct French paths (e.g., Actif:Actif à court terme:...), the container couldn’t find those accounts, returned the original file unchanged, and the worker logged it as success (gnucash_written: txCount).

Fix: regenerate all four templates via a single rebuild script.

2. The asset-parent constant used the English GIFI 1002 name

The OFX parser defined:

// Wrong — English name
export const ASSET_PARENT_FR =
  'Actif:Actif à court terme:Deposits in Canadian banks and institutions – Canadian currency';

The official IGRF French name for GIFI 1002 is Dépôts dans des banques et des institutions canadiennes – monnaie canadienne. The original comment on the constant said “kept in English”; that was incorrect, because the FR chart has a French name for this account. The mismatch meant the bank-account parent path in FR OFX imports pointed at an account that did not exist.

Fix: correct the constant to the official IGRF name.

3. The FR chart used `Impôts à payer` instead of GIFI 2680’s official name

The FR chart generator used Impôts à payer as the parent for all tax sub-accounts (TPS, TVH, TVQ, TVP). The official GIFI 2680 name in the IGRF is Taxes et impôts à payer. The defaults file referenced the wrong parent in every account: line, so the tax-table seed call failed with “account not found” for every FR ledger — a 500 silently logged as container_error.

Fix: replace Impôts à payer with Taxes et impôts à payer throughout the FR chart and defaults file.

4. The T2125 individual chart’s French names were AI-translated

The French expense-line labels for the T2125 individual chart were generated by AI translation, not sourced from the official CRA T4002 guide. Several were materially wrong:

Line	AI translation (wrong)	Official CRA T4002 FR
8860	`Honoraires professionnels`	`Honoraires professionnels (y compris les frais comptables et juridiques)`
9060	`Salaires et avantages sociaux`	`Salaires, traitements et avantages (y compris les cotisations de l'employeur)`
9281	`Frais de véhicule à moteur`	`Dépenses relatives aux véhicules à moteur (sans la DPA)`

Fix: rewrite the CSV builder to fetch the T4002 FR HTML from CRA and extract official line labels directly.

5. No systematic way to rebuild the chart templates

The four base64 template files were maintained by manually running Docker commands and copy-pasting base64 output. There was no script, no documented procedure beyond a comment in each file, and no test to detect when a template was out of sync with its source chart. This is the root cause that made issues 1–4 possible: when a chart generator was fixed or a name corrected, there was no automated step to propagate the change to the template.

Fix: a rebuild-coa-templates.sh script. One command rebuilds any or all templates from the source .txt files via the running container.

6. No tests validating hardcoded account paths against chart sources

The OFX parser contained string literals like ASSET_PARENT_FR; the defaults file contained account: paths. None were validated against the chart .txt files. A typo, a rename, or a generator fix would silently break runtime behaviour with no CI signal.

Fix: a static path test with two layers:

Layer 1: every hardcoded constant and every tax-table account: path must exist as an open line in the corresponding chart .txt file. Catches name mismatches immediately.
Layer 2: each template is decoded (base64 → gunzip → XML) and its account-name set is compared against the chart source. Catches stale templates in CI without Docker.
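
Layer 1 can be sketched roughly as follows. This is a minimal illustration, not the project's actual test: the function names are hypothetical, and it assumes each chart .txt line has the shape `open <full account path>`, which may not match the real file format. Names are normalized to NFC so that accented characters compare equal regardless of how they were typed.

```typescript
// ASSUMPTION: each account in the chart file is declared on its own line as
// `open <full account path>`. Adjust the pattern to the real chart format.
function parseChartAccounts(chartText: string): Set<string> {
  const accounts = new Set<string>();
  for (const rawLine of chartText.split(/\r?\n/)) {
    const match = /^open\s+(.+)$/.exec(rawLine.trim());
    const path = match?.[1];
    if (path !== undefined) {
      // NFC normalization keeps "é" comparable whether composed or decomposed.
      accounts.add(path.normalize('NFC'));
    }
  }
  return accounts;
}

// Returns every hardcoded path that does not appear in the chart source.
// An empty result means all constants and defaults agree with the chart.
function findMissingPaths(chartText: string, hardcodedPaths: string[]): string[] {
  const accounts = parseChartAccounts(chartText);
  return hardcodedPaths.filter((path) => !accounts.has(path.normalize('NFC')));
}
```

A test would read the chart file, pass in constants such as ASSET_PARENT_FR together with every account: path from the defaults file, and fail if the returned list is non-empty. Returning the missing paths rather than a boolean makes the failure message say exactly which string is wrong.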

Why everything was silent

The container returns rc=0 on some failures. The import command exits 0 even when individual transactions can’t be applied because their accounts don’t exist. The only signal is that the returned blob is identical to the input. The worker had no “unchanged” check — it accepted the unchanged file, logged a write count (a misnomer — it was the parsed count, never verified against actual writes), and returned 200.
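
A guard of the `gnucash_unchanged` kind could look something like the sketch below. The type and function names are hypothetical; only the idea (a non-empty batch that leaves the file byte-identical is a failure signal) comes from the incident.

```typescript
// Hypothetical outcome type: 'gnucash_unchanged' marks a suspicious no-op.
type ImportOutcome =
  | { status: 'ok'; parsedCount: number }
  | { status: 'gnucash_unchanged'; parsedCount: number };

// Byte-for-byte comparison of two blobs.
function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Classify a container result. If transactions were parsed but the returned
// file is identical to the input, nothing was applied, whatever rc says.
function classifyImport(
  input: Uint8Array,
  output: Uint8Array,
  parsedCount: number,
): ImportOutcome {
  if (parsedCount > 0 && bytesEqual(input, output)) {
    return { status: 'gnucash_unchanged', parsedCount };
  }
  return { status: 'ok', parsedCount };
}
```

The worker can then map `gnucash_unchanged` to an error response instead of a 200. For large files, comparing a hash of each blob would serve the same purpose.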

INSERT OR IGNORE never throws. The Durable Object’s record method used INSERT OR IGNORE. If a constraint was violated, the row was silently skipped. The method returned void — no way to know whether 0 or N rows were actually written. The logged count was the parsed count, not the DO write count.
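
One way to make skipped rows visible is to have the record method return counts instead of void. The sketch below uses a deliberately small, hypothetical `SqlLike` interface rather than the real Durable Object storage API, and the table and column names are invented for illustration. It also assumes `rowsWritten` is 1 for an inserted row and 0 for an ignored one; a real SQL backend may count index writes too, so that would need checking before relying on exact numbers.

```typescript
// ASSUMPTION: a minimal stand-in for a SQL handle, not the project's real API.
interface SqlLike {
  exec(query: string, ...bindings: unknown[]): { rowsWritten: number };
}

// Hypothetical row shape for a parsed OFX transaction.
interface TxRow {
  id: string;
  amountCents: number;
}

interface RecordResult {
  attempted: number; // rows the parser handed over
  written: number;   // rows the database reports as written
  skipped: number;   // rows dropped by INSERT OR IGNORE
}

// Insert each row and tally what was actually written, so the caller can log
// the write count separately from the parsed count.
function recordTransactions(sql: SqlLike, rows: TxRow[]): RecordResult {
  let written = 0;
  for (const row of rows) {
    const cursor = sql.exec(
      'INSERT OR IGNORE INTO transactions (id, amount_cents) VALUES (?, ?)',
      row.id,
      row.amountCents,
    );
    written += cursor.rowsWritten;
  }
  return { attempted: rows.length, written, skipped: rows.length - written };
}
```

With a result like this, the worker can log `written` and `attempted` as separate fields and treat `written === 0` on a non-empty batch as worth a warning.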

The tax-table seed failure is non-fatal by design. A failed seed should not block onboarding, so its error is logged but not surfaced to the UI. A user completing onboarding would see success; the tax-tables page would simply be empty the first time they visited, with no explanation.

Chart generators had no test coverage. The FR generator was never validated to produce the correct account names for GIFI accounts where the official IGRF name differs from the pattern the script derives. No test compared generator output against the official IGRF source.

What I learned

1. A smoke test that exercises a real user path surfaces what unit tests cannot

Unit tests mock the container, mock the DO, mock the chart. They validate logic but not the integration of all layers together. The smoke test drove the full stack — onboard FR ledger → import OFX → check transactions → check tax tables — and failed on every step. None of those failures had a unit test that would have caught them.

Rule: any feature that spans the worker + container + chart files needs a smoke test that exercises the full path. Unit tests are necessary but not sufficient.

2. “Success” is not trustworthy without verification

Three separate layers returned success while silently doing nothing: the container (rc=0, unchanged blob); INSERT OR IGNORE (rows silently dropped); the tax seed (non-fatal failure swallowed). Silent success is indistinguishable from real success until UAT.

Rule: after any write operation, verify the write happened. rowsWritten === 0 after an INSERT is always worth logging. A returned blob identical to the input blob is always worth checking.

3. Hardcoded strings that reference external data need cross-validation tests

Every hardcoded account path is a claim about the shape of the chart. Claims break silently if the chart changes or was never correct to begin with. Cross-validating them against the chart source is cheap (a file read + a Set lookup) and catches an entire class of bugs otherwise only visible at runtime in production.

Rule: if a string constant references an account, a table, a code, or any other externally-defined identifier, there must be a test that verifies the identifier exists in the source of truth.

4. Generated artifacts need a deterministic, scripted rebuild path

The templates were effectively orphaned from their sources. The comment “regenerate when the chart changes” is documentation that gets ignored under time pressure. A script that runs in CI or as a pre-commit check closes the gap between “source was fixed” and “artifact reflects the fix”.

Rule: any generated file (base64 templates, CSVs, compiled outputs) needs an authoritative rebuild script and a test that detects staleness. Documentation alone is not a process.
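
The comparison step of such a staleness test might look like the sketch below. It starts from already-decoded template XML (the base64 and gunzip steps are left out) and relies on GnuCash XML storing each account name in an `<act:name>` element. The function names are hypothetical, and it assumes the chart side has been reduced to the same kind of names (for example the last segment of each path), which is my simplification rather than a description of the real test.

```typescript
// Decode the five predefined XML entities; &amp; goes last so that
// "&amp;lt;" becomes "&lt;" rather than "<".
function decodeXmlEntities(text: string): string {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&');
}

// Collect every account name found in <act:name> elements of the decoded XML.
function extractAccountNames(xml: string): Set<string> {
  const names = new Set<string>();
  const pattern = /<act:name>([^<]*)<\/act:name>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    const raw = match[1];
    if (raw !== undefined) names.add(decodeXmlEntities(raw).normalize('NFC'));
  }
  return names;
}

interface StalenessReport {
  missingFromTemplate: string[]; // in the chart source, absent from the template
  extraInTemplate: string[];     // in the template, absent from the chart source
}

// Two-way set difference between template names and chart names.
function compareTemplateToChart(
  templateNames: Set<string>,
  chartNames: Set<string>,
): StalenessReport {
  const missingFromTemplate: string[] = [];
  const extraInTemplate: string[] = [];
  chartNames.forEach((name) => {
    if (!templateNames.has(name)) missingFromTemplate.push(name);
  });
  templateNames.forEach((name) => {
    if (!chartNames.has(name)) extraInTemplate.push(name);
  });
  return {
    missingFromTemplate: missingFromTemplate.sort(),
    extraInTemplate: extraInTemplate.sort(),
  };
}
```

A template is considered fresh when both lists are empty. A real GnuCash file also contains a root account entry, which this sketch would report as extra unless it is filtered out first.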

5. Official government sources must be used directly, not translated by AI

The T4002 French guide is publicly available and machine-readable. Using AI translation instead of fetching the source introduced inaccuracies that, if shipped, would have shown incorrect French labels on tax line items — a compliance issue for Quebec users.

Rule: for any content derived from official regulatory publications (here, CRA’s T4002, IGRF, GIFI), fetch the source document and extract the content programmatically. Do not rely on AI translation for regulatory labels.