Accounting Integrity Tests — automated double-entry / report reconciliation guardrails

Date: 2026-06-26 | Status: in progress

Problem / motivation

An external accountant validates the software the way accountants validate any ledger: post a transaction (an invoice), then confirm that everything ties out — debits equal credits, the reports agree with each other and with the general ledger, the customer's balance reflects what they owe, inventory drops by what was sold, and cost of goods sold matches the cost of what left the shelf. This is slow, manual, and only happens at audit time.

This feature encodes that audit as an always-on automated test suite so that every change to invoicing, payments, credits, inventory, costing, or the reporting layer is checked against the fundamental accounting invariants before it ships. If a future change silently breaks the GL (e.g. drops a ledger leg, double-posts AR, or makes a report diverge from the ledger), these tests fail.

In scope

  • A reusable accounting test harness that drives real transactions through the production creation paths (so the SQLAlchemy event handlers fire and post to the ledger exactly as they do in prod) and reads back the real ledger and real report services.
  • Invariant assertions that hold regardless of other data in the shared test database (per-entry balance, trial-balance balance, control-account vs subledger reconciliation), plus delta assertions around a single transaction (before/after), following the established pattern in tests/graphql/reports/test_financial_kpi_service.py.
  • Coverage of the sales cycle the accountant exercises:
      • Sales invoice (inventory item, with tax) → AR / Revenue / Tax payable / COGS / Inventory postings.
      • Customer payment (Receipt) applied to an invoice → Cash / AR; invoice paid balance; client balance.
      • Sales credit note → AR reversal; invoice canceled balance; inventory return; COGS reversal; client balance.
      • Void of an invoice → ledger reversal nets to zero.
  • Cross-report reconciliation: income statement (reports module) vs P&L (ledger module); balance sheet balances (Assets = Liabilities + Equity); trial balance debits == credits (both modules agree).
  • Inventory & cost: on-hand quantity decreases by quantity sold; COGS posted == quantity × unit cost; inventory GL movement == COGS.
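
As a worked illustration of the single-invoice postings listed above, the sketch below computes the five legs for one line and checks that they balance. The function name, account labels, quantities, prices, and the 7% rate are all hypothetical values chosen for the example; none of them come from the codebase.

```python
from decimal import Decimal


def expected_invoice_legs(qty, unit_price, unit_cost, tax_rate):
    """Return {account: (debit, credit)} for one single-line sales invoice.

    Hypothetical helper: account labels are illustrative, not real GL codes.
    """
    subtotal = qty * unit_price
    tax = (subtotal * tax_rate).quantize(Decimal("0.01"))
    cogs = qty * unit_cost
    zero = Decimal("0")
    return {
        "accounts_receivable": (subtotal + tax, zero),  # customer owes the gross
        "revenue": (zero, subtotal),                    # net of tax
        "tax_payable": (zero, tax),
        "cogs": (cogs, zero),                           # cost of what left the shelf
        "inventory": (zero, cogs),                      # same amount leaves stock
    }


# Illustrative numbers only: 3 units sold at 20.00, costing 12.50 each, 7% tax.
legs = expected_invoice_legs(
    Decimal("3"), Decimal("20.00"), Decimal("12.50"), Decimal("0.07")
)
total_debits = sum(debit for debit, _ in legs.values())
total_credits = sum(credit for _, credit in legs.values())
assert total_debits == total_credits
```

The two debits (AR and COGS) match the three credits (revenue, tax payable, inventory), which is the per-entry balance invariant the suite asserts against the real ledger.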

Out of scope (and why)

  • Supplier credits — placeholder per accounting-placeholder-audit (no ledger posting); cannot assert correctness on a non-functional path.
  • Bank reconciliation settlement, account landing-page balance, ledger is_posted=False drafts — all placeholder per the audit; excluded.
  • Tax-rate → GL automation — never built (audit #1/#4); we assert tax posts to the resolved tax account, not the unused TaxRate.*_account_id links.
  • Supplier-invoice / AP postings are a stretch: included only if they drive cleanly through the same harness; the sales cycle is the priority.

What's being implemented

New test package tests/graphql/accounting_integrity/:

  • _accounting_harness.py — shared helpers:
      • signed_balance(account, debits, credits) — natural-sign GL balance per account type (asset/expense = debits − credits; liability/equity/revenue = credits − debits).
      • trial_balance(session) returns {account_id: (debits, credits)} over all posted lines, plus (total_debits, total_credits).
      • LedgerSnapshot — captures balances by account/type so a test can assert deltas immune to data other tests committed.
      • assert_all_entries_balanced(session) — every LedgerEntry has Σdebits == Σcredits.
      • builders that wrap the real services (InvoiceMutationService, receipt service, credit service) + build_item + a wired Client.
  • test_double_entry_invariants.py — every posted entry balances; trial balance nets to zero; balance sheet balances; reports/ledger modules agree.
  • test_invoice_accounting.py — one invoice: exact GL legs, AR delta == total, revenue delta == subtotal, tax delta == tax, COGS delta == qty×cost, inventory GL delta == −COGS, on-hand qty delta == −qty.
  • test_payment_accounting.py — receipt: cash delta == +paid, AR delta == −paid, invoice paid_amount/due, client balance.
  • test_credit_accounting.py — credit note: AR reversal, invoice canceled amount, inventory return, COGS reversal, client balance.
  • test_reconciliation.py — AR control account == Σ client balances == Σ invoice due; inventory GL == on-hand valuation; void nets to zero.
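
The helper semantics described above can be sketched over plain in-memory tuples. This is a simplified illustration, not the harness itself: here signed_balance takes an account type string rather than an account object, trial_balance takes a list of lines rather than a session, and unbalanced_entries is a hypothetical stand-in for the check behind assert_all_entries_balanced.

```python
from collections import defaultdict
from decimal import Decimal

DEBIT_NORMAL = {"asset", "expense"}  # every other type is credit-normal


def signed_balance(account_type, debits, credits):
    """Natural-sign balance: positive when the account sits on its normal side."""
    if account_type in DEBIT_NORMAL:
        return debits - credits
    return credits - debits


def trial_balance(lines):
    """lines: iterable of (entry_id, account_id, debit, credit) tuples."""
    per_account = defaultdict(lambda: [Decimal("0"), Decimal("0")])
    for _entry_id, account_id, debit, credit in lines:
        per_account[account_id][0] += debit
        per_account[account_id][1] += credit
    balances = {acc: (d, c) for acc, (d, c) in per_account.items()}
    total_debits = sum((d for d, _ in balances.values()), Decimal("0"))
    total_credits = sum((c for _, c in balances.values()), Decimal("0"))
    return balances, (total_debits, total_credits)


def unbalanced_entries(lines):
    """Return the ids of entries whose debits and credits do not match."""
    net = defaultdict(lambda: Decimal("0"))
    for entry_id, _account_id, debit, credit in lines:
        net[entry_id] += debit - credit
    return sorted(entry for entry, amount in net.items() if amount != 0)


D = Decimal
lines = [  # one invoice and one receipt, illustrative amounts
    ("inv-1", "ar", D("64.20"), D("0")),
    ("inv-1", "revenue", D("0"), D("60.00")),
    ("inv-1", "tax_payable", D("0"), D("4.20")),
    ("inv-1", "cogs", D("37.50"), D("0")),
    ("inv-1", "inventory", D("0"), D("37.50")),
    ("rcpt-1", "cash", D("64.20"), D("0")),
    ("rcpt-1", "ar", D("0"), D("64.20")),
]
balances, (total_debits, total_credits) = trial_balance(lines)
assert total_debits == total_credits
assert unbalanced_entries(lines) == []
```

Both checks are invariants: they hold no matter how many other balanced entries share the table, which is why they suit a shared test database.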

Why this layer

Ledger/inventory/balance posting is driven by SQLAlchemy event handlers (InvoiceEventHandler.post_insert etc.) dispatched from AbstractRepository.add_model after flush. Driving the real creation services (with invoice_generation_mode=PROFORM, which skips PAC/PDF/HTTP) exercises the entire posting chain end-to-end — the same code prod runs — and needs no mocking. The accounting period guard is permissive when no period is configured, so tests use today's date freely.
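
To make the reasoning concrete, here is a plain-Python stand-in for the dispatch described above. It does not use SQLAlchemy, and every class in it is hypothetical; it only illustrates why a row created through the repository path gets ledger postings while a row written around it does not.

```python
class Invoice:
    def __init__(self, total):
        self.total = total


class Ledger:
    def __init__(self):
        self.lines = []  # (account, debit, credit)


class InvoiceHandler:
    """Hypothetical stand-in for an insert handler that posts to the ledger."""

    def __init__(self, ledger):
        self.ledger = ledger

    def post_insert(self, invoice):
        self.ledger.lines.append(("ar", invoice.total, 0))
        self.ledger.lines.append(("revenue", 0, invoice.total))


class Repository:
    """Hypothetical stand-in for a repository that dispatches after persisting."""

    def __init__(self, handlers):
        self.rows = []
        self.handlers = handlers  # {model class: [handler, ...]}

    def add_model(self, model):
        self.rows.append(model)  # stands in for session.add + flush
        for handler in self.handlers.get(type(model), []):
            handler.post_insert(model)
        return model


ledger = Ledger()
repo = Repository({Invoice: [InvoiceHandler(ledger)]})

repo.rows.append(Invoice(50))   # bypasses add_model: nothing is posted
assert ledger.lines == []

repo.add_model(Invoice(100))    # the real path: handler fires, ledger is posted
assert ledger.lines == [("ar", 100, 0), ("revenue", 0, 100)]
```

A test that inserted fixture rows directly would resemble the first call and would never exercise the posting logic, which is the case for driving the creation services instead.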

Methodology deltas (intentional)

This adds no schema, GraphQL surface, RBAC path, or worker task — it is test-only. Therefore, by deliberate deviation from the new-resource checklist:

  • No alembic migration — no model/column changes.
  • No Path/Resource enum or backfill — no new resource.
  • No version bump / changelog — the GraphQL/OpenAPI schema is unchanged, so there is nothing for the frontend or integrations to coordinate on.

What still applies and is done: this plan/doc page, task lint + task typecheck-basedpy clean on the new files, the new tests green, and this page wired into mkdocs.yml.

Frontend contract

None — internal quality tooling only.

What shipped

New test package tests/graphql/accounting_integrity/ (14 tests, all green), driving real InvoiceMutationService / ReceiptService / CreditService postings and reading the real ledger and report services:

  • _accounting_harness.py: signed_balance, snapshot_ledger / type_delta, assert_all_entries_balanced, assert_trial_balance_balanced, on_hand_quantity, inventory_valuation, total_invoice_due, client_balance, entries_for_source, the report runners, and open_books / post_invoice / apply_receipt builders.
  • test_invoice_accounting.py — an invoice debits AR by the gross, credits revenue by the net, credits tax payable, debits COGS at cost, credits inventory at cost, and drops on-hand by the quantity sold; with and without tax.
  • test_payment_accounting.py — a customer receipt debits cash and credits AR by the amount received and lowers the invoice's outstanding balance; full payment settles the invoice.
  • test_credit_accounting.py — a sales credit note reverses one unit across all five legs, returns it to stock, and reduces the invoice's outstanding balance.
  • test_double_entry_invariants.py — every posted entry balances; the trial balance balances in both the reports module and the ledger module; the income statement and the (separately implemented) ledger P&L move identically on a sale.
  • test_reconciliation.py — the AR control account tracks the open-invoice subledger; a fresh client's balance and AR equal what it was invoiced; the inventory GL account moves with valuation as goods are sold; the balance sheet ties to the income statement via Assets = Liabilities + Equity + Net Income; reverting an invoice nets every effect back to zero.
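
The balance-sheet tie mentioned in the last item can be checked with a few lines of arithmetic. The sketch below is an illustration under assumed inputs: balance_sheet_ties is a hypothetical name, and the figures are the per-type natural-sign deltas that a single taxed inventory sale would produce (AR up 64.20, inventory down 37.50, tax payable up 4.20, revenue 60.00, COGS 37.50).

```python
from decimal import Decimal


def balance_sheet_ties(by_type):
    """Check Assets = Liabilities + Equity + Net Income.

    by_type maps an account type to its natural-sign balance (or delta).
    Net income is revenue minus expenses, since the period is not yet closed
    into equity.
    """
    net_income = by_type["revenue"] - by_type["expense"]
    return by_type["asset"] == by_type["liability"] + by_type["equity"] + net_income


D = Decimal
sale_delta = {
    "asset": D("64.20") - D("37.50"),  # AR increase less inventory decrease
    "liability": D("4.20"),            # tax payable
    "equity": D("0"),
    "revenue": D("60.00"),
    "expense": D("37.50"),             # COGS
}
assert balance_sheet_ties(sale_delta)
```

Including net income on the right-hand side matters because revenue and expense accounts carry the period's result until it is closed to equity; without it the identity would appear broken after any sale.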

Bug found and fixed

The suite immediately caught a real production bug: LedgerQueryService.trial_balance converted each account id with int(row.account_id) before calling session.get(Account, …), so the ledger-module trial balance crashed with asyncpg.DataError whenever any ledger data existed. The fix passes the UUID through unchanged. (The reports-module trial balance was unaffected.)
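
One reason a bug like this can hide is that Python's standard uuid.UUID supports integer conversion, so the int(...) call itself does not raise. The standalone snippet below shows that behaviour with an arbitrary example UUID; it assumes the id behaves like a standard-library UUID and does not reproduce the database error.

```python
import uuid

# Arbitrary example value, not a real account id.
account_id = uuid.UUID("12345678-1234-5678-1234-567812345678")

# int() succeeds silently and yields the UUID's 128-bit integer value.
as_int = int(account_id)
assert isinstance(as_int, int)
assert as_int == account_id.int
assert as_int.bit_length() > 64  # wider than a 64-bit integer column

# The fix described above: hand the identifier through with its type intact.
lookup_key = account_id
assert isinstance(lookup_key, uuid.UUID)
```

Because the conversion succeeds in Python, the failure only surfaces when the value reaches the database driver, which is consistent with the bug appearing only once ledger rows existed.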

Design notes for the next engineer

  • Tests drive the real creation paths with invoice_generation_mode=PROFORM (and CreditGenerationMode.PROFORM), which skips PAC/PDF/HTTP; the SQLAlchemy event handlers still post the ledger, move inventory, and update balances.
  • The shared session-scoped test DB accumulates data, so assertions are invariants or before/after deltas, never absolute totals.
  • The resolver resets context.session to None on exit — read the session off the resolved service (service.session), not off mock_tenant_context.
  • A customer payment is a Receipt (cash ⇄ AR); the Payment domain is supplier-side (AP). Internal invoices are reversed with delete_invoice; void_invoice is DGI-only (needs a PAC document_id).
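
The before/after delta pattern from the second note can be sketched as follows. The class borrows the LedgerSnapshot name from the plan, but its body is an assumption written for illustration: it stores net debit balances (debits minus credits) from an in-memory list instead of querying a session.

```python
from decimal import Decimal


class LedgerSnapshot:
    """Illustrative snapshot of net debit balances per account."""

    def __init__(self, lines):
        self.balances = {}
        for account_id, debit, credit in lines:
            current = self.balances.get(account_id, Decimal("0"))
            self.balances[account_id] = current + debit - credit

    def delta(self, before, account_id):
        """Change in an account's net debit balance since `before`."""
        zero = Decimal("0")
        return self.balances.get(account_id, zero) - before.balances.get(account_id, zero)


D = Decimal
# Rows left behind by other tests in the shared database (arbitrary amounts).
ledger = [("ar", D("500.00"), D("0")), ("cash", D("120.00"), D("0"))]

before = LedgerSnapshot(ledger)
# The transaction under test: a receipt of 64.20 (debit cash, credit AR).
ledger += [("cash", D("64.20"), D("0")), ("ar", D("0"), D("64.20"))]
after = LedgerSnapshot(ledger)

assert after.delta(before, "cash") == D("64.20")
assert after.delta(before, "ar") == D("-64.20")
```

The pre-existing 500.00 and 120.00 never appear in the assertions, so the test passes regardless of what earlier tests committed. An absolute check such as "cash equals 64.20" would fail here.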

Future additions

  • Supplier / AP cycle — supplier-invoice (AP) and supplier-payment postings, once worthwhile; sales-side was prioritised. Supplier credits stay out until that flow is no longer a placeholder (see the placeholder audit).
  • Multi-line / discounted / retention invoices — current scenarios are single-line, zero-discount; add discount and ITBMS-withholding (retention) scenarios to exercise the split-AR + withholding-asset leg.
  • FIFO/LIFO costing — scenarios pin CostingMethod.AVERAGE; add layered-cost COGS once those paths need the same guardrail.
  • Opening balances — once OpeningBalanceImportService posts cleanly, assert inventory GL == valuation in absolute terms (today only the per-sale delta ties out, since test stock is seeded without an opening journal).
  • Denormalized ClientBalance drift — receipts/credits do not update ClientBalance, so it diverges from true AR after payments; the suite only asserts it right after an invoice. Worth reconciling in the product, then tightening the test.