Forms Registry — Process Overview

Why this exists

Forms are documentary truth.

A funeral home's forms are decades-old artifacts of actual practice. Every field exists because someone — a regulator, an insurer, an industry custom — decided it was worth capturing. That makes the forms a far better source of truth than our hypotheses about what funeral software should do.

Cataloguing Kearney's ~150 forms is the cheapest, fastest way to validate three things at once.

01

Module catalog

Our workflow modules are hypotheses. The forms tell us which ones are real, which are wrong, and which we missed entirely.

02

Design work

Matt cannot design intake screens without knowing what data is captured at each step. The forms answer that exactly.

03

Schema decisions

What's universal versus tenant-specific can only be drawn from real data. The forms make the boundary visible.

The shape of the work

Two workflows, two owners.

Per-form work happens continuously, form by form. Cross-form analysis happens at checkpoints. The two are coupled but not gated: ingestion can continue while an analysis runs on the accumulated batch.

Owner — Zareef

Per-Form Workflow

Continuous · one form at a time

For each form: ingest, map to the database schema, map to the workflow modules, with a human review checkpoint after each AI step. Result: an approved form file ready for analysis.

Owner — Philip

Cross-Form Analysis

At checkpoints · every ~20–30 forms

Synthesises per-form work into architectural decisions: catalog refinements, candidate new modules, schema additions, and the workflow composition for each case flow.

The Per-Form Pipeline

One form, three AI prompts, three human checkpoints.

Each form passes through six steps, alternating AI work with human review. AI does the cataloguing, schema-matching, and module-matching; humans confirm, correct, and resolve any ambiguity before the next step runs.

Alternating AI and Human Steps · Each Human Review Gates the Next AI Step

1

AI · Ingest

Catalogue the form

AI reads the PDF and produces a structured form file: every field with its label, data type, whether it's required, whether it's PHI, whether it's handwritten, and proposed relationships to other forms already in the registry.

What AI sees: the form's title, fields, signature blocks, layout. What it produces: a complete catalogue entry plus a list of "Open Questions" for anything it cannot determine from the form alone.
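
As a rough illustration of what a catalogue entry might hold, the sketch below models the form file as two small Python dataclasses. The attribute names follow the properties listed above; the class names, the data_type values, and the example form are assumptions made for illustration, not the registry's actual file format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FormField:
    label: str
    data_type: str      # e.g. "text", "date", "signature" (illustrative values)
    required: bool
    is_phi: bool
    handwritten: bool


@dataclass
class FormFile:
    title: str
    fields: List[FormField] = field(default_factory=list)
    related_forms: List[str] = field(default_factory=list)   # proposed links to forms already in the registry
    open_questions: List[str] = field(default_factory=list)  # what could not be determined from the form alone

    def phi_labels(self) -> List[str]:
        """Labels of the fields flagged as PHI, in form order."""
        return [f.label for f in self.fields if f.is_phi]


# Hypothetical example: the title, fields, and question are invented.
example = FormFile(
    title="Example Authorization Form",
    fields=[
        FormField("Decedent name", "text", required=True, is_phi=True, handwritten=False),
        FormField("Date of death", "date", required=True, is_phi=True, handwritten=True),
        FormField("Office notes", "text", required=False, is_phi=False, handwritten=True),
    ],
    open_questions=["Is 'Office notes' ever filled in by the family?"],
)
```

Keeping Open Questions on the entry itself means the human review in step 2 can work from a single file.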

2

Human · Review

Resolve open questions

Zareef opens the form file and verifies the catalogue against the actual PDF. Open Questions get answered (or batched for a Kearney follow-up). Anything obviously wrong gets corrected in place.

What humans do that AI can't: distinguish "the form says X" from "the form is used for X." Forms are documentary, but their use is operational — only the funeral home staff know that part.

3

AI · Schema map

Map each field to the database

AI takes every field and classifies its match against the existing database schema: exact match, semantic match with a different shape, partial match (lossy), or no match. For unmatched fields, it proposes a destination — a new column, a new table, the JSONB metadata column, or the workflow envelope.

Confidence scoring is mandatory. AI must mark every match high, medium, or low — and explain low confidence in one sentence. Fake confidence is worse than honest uncertainty.
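
One way to make these rules checkable is to record each match as a small structure and validate it before human review. This is a minimal sketch: the four match classes, four destinations, and three confidence levels come from the text above, while the class names, the problems() helper, and the example column name are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class MatchKind(Enum):
    EXACT = "exact"
    SEMANTIC = "semantic, different shape"
    PARTIAL = "partial (lossy)"
    NONE = "no match"


class Destination(Enum):
    NEW_COLUMN = "new column"
    NEW_TABLE = "new table"
    JSONB_METADATA = "JSONB metadata column"
    WORKFLOW_ENVELOPE = "workflow envelope"


class Confidence(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class SchemaMatch:
    field_label: str
    kind: MatchKind
    confidence: Confidence
    target: Optional[str] = None               # existing column, when one matches
    destination: Optional[Destination] = None  # proposal for an unmatched field
    explanation: str = ""                      # required when confidence is low

    def problems(self) -> List[str]:
        """Return rule violations; an empty list means the record is well-formed."""
        found = []
        if self.confidence is Confidence.LOW and not self.explanation.strip():
            found.append("low confidence needs a one-sentence explanation")
        if self.kind is MatchKind.NONE and self.destination is None:
            found.append("unmatched field needs a proposed destination")
        return found


# Hypothetical records; "cases.date_of_death" is an invented column name.
ok = SchemaMatch("Date of death", MatchKind.EXACT, Confidence.HIGH, target="cases.date_of_death")
bad = SchemaMatch("Office notes", MatchKind.NONE, Confidence.LOW)
```

Here ok.problems() is empty, while bad.problems() reports both the missing explanation and the missing destination.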

4

Human · Review

Confirm matches, resolve drift

Zareef confirms exact matches, ratifies proposed destinations for new fields, and resolves any "drift flags" — cases where the same field on a different form was mapped differently. Drift cannot be silently overwritten; it must be reconciled explicitly.

The most important call here: universal versus tenant-specific. Universal data goes in a real column; tenant-specific data goes in the flexible JSONB metadata column. Getting it wrong fragments the data later.
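
The drift flags in this step could be produced by a check along the following lines. It is a sketch under one assumption: that fields on different forms can be identified by a shared, normalised field key. The function name, the tuple layout, and the example destinations are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_drift(mappings: List[Tuple[str, str, str]]) -> Dict[str, Dict[str, List[str]]]:
    """Take (form_id, field_key, destination) rows and return the fields
    that were mapped to more than one destination, with the forms behind each."""
    by_field: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for form_id, field_key, destination in mappings:
        by_field[field_key][destination].append(form_id)
    return {
        key: dict(destinations)
        for key, destinations in by_field.items()
        if len(destinations) > 1  # one destination everywhere means no drift
    }


# Hypothetical rows; form ids and destinations are invented.
rows = [
    ("form-a", "informant_phone", "contacts.phone"),
    ("form-b", "informant_phone", "metadata.informant_phone"),
    ("form-a", "date_of_death", "cases.date_of_death"),
    ("form-c", "date_of_death", "cases.date_of_death"),
]
drift = find_drift(rows)
```

The check only reports conflicts; choosing which mapping wins remains a human decision.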

5

AI · Module map

Map each field to a workflow module

AI takes every field and classifies its producing source using a nine-value taxonomy. The values distinguish cases such as: a module writes the field, it is captured via UI, it is pre-populated from earlier case data, it is derived, or, explicitly, no producer can be identified (an orphan).

Orphans are the most valuable output. A field appearing on many forms with no producing module is a signal that a module is missing from our catalog. Clusters of orphans become candidate new modules.
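
A simple way to surface the strongest orphan signals is to count how many distinct forms each orphan field appears on. The sketch below assumes fields share a normalised key across forms and that the orphan value is labelled "orphan"; the function name, the threshold parameter, and the example data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

ORPHAN = "orphan"  # assumed label for the "no producer identified" taxonomy value


def orphan_candidates(
    assignments: List[Tuple[str, str, str]], min_forms: int = 2
) -> List[Tuple[str, int]]:
    """Take (form_id, field_key, producing_source) rows and rank orphan
    fields by the number of distinct forms they appear on."""
    forms_by_field: Dict[str, Set[str]] = defaultdict(set)
    for form_id, field_key, source in assignments:
        if source == ORPHAN:
            forms_by_field[field_key].add(form_id)
    ranked = [
        (key, len(forms))
        for key, forms in forms_by_field.items()
        if len(forms) >= min_forms
    ]
    # Most widespread first; ties broken alphabetically for stable output.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Hypothetical assignments; field keys and module names are invented.
assignments = [
    ("f1", "clergy_name", ORPHAN),
    ("f2", "clergy_name", ORPHAN),
    ("f3", "clergy_name", ORPHAN),
    ("f1", "casket_model", ORPHAN),
    ("f2", "decedent_name", "module:intake"),
]
```

With the default threshold only clergy_name is reported, because it is an orphan on three forms while casket_model appears on one.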

6

Human · Review

Triage orphans, flag catalog gaps

Zareef reviews producing-source assignments, decides whether each orphan is a genuine orphan or a missed mapping, and surfaces catalog refinement candidates: cases where a workflow module's declared inputs and outputs appear incomplete.

This is where the catalog grows. Confirmed gaps become candidate ADRs (architecture decision records). The catalog doesn't refine itself; it is refined through accumulated evidence.

✓

Form approved

Ready for cross-form analysis

The form file is now a structured, schema-mapped, module-mapped, human-reviewed artifact. It feeds the cross-form analysis cadence, but it is never frozen. Every prompt is idempotent: re-running any of them on the form is safe as the schema, catalog, or sibling-form mappings evolve.
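
The six-step pipeline can be pictured as a small state machine in which an AI step always hands off to its review, and a review releases the next AI step only once it is approved. This is an illustrative sketch; the step names and the advance() function are assumptions, not part of any existing tooling.

```python
from enum import Enum
from typing import Optional


class Step(Enum):
    AI_INGEST = 1
    REVIEW_INGEST = 2
    AI_SCHEMA_MAP = 3
    REVIEW_SCHEMA_MAP = 4
    AI_MODULE_MAP = 5
    REVIEW_MODULE_MAP = 6


def is_human_review(step: Step) -> bool:
    # Even-numbered steps are the human checkpoints.
    return step.value % 2 == 0


def advance(current: Step, approved: bool = False) -> Optional[Step]:
    """Return the next step, or None once the form is approved.
    The approved flag only matters for human review steps."""
    if is_human_review(current) and not approved:
        return current  # the review gates the following AI step
    if current is Step.REVIEW_MODULE_MAP:
        return None  # final review approved: form approved
    return Step(current.value + 1)
```

The sketch models forward progress only. Re-running an AI step on an already approved form, which the idempotency requirement is meant to make safe, would re-enter the sequence at that step.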

Division of labour

What AI does, what humans must do.

AI handles the high-volume, pattern-matching work — extracting fields, scoring matches, proposing taxonomies. Humans handle judgment calls that compound across the registry: architectural decisions, semantic disambiguation, and anything where a wrong answer would silently corrupt future work.

AI · Heavy lifting

Read and structure each PDF — every field, every signature block, every relationship.
Match form fields to the database schema across thousands of comparisons.
Classify producing sources via a strict taxonomy — no inventing new categories.
Detect duplicates, near-duplicates, and drift across forms already in the registry.
Score every suggestion with explicit confidence and reasoning.
Cluster orphan fields to surface candidate new modules.
Re-run idempotently on the same form as the schema or catalog evolves.

Human · Judgment

Confirm AI's proposals — every checkpoint is a ratification gate.
Resolve drift between conflicting mappings on different forms.
Decide universal versus tenant-specific data classification.
Interpret operational reality — what the form is actually used for.
Approve or reject candidate new modules surfaced by orphan clusters.
Write the ADRs that record catalog and schema changes.
Sit with the funeral home expert and capture the un-formalized steps.

Cross-Form Analysis

Where per-form work compounds.

One form tells you a little. Twenty forms together tell you whether the module catalog is right, whether the schema is right, whether the case-flow sequencing matches operational reality. Analysis runs at checkpoints — not after every form, and not only at the end.

When analysis runs

Default cadence

After every 20–30 newly approved forms accumulate. Pattern strength grows with batch size.

At flow boundaries

When all forms for a case flow (standard burial, direct cremation) reach approved status.

Orphan threshold

When a cluster of unmapped fields sharing a similar candidate-module hypothesis exceeds ~10 fields.

Before major change

Before any catalog or schema change is contemplated, so the change is informed by the current cross-form view.
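
The four triggers above can be expressed as one check over the registry's current state. The sketch below is hedged on several points: the state fields and function name are hypothetical, the default batch size uses the lower bound of the 20–30 range, and the orphan threshold is read as "more than about 10 fields in one cluster", which is one interpretation of the rule.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RegistryState:
    newly_approved_forms: int                                      # approved since the last analysis
    flows_fully_approved: List[str] = field(default_factory=list)  # flows whose forms are all approved
    orphan_cluster_sizes: Dict[str, int] = field(default_factory=dict)  # hypothesis -> unmapped field count
    change_contemplated: bool = False                              # a catalog or schema change is on the table


def analysis_triggers(
    state: RegistryState, batch_size: int = 20, orphan_threshold: int = 10
) -> List[str]:
    """Return the reasons an analysis should run now; empty means keep ingesting."""
    reasons = []
    if state.newly_approved_forms >= batch_size:
        reasons.append("default cadence")
    if state.flows_fully_approved:
        reasons.append("flow boundary")
    if any(size > orphan_threshold for size in state.orphan_cluster_sizes.values()):
        reasons.append("orphan threshold")
    if state.change_contemplated:
        reasons.append("before major change")
    return reasons


# Hypothetical state; the cluster names are invented.
state = RegistryState(
    newly_approved_forms=12,
    flows_fully_approved=["direct cremation"],
    orphan_cluster_sizes={"clergy-coordination": 11, "merchandise": 4},
)
```

For this state the check returns the flow-boundary and orphan-threshold reasons, even though the batch of 12 forms is below the default cadence.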

The six analysis passes

§6.1

Module catalog reconciliation

For each workflow module: do the catalog's declared inputs and outputs match what the form mappings actually imply?

→ analysis-vs-modules.md
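
At its simplest, this reconciliation is a pair of set differences, run once for a module's inputs and once for its outputs. The sketch assumes declared and implied fields are available as sets of field keys; the function name, the result keys, and the example values are hypothetical.

```python
from typing import Dict, List, Set


def reconcile(declared: Set[str], implied: Set[str]) -> Dict[str, List[str]]:
    """Compare what the catalog declares with what the form mappings imply."""
    return {
        "undeclared": sorted(implied - declared),  # implied by forms, missing from the catalog
        "unobserved": sorted(declared - implied),  # declared in the catalog, not yet seen on any form
    }


# Hypothetical outputs for one module; all field keys are invented.
declared_outputs = {"permit_number", "permit_issue_date"}
implied_outputs = {"permit_number", "permit_issuing_county"}
report = reconcile(declared_outputs, implied_outputs)
```

An undeclared entry suggests a catalog refinement. An unobserved entry may only mean the relevant form has not been ingested yet, so it is weaker evidence.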

§6.2

Module catalog gap analysis

Cluster the orphan fields. Multiple forms missing the same producer is strong evidence for a candidate new module.

→ candidate ADR proposals

§6.3

Envelope schema reconciliation

Resolve semantic conflicts in proposed envelope keys across forms. Promote stable patterns; merge subtle duplicates.

→ analysis-vs-envelope.md

§6.4

Sequencing reconciliation

Compare Kearney's actual case-flow sequence to our hypothesized starter templates. Revise the templates where reality diverges.

→ sequencing-<flow>.md

§6.5

Medium analysis

What stays paper, what becomes digital, what gets scanned and attached. Informs gated approval and signature flows.

→ analysis-vs-envelope.md

§6.6

Workflow composition (deferred)

For each case flow: the complete module pipeline, the fields it produces, and the fields requiring manual UI entry. Runs after at least one flow is fully approved.

→ workflow-composition-<flow>.md
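
A composition for one flow could be derived roughly as follows: walk the module pipeline in order, record which module first produces each field, and treat any required field with no producer as manual UI entry. This is a sketch under assumptions: the input shapes, the function name, and the example modules and fields are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def compose_flow(
    pipeline: List[Tuple[str, Set[str]]], required_fields: Set[str]
) -> Dict[str, object]:
    """pipeline is an ordered list of (module_name, output_fields);
    required_fields are the fields the flow's approved forms need."""
    produced: Dict[str, str] = {}
    for module, outputs in pipeline:
        for name in sorted(outputs):
            produced.setdefault(name, module)  # the earliest producer in the pipeline is kept
    manual = sorted(required_fields - set(produced))
    return {
        "pipeline": [module for module, _ in pipeline],
        "produced": produced,
        "manual_ui_entry": manual,
    }


# Hypothetical flow; module and field names are invented.
result = compose_flow(
    pipeline=[
        ("intake", {"decedent_name", "date_of_death"}),
        ("permits", {"permit_number"}),
    ],
    required_fields={"decedent_name", "permit_number", "clergy_name"},
)
```

In this example clergy_name has no producing module, so it lands in the manual-entry list that feeds the design inputs.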

What comes out the other end

Outputs that unblock the rest of the project.

The forms registry is not a documentation exercise. Every output feeds a downstream decision — design work, schema migration, ADR proposals, or the customer-facing workflow configuration.

Workflow configurations

For each case flow, the complete module pipeline plus the list of UI inputs Matt needs to design. This is the customer-facing deliverable — what Kearney's workflow actually looks like.

Candidate ADRs

Catalog refinements, new modules, envelope schema changes, schema migrations. All recorded as architecture decisions, not absorbed silently into code.

Design inputs for Matt

Concrete field lists for intake screens, prep room views, family-facing surfaces. Designed against real data, not assumed data.

An asset for every future tenant

Forms catalogued for Kearney are the starting point for funeral homes #2, #3, and #4. Onboarding each subsequent funeral home gets faster as the shared vocabulary accumulates.

Where this fits

Kearney first. Everyone else benefits.

Kearney is the first tenant, so its forms are the seed. Every form processed adds to what the registry, and the prompts that draw on it, capture about real funeral home data. By tenant #2, we expect most forms to match existing entries and most workflow configurations to be defined already. The registry is the asset that compounds across tenants; Kearney's work is the cost of building it.

This is also the discovery moment for the product itself. The forms may reveal that some "MVP" features should slip to Phase 2 — or that some "Phase 2" features are actually mandatory. That's the exercise working correctly, not failing.