Why the knowledge layer belongs underneath AI, not alongside it

The dominant pattern for putting knowledge into AI systems treats knowledge as something attached to the model. Documents are passed into the context window. Passages are retrieved at inference time. Curated corpora are used to fine-tune. All three approaches improve outputs. None of them change what the model is permitted to assert.

In any healthcare workflow that gets inspected - pharmacovigilance, medical affairs, regulatory affairs, parts of clinical R&D, procurement of regulated devices - that limit is the binding constraint. The compliance question is not “is the output good?” It is “can the output be defended, on the record, when an inspector asks where it came from?” Attached knowledge fails that question. The retrieval was suggestive, not binding. The context was advisory, not architectural. The fine-tuning shifted weights but left no traceable line from any specific output to any specific source. When the audit comes, the response is just a response.

This note is an argument for a different architecture: knowledge engineered as a substrate that lives underneath the model, that the model is bounded by, with provenance recorded at generation time. It is also an argument that placing knowledge underneath is not, by itself, enough. The substrate has to be held to a discipline - what we call the Attic Standard - that makes the artefacts it returns inspectable in their own right. Architecture without discipline is the same problem with extra steps.

01 / What attached knowledge actually does

Retrieval-augmented generation, in its most common shape, fetches passages relevant to a query and inserts them into the model’s context window. The model is then asked to use those passages and to attribute. This works well when the underlying task tolerates ambiguity. It works less well when the task is “produce an output that a regulatory inspector can defend three years from now.”

Three things happen at the seams. First, the retrieval is an approximation - the passages most relevant by embedding similarity are not always the passages that authoritatively bear on the question. Second, the insertion is suggestive - the model can use the passages, ignore them, paraphrase them, or hallucinate around them, and the choice is not externally enforced. Third, the attribution is performative - the model often cites, sometimes cites incorrectly, sometimes invents citations entirely. None of this is a model failure. It is a property of the architecture: knowledge is being asked to influence the model from outside, and influence is not authority.

Fine-tuning does not solve this. A model fine-tuned on a curated corpus has internalised patterns from that corpus. It has not internalised a traceable map from each output to the source that justifies the output. The weights moved. The audit trail did not.

02 / The shape that survives inspection

The EU AI Act, in Article 10, requires that the training, validation and testing data sets used in high-risk AI systems be relevant, sufficiently representative, and to the best extent possible free of errors and complete in view of the intended purpose. Article 12 requires the automatic recording of events (logs) to a degree that enables traceability of the system’s functioning. The FDA’s good machine learning practice guiding principles for AI/ML-enabled medical devices emphasise traceability of training data, locked-down validated states, and transparent documentation of changes.

None of these requirements are satisfied by retrieval at inference time. None are satisfied by fine-tuning that touches the model. What satisfies them is a different shape: knowledge that lives below the model in a structured form, where every element is addressable, every change is recorded, every version is callable, and every inference can point at the specific elements that grounded it.

This is what we mean by a substrate. The substrate sits underneath the model. It is consulted at generation time - through versioned APIs and an MCP server, against a graph that is queryable in its own right. It binds what the model can assert. And every time it is consulted, the consultation is recorded as part of the response. The recording is not a separate logging artefact; it is part of what the system returns. The provenance is the response.
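One way to picture "the provenance is the response" is a return type in which the answer and the record of what was consulted cannot be separated. The following is a minimal Python sketch, not the actual response format: the names ConsultedElement and GroundedResponse, and the field names, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class ConsultedElement:
    """One substrate element consulted at generation time (hypothetical shape)."""
    element_id: str         # address of the element in the substrate
    substrate_version: str  # version of the substrate that was consulted


@dataclass(frozen=True)
class GroundedResponse:
    """The answer and its trace are one object, not an answer plus a side log."""
    answer: str
    trace: tuple[ConsultedElement, ...]

    def to_json(self) -> str:
        # asdict() recurses into the nested dataclasses, so the trace is
        # serialised alongside the answer in a single payload.
        return json.dumps(asdict(self), sort_keys=True)
```

Because the trace is a field of the response rather than a row in a separate log, there is nothing to cross-reference after the fact: whoever holds the response holds the record of the consultation.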

03 / What “underneath” means in implementation

The architectural difference is not about where the knowledge files live. It is about where the constraint is enforced.

Attached knowledge enforces the constraint at the prompt. The model is asked: please use these documents. Please attribute. Please do not invent. The model frequently complies. Sometimes it does not. The compliance regime cannot accept “frequently”.

A substrate enforces the constraint at the API layer. The model is not asked; it is bounded. Calls into the substrate - over REST or GraphQL, against the underlying graph in Cypher, through an MCP server when the consumer is an AI agent, with conformance shapes shipped in SHACL and exports in JSON-LD - return structured, versioned, citable elements. The model composes those elements into an answer. The composition is the response. The composition is recorded as part of the response. There is no path by which the model can assert clinical content that the substrate does not support, because the model is not given the latitude to do so.
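The difference between asking and bounding can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the substrate's real API: the Element type, the compose function and the UnsupportedAssertion exception are hypothetical, and an in-memory dictionary stands in for calls to the substrate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A structured, versioned, citable element (hypothetical shape)."""
    element_id: str
    version: str
    text: str


class UnsupportedAssertion(Exception):
    """Raised when an answer would rely on something the substrate lacks."""


def compose(substrate: dict[str, Element], requested_ids: list[str]) -> dict:
    """Build an answer only from elements the substrate actually returns."""
    missing = [i for i in requested_ids if i not in substrate]
    if missing:
        # The constraint is enforced here, in code, not requested in a prompt.
        raise UnsupportedAssertion(f"no substrate support for: {missing}")
    used = [substrate[i] for i in requested_ids]
    return {
        "answer": " ".join(e.text for e in used),
        "trace": [{"element_id": e.element_id, "version": e.version} for e in used],
    }
```

In this sketch an unsupported assertion is not discouraged; it cannot be returned at all, because the only path to an answer runs through elements the substrate holds.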

This is what “the knowledge layer belongs underneath” actually means, in implementation terms. It is an architectural commitment about authority. The authority is in the substrate. The model composes; it does not authorise. When the inspector asks where an assertion came from, the answer is in the substrate trace - not in the model’s training data, not in the prompt context, not in a parallel logging system that has to be cross-referenced after the fact.

04 / The Attic Standard - what makes the substrate trustworthy

A model bounded by a substrate is only as inspectable as the substrate it is bounded by. A substrate of unattributed terms, undocumented decisions, and uncalibrated confidence is a faster way to produce indefensible outputs at scale. So the architectural commitment - knowledge underneath, model bounded, provenance in the response - is necessary but not sufficient. The substrate itself has to be held to a discipline.

We hold ours to three commitments, borrowed from Athenian philosophy. Together they are what we call the Attic Standard.

Aletheia - disclosure. Every node in the substrate carries its provenance: source ontology, source version, native identifier, import timestamp. Nothing is unattributed. The inspector who asks where a SNOMED CT concept came from gets a complete answer from the substrate itself, not a reconstruction from logs.

Logos - reasoned account. Every curatorial decision is documented. Every cross-map between coding systems carries a rationale. Every exclusion is explained. When an ICD-10 code maps to a SNOMED concept by rule, the rule is in the substrate. When the mapping required curator judgement, the judgement is in the substrate too, marked as such. The inspector can read why, not only what.

Parrhesia - frank speech. We mark what is curated. We mark what is imported. We mark what is uncertain. We do not pass off the one as the other. A high-confidence cross-map and a low-confidence cross-map look different in the response, because they are different. The substrate does not flatter its own coverage.

These three together are how the architecture earns its claim. Without them, a substrate is only a different package for the same opaque outputs. With them, the substrate becomes an artefact an inspector can read directly. The Attic Standard is the difference between a knowledge graph and a knowledge asset.
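The three commitments can be read as constraints on data shapes. The Python sketch below is one possible reading, not the substrate's actual schema: every class, field and enum name is hypothetical, as are the 0-to-1 confidence scale, the 0.8 band threshold and the timezone requirement.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Origin(Enum):
    CURATED = "curated"
    IMPORTED = "imported"


class Basis(Enum):
    RULE = "rule"
    CURATOR_JUDGEMENT = "curator_judgement"


@dataclass(frozen=True)
class Provenance:
    """Aletheia: the four provenance fields, all mandatory."""
    source_ontology: str
    source_version: str
    native_identifier: str
    imported_at: datetime

    def __post_init__(self) -> None:
        for name in ("source_ontology", "source_version", "native_identifier"):
            if not getattr(self, name).strip():
                raise ValueError(f"provenance field {name!r} is empty")
        if self.imported_at.tzinfo is None:
            # Assumption of this sketch: timestamps must be unambiguous.
            raise ValueError("imported_at must be timezone-aware")


@dataclass(frozen=True)
class CrossMap:
    source: Provenance
    target: Provenance
    origin: Origin      # Parrhesia: curated and imported are kept distinct
    basis: Basis        # Logos: by rule, or by curator judgement
    rationale: str      # Logos: the documented reason for the mapping
    confidence: float   # Parrhesia: hypothetical 0..1 scale

    def __post_init__(self) -> None:
        if not self.rationale.strip():
            raise ValueError("a cross-map needs a rationale")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


def render(m: CrossMap, low_below: float = 0.8) -> dict:
    """Surface origin, basis and confidence so unlike maps look unlike."""
    return {
        "source": m.source.native_identifier,
        "target": m.target.native_identifier,
        "origin": m.origin.value,
        "basis": m.basis.value,
        "rationale": m.rationale,
        "confidence": m.confidence,
        "confidence_band": "low" if m.confidence < low_below else "high",
    }
```

The design choice worth noting is that the checks live in the constructors: an unattributed node or an unexplained cross-map cannot be created in the first place, so nothing has to be caught in review later.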

05 / A worked example, in pharmacovigilance

Consider adverse event coding for an Individual Case Safety Report. A patient narrative arrives - “developed a rash three days after starting drug X, also reported headache, no prior allergy history.” A signal-detection AI is asked to map the verbatim terms to MedDRA Lower Level Terms, propose the appropriate Preferred Term aggregation, evaluate seriousness, and flag whether the case warrants expedited reporting under the relevant regulator’s timelines.

Each of those four decisions has to be defensible to an inspector. The inspector’s question, in the language of pharmacovigilance, is not “did the model code it correctly?” It is closer to: “what was the basis for this code, and would the basis still hold today given the current state of the MedDRA hierarchy and the current expedited-reporting criteria?”

If the AI produced the code by consulting a substrate that includes the current MedDRA version, the relationship between LLT, PT, HLT, HLGT, and SOC, the case-specific decision rules that determine seriousness, and the regulator-specific expedited-reporting thresholds - and if the response carries a substrate trace showing which elements at which versions were consulted, with what rationale and with what confidence - then the response carries an inspectable artefact. The disclosure on every node satisfies Aletheia. The rationale on each cross-map satisfies Logos. The confidence markers, surfaced honestly, satisfy Parrhesia. The artefact survives the audit by construction.
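The inspector's second question, whether the basis would still hold today, becomes a mechanical check once the trace records which asset and version each decision rested on. The Python sketch below assumes a hypothetical trace shape. The asset names, version labels and element identifiers in the usage note are placeholders, not real MedDRA releases or codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    """One consulted element behind one coding decision (hypothetical shape)."""
    decision: str    # e.g. "llt_coding", "seriousness", "expedited_reporting"
    asset: str       # e.g. "MedDRA" or a regulator-specific rule set
    version: str     # version of the asset consulted at generation time
    element_id: str  # placeholder identifier of the consulted element


def stale_entries(
    trace: list[TraceEntry], current_versions: dict[str, str]
) -> list[TraceEntry]:
    """Return the entries whose asset version differs from the current one.

    An asset missing from current_versions is also reported, since its
    basis cannot be confirmed.
    """
    return [t for t in trace if current_versions.get(t.asset) != t.version]
```

For example, a trace coded against a placeholder "MedDRA" release "release-A" and checked against current versions listing "release-B" returns the MedDRA-based entries for re-review, while entries whose assets are unchanged pass through untouched.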

If the AI produced the code by retrieving MedDRA passages or by being fine-tuned on prior cases, the response carries no such trace. It carries a model output and a hope that the output reflects the source. The hope does not survive inspection. The miscoding propagates. The miscoding shows up in the PSUR. The PSUR shows up in the inspection. And the original architectural decision - to attach knowledge to the model rather than place it underneath, and to skip the discipline that makes the substrate readable - becomes the operational liability that the inspector finds.

06 / What this looks like as engineered software

A substrate that satisfies the regulatory requirement, that holds at the moment the inference is made, and that ships with the validation lineage defending its own elements has to be built and operated as software. We hold the engineering itself to six properties, each of which is testable and each of which ships with the release.

The substrate is provenanced - every node carries source, version, identifier, and timestamp. It is idempotent - every loader uses merge semantics, so re-running against unchanged sources produces no diff. It is versioned - every asset has a substrate version that consumers pin against, and releases are diffable. It is auditable - every query traces back to a canonical reference; the audit trail is not a parallel system, it is the substrate itself. It is curated - every standardised asset has a curator-of-record and a documented refresh cadence. And it is reproducible - given the same source releases, a third party building against our engineering spec produces the same substrate.
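Two of these properties, idempotence and reproducibility, lend themselves to a short test. The Python sketch below is only an illustration: an in-memory dictionary stands in for the graph, and the function names, the record fields (source, native_id) and the choice of a SHA-256 digest over canonical JSON are assumptions, not the actual loader or release process.

```python
import hashlib
import json


def merge_nodes(graph: dict[str, dict], records: list[dict]) -> int:
    """Upsert records keyed by source and native identifier.

    Returns the number of nodes created or changed, so a re-run against
    unchanged sources returns 0 (no diff).
    """
    changed = 0
    for record in records:
        key = f"{record['source']}:{record['native_id']}"
        if graph.get(key) != record:
            graph[key] = dict(record)  # copy so later caller edits do not leak in
            changed += 1
    return changed


def substrate_digest(graph: dict[str, dict]) -> str:
    """Digest of a canonical serialisation, independent of load order."""
    canonical = json.dumps(graph, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under these assumptions, two builds from the same source records produce the same digest regardless of the order in which the records were loaded, which is the kind of check a third party could run against a published release.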

These six properties are what make the Attic Standard operational rather than philosophical. Aletheia without provenance is a slogan. Logos without versioning is a footnote that drifts. Parrhesia without reproducibility is a claim the substrate cannot back up. The discipline and the engineering are the same commitment, expressed at different layers.

07 / Where this lands

Healthattica ships substrate in two modes. The standardised Fabric is licensed substrate spanning clinical terminologies, regulatory frameworks, procurement taxonomies, device classifications, and jurisdictional metadata - exposed via APIs, an MCP server, a Cypher-queryable graph, JSON-LD exports, and SHACL conformance shapes. Custom builds are bespoke knowledge assets engineered to a specific organisation’s operational use case. Both modes carry the same six engineering properties. Both are held to the same Attic Standard. Both are the answer to the same question.

The question this note opened with - can the output be defended, on the record, when an inspector asks where it came from? - has an architecture-shaped answer and a discipline-shaped answer. The architecture is the substrate underneath. The discipline is the Attic Standard. Either one without the other is not enough.

The questions still open in this programme - about ontological composability, about provenance at generation time, about jurisdiction-aware reasoning, about substrate-grounded evaluation - sit on the research page. The shape of the substrate itself, the surfaces it is callable from, and example queries against it sit on the engineering page. The substrate is how the answer to the inspector’s question becomes yes.