openEHR Finally Explained for Developers Who Just Want to Ship Code

March 12, 2026

openEHR has a reputation problem. The technology is genuinely powerful — it solves real problems around long-term clinical data modeling. But the documentation reads like it was written by committee (because it was), the terminology feels deliberately opaque, and every explanation assumes you already understand the thing being explained.

This post fixes that. If you are a developer, architect, product manager, or CTO trying to understand openEHR well enough to make decisions about it, this guide is meant to be the only one you need to get started. No academic jargon. No specification references. Just the core concepts explained with analogies, diagrams, and real examples, in the order you actually need to learn them.

Visual showing dense technical specification being translated into simple building blocks: Template, Archetype, and Composition

The One-Paragraph Version

openEHR is a way to store clinical data so that the data outlives the software that created it. It does this by separating "what clinical concepts exist" (archetypes) from "how your app captures them" (templates) from "what a specific patient's data actually is" (compositions). This separation means you can swap your entire EHR application and keep every patient record intact, queryable, and meaningful. That is the entire value proposition.

The Five Concepts You Actually Need

openEHR has dozens of concepts in its specification. You need five. Everything else is implementation detail that matters only when you are deep in development. Here they are, in the order you should learn them.

Five-layer pyramid showing the openEHR hierarchy from Reference Model at the base through Archetypes, Templates, Compositions, to EHR at the top

Concept 1: The Reference Model (The Foundation You Never Touch)

Think of the Reference Model as the grammar of a language. English has nouns, verbs, adjectives — these are the building blocks that every sentence uses, but you do not think about grammar when you speak. The openEHR Reference Model defines the building blocks that every piece of clinical data uses.

The Reference Model says things like:

There is a thing called an OBSERVATION — it records something measured or observed about a patient (blood pressure, heart rate, lab result)
There is a thing called an EVALUATION — it records a clinical opinion or assessment (diagnosis, risk assessment, prognosis)
There is a thing called an INSTRUCTION — it records something that should be done (prescribe this medication, order this test)
There is a thing called an ACTION — it records something that was actually done (medication administered, procedure performed)
There are data types like DV_QUANTITY (a number with units, like 120 mmHg), DV_CODED_TEXT (a value from a terminology, like a SNOMED CT code), and DV_DATE_TIME

Why this matters for you: You will never edit the Reference Model. But you need to know it exists because it explains why archetypes are organized the way they are. When you see an archetype called openEHR-EHR-OBSERVATION.blood_pressure.v2, the "OBSERVATION" part comes from the Reference Model — it tells you this archetype records something that was observed, not something that was ordered or done.

The analogy: The Reference Model is like the rules of chess. You need to know them to play, but you do not redesign them for each game. You just play within them.
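
To make the grammar idea concrete, here is a minimal Python sketch of a few Reference Model building blocks. The class names mirror the types named above, but the fields are heavily simplified assumptions for illustration, not the real specification.

```python
from dataclasses import dataclass
from enum import Enum


class EntryType(Enum):
    """The four entry kinds described above."""
    OBSERVATION = "something measured or observed"
    EVALUATION = "a clinical opinion or assessment"
    INSTRUCTION = "something that should be done"
    ACTION = "something that was actually done"


@dataclass(frozen=True)
class DvQuantity:
    """Simplified stand-in for DV_QUANTITY: a number with units."""
    magnitude: float
    units: str


@dataclass(frozen=True)
class DvCodedText:
    """Simplified stand-in for DV_CODED_TEXT: a label plus a terminology code."""
    value: str
    terminology: str
    code: str


systolic = DvQuantity(magnitude=120, units="mm[Hg]")
# The terminology name and code below are placeholders, not real SNOMED CT values.
position = DvCodedText(value="Sitting", terminology="example-terminology", code="example-code")

print(EntryType.OBSERVATION.name, systolic, position.value)
```

The point is only that every clinical value ends up as one of a small, fixed set of shapes. You reuse these shapes; you never define new ones.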

Concept 2: Archetypes (The Universal Clinical Definitions)

An archetype defines a clinical concept completely. Not partially. Not "here are the fields your form needs." Completely — every possible thing that could be recorded about that concept, in any clinical context, in any country, in any specialty.

Tree diagram showing the internal structure of a blood pressure archetype with Data fields (systolic, diastolic), Protocol fields (method, cuff size, location), and State fields (position, exertion)

The blood pressure archetype, for example, defines:

The data you capture: systolic pressure, diastolic pressure, mean arterial pressure, pulse pressure
The context around the measurement: patient position (sitting, standing, lying), exertion level, sleep status
The protocol used: measurement method (auscultation, machine, palpation), cuff size, measurement location (left arm, right arm, thigh)
Valid ranges: systolic must be 0-1000 mmHg (anything outside that is probably a data entry error)
Terminology bindings: each field maps to codes in SNOMED CT, LOINC, or other standard terminologies
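
As a rough illustration of what a valid range means in practice, the sketch below checks a systolic reading against the 0-1000 mmHg range quoted above. The function name and messages are hypothetical; in a real system the CDR performs this validation for you.

```python
SYSTOLIC_RANGE_MMHG = (0, 1000)  # range quoted in the text above


def check_systolic(magnitude: float, units: str) -> list[str]:
    """Return a list of problems; an empty list means the value looks valid."""
    problems = []
    if units != "mm[Hg]":
        problems.append(f"unexpected units: {units!r}")
    low, high = SYSTOLIC_RANGE_MMHG
    if not low <= magnitude <= high:
        problems.append(f"systolic {magnitude} outside {low}-{high} mm[Hg]")
    return problems


print(check_systolic(120, "mm[Hg]"))   # a plausible reading
print(check_systolic(1200, "mm[Hg]"))  # probably a data entry error
```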

Here is the critical insight: archetypes are designed by international clinical consensus, not by software vendors. Clinicians, informaticians, and health IT experts from around the world collaborate on archetype definitions through the Clinical Knowledge Manager (CKM). A blood pressure archetype is the same whether you are building an EHR in Norway, a research platform in Australia, or a telemedicine app in India.

Every archetype has a unique identifier like openEHR-EHR-OBSERVATION.blood_pressure.v2. Read it like this:

openEHR-EHR — this is an openEHR clinical archetype
OBSERVATION — it is an observation (from the Reference Model)
blood_pressure — the clinical concept
v2 — version 2 of this archetype
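
The same breakdown can be sketched in a few lines of Python. This is a simplified parser for identifiers shaped like the one above; the function name and the dictionary keys are my own labels, and real identifiers can carry more detailed version strings than this regular expression handles.

```python
import re

# Matches identifiers shaped like openEHR-EHR-OBSERVATION.blood_pressure.v2
ARCHETYPE_ID = re.compile(
    r"^(?P<publisher>[^-]+)-(?P<package>[^-]+)-(?P<rm_type>[A-Z_]+)"
    r"\.(?P<concept>[A-Za-z0-9_-]+)\.v(?P<version>\d+)$"
)


def parse_archetype_id(archetype_id: str) -> dict:
    """Split an archetype identifier into its named parts."""
    match = ARCHETYPE_ID.match(archetype_id)
    if match is None:
        raise ValueError(f"not a recognised archetype id: {archetype_id!r}")
    parts = match.groupdict()
    parts["version"] = int(parts["version"])
    return parts


print(parse_archetype_id("openEHR-EHR-OBSERVATION.blood_pressure.v2"))
```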

Why this matters for you: In most projects you do not create archetypes. You use the ones that already exist. There are hundreds of internationally reviewed archetypes covering vital signs, lab results, medications, diagnoses, procedures, and more. Your job is to find the right ones for your use case, not to reinvent them.

The analogy: An archetype is like an international building code. It defines every possible safety requirement for a type of structure. You do not need all of them for your specific building, but the standard covers every scenario so that any building inspector anywhere in the world can understand your building.

Visual analogy showing archetypes as cookie cutters (universal shape), templates as recipes (your specific instructions), and compositions as filled lunchboxes (actual patient data)

Concept 3: Templates (Your Local Configuration)

If archetypes define everything that could be recorded, templates define what you actually will record in your specific application.

A blood pressure archetype has 15+ fields. Your GP clinic app probably needs five: systolic, diastolic, measurement location, position, and maybe a comment. An ICU monitoring system might need ten. A research study might need all fifteen.

Before and after showing a full archetype with 15+ fields being constrained by a template to just 5 needed fields, with unused fields grayed out

A template takes one or more archetypes and constrains them for your specific use case:

Hide fields you do not need (remove pulse pressure, mean arterial pressure, tilt, exertion)
Set defaults (position defaults to "sitting" because that is your clinic's standard)
Restrict value sets (location limited to "left arm" and "right arm" only — you never measure on the thigh)
Combine multiple archetypes into one form (blood pressure + heart rate + body temperature = "Vital Signs" template)
Make fields mandatory or optional for your context
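
The constraints listed above can be sketched with plain Python dictionaries. Everything here is an illustrative assumption: real templates are built in a template designer and are not Python dicts, and the field list is only a small subset of the archetype.

```python
# An abbreviated, illustrative subset of the archetype's fields.
# None means "any value"; a list means "one of these values".
ARCHETYPE_FIELDS = {
    "systolic": None,
    "diastolic": None,
    "mean_arterial_pressure": None,
    "pulse_pressure": None,
    "position": ["sitting", "standing", "lying"],
    "location": ["left arm", "right arm", "thigh"],
}

# A hypothetical GP clinic template: hide, default, restrict, require.
TEMPLATE = {
    "include": ["systolic", "diastolic", "position", "location"],
    "defaults": {"position": "sitting"},
    "allowed": {"location": ["left arm", "right arm"]},
    "mandatory": ["systolic", "diastolic"],
}


def apply_template(entered: dict) -> dict:
    """Keep only templated fields, fill defaults, and enforce the constraints."""
    record = {}
    for field in TEMPLATE["include"]:
        value = entered.get(field, TEMPLATE["defaults"].get(field))
        if value is None:
            if field in TEMPLATE["mandatory"]:
                raise ValueError(f"missing mandatory field: {field}")
            continue
        allowed = TEMPLATE["allowed"].get(field, ARCHETYPE_FIELDS[field])
        if allowed is not None and value not in allowed:
            raise ValueError(f"{value!r} is not allowed for {field}")
        record[field] = value
    return record


print(apply_template(
    {"systolic": 120, "diastolic": 80, "location": "left arm", "pulse_pressure": 40}
))
```

Note that the template only narrows what the archetype already allows. It never adds a field the archetype does not define.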

Templates are local to your organization. They are not shared internationally. Hospital A's "Vital Signs" template can be completely different from Hospital B's — but both are built from the same international archetypes. That is the key. The underlying data definitions are universal even when the forms look different.

Why this matters for you: Templates are where your implementation work happens. You choose which archetypes to use, constrain them for your context, and combine them into the documents your clinicians will fill out. Most openEHR tooling (form builders, template designers) operates at the template level.

The analogy: If an archetype is a full restaurant menu, a template is your order. The menu has 200 items. You pick 8 for tonight's dinner. The menu is the same for everyone, but every table orders something different.

Concept 4: Compositions (The Actual Patient Data)

A composition is a single clinical document for a single patient. When a nurse records vital signs, the data that gets saved is a composition. When a doctor writes a discharge summary, that is a composition. When a lab system reports results, each report is a composition.

A composition is structured according to a template. If your "Vital Signs" template includes blood pressure (systolic, diastolic, position) and heart rate, then a vital signs composition for Patient Jane Doe contains the actual values: systolic 120, diastolic 80, position sitting, heart rate 72.

In technical terms, a composition is a JSON (or XML) document that you POST to the openEHR Clinical Data Repository (CDR) via a REST API. Here is a simplified example of what a blood pressure composition looks like:

{
  "archetype_node_id": "openEHR-EHR-COMPOSITION.encounter.v1",
  "name": {
    "value": "Vital Signs"
  },
  "content": [
    {
      "archetype_node_id": "openEHR-EHR-OBSERVATION.blood_pressure.v2",
      "data": {
        "events": [
          {
            "data": {
              "items": [
                {
                  "name": {
                    "value": "Systolic"
                  },
                  "value": {
                    "magnitude": 120,
                    "units": "mm[Hg]"
                  }
                },
                {
                  "name": {
                    "value": "Diastolic"
                  },
                  "value": {
                    "magnitude": 80,
                    "units": "mm[Hg]"
                  }
                }
              ]
            },
            "state": {
              "items": [
                {
                  "name": {
                    "value": "Position"
                  },
                  "value": {
                    "value": "Sitting"
                  }
                }
              ]
            }
          }
        ]
      }
    }
  ]
}

Yes, it is verbose. That is intentional. Every piece of data carries its own semantic context. You do not need an external schema to interpret it — the composition is self-describing. This is why openEHR data survives application changes: the meaning is embedded in the data, not in the application code.
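
To show what self-describing buys you, here is a small Python sketch that pulls named values out of a composition without knowing anything about the form that produced it. The helper name is hypothetical and the composition is trimmed to the bare minimum. Matching on display names is a shortcut for readability; real code would match on archetype node identifiers.

```python
composition = {
    "name": {"value": "Vital Signs"},
    "content": [
        {
            "archetype_node_id": "openEHR-EHR-OBSERVATION.blood_pressure.v2",
            "data": {
                "events": [
                    {
                        "data": {
                            "items": [
                                {"name": {"value": "Systolic"},
                                 "value": {"magnitude": 120, "units": "mm[Hg]"}},
                                {"name": {"value": "Diastolic"},
                                 "value": {"magnitude": 80, "units": "mm[Hg]"}},
                            ]
                        }
                    }
                ]
            },
        }
    ],
}


def iter_elements(node):
    """Yield (name, value) for every node that carries both a name and a value."""
    if isinstance(node, dict):
        if "name" in node and "value" in node:
            yield node["name"]["value"], node["value"]
        for child in node.values():
            yield from iter_elements(child)
    elif isinstance(node, list):
        for child in node:
            yield from iter_elements(child)


readings = dict(iter_elements(composition))
print(readings["Systolic"]["magnitude"], readings["Systolic"]["units"])
```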

Why this matters for you: Compositions are what your application creates and what your CDR stores. Your REST API calls will be creating and querying compositions. Every composition is automatically versioned: edit a blood pressure reading and the CDR keeps both the original and the correction, with a full audit trail.

The analogy: If the archetype is the blank form design, and the template is which form you chose to use, the composition is the filled-in form in the patient's file. It is the actual data.
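
The versioning behaviour is easy to picture with a toy in-memory model. This is only a sketch of the idea: the class and method names are hypothetical, and a real CDR does all of this on the server side.

```python
import copy


class VersionedComposition:
    """Toy model: every commit appends a version and nothing is overwritten."""

    def __init__(self, first_version: dict, committer: str):
        self._versions = []
        self.commit(first_version, committer)

    def commit(self, content: dict, committer: str) -> int:
        """Store a new version and return its version number."""
        self._versions.append({
            "version": len(self._versions) + 1,
            "committer": committer,
            "content": copy.deepcopy(content),
        })
        return len(self._versions)

    def latest(self) -> dict:
        return self._versions[-1]["content"]

    def at_version(self, number: int) -> dict:
        return self._versions[number - 1]["content"]

    def audit_trail(self) -> list:
        return [(v["version"], v["committer"]) for v in self._versions]


record = VersionedComposition({"systolic": 210, "diastolic": 80}, committer="nurse_a")
record.commit({"systolic": 120, "diastolic": 80}, committer="nurse_a")  # fix a typo

print(record.at_version(1)["systolic"], record.latest()["systolic"])
print(record.audit_trail())
```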

Concept 5: AQL (How You Get Data Back Out)

AQL — Archetype Query Language — is how you query data from an openEHR CDR. It looks like SQL if SQL understood clinical concepts instead of tables and columns.

Annotated AQL query with each segment labeled in plain English: path navigation, data selection, and source specification

Here is a real AQL query that retrieves blood pressure readings recorded after January 1, 2025:

SELECT
  e/ehr_id/value as patient_id,
  o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude as systolic,
  o/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/magnitude as diastolic
FROM
  EHR e CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.blood_pressure.v2]
WHERE
  o/data[at0001]/events[at0006]/time/value > '2025-01-01'

This looks intimidating. Let us break it down piece by piece.

The FROM clause says: "Look in all EHR records that contain a blood pressure observation." This is like SQL's FROM patients JOIN observations, but it uses the archetype identifier instead of table names. The huge advantage: this query works on any openEHR CDR in the world, regardless of how the database is structured internally.

The SELECT paths navigate through the archetype structure using at-codes. Those cryptic at0001, at0004 codes? They are just node identifiers within the archetype — think of them as stable addresses. at0004 always means "systolic" in the blood pressure archetype, everywhere, forever. You do not need to memorize them — your archetype tool shows you which code maps to which concept.

Reading the path o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude:

o/ — start from the observation
data[at0001] — go into the data section
events[at0006] — pick the measurement event (a single reading)
data[at0003] — get the actual measurement data
items[at0004] — grab the systolic value specifically
value/magnitude — extract the numeric value
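
If it helps to see that breakdown mechanically, the sketch below splits a path into its steps. It is a deliberately simple parser with a hypothetical function name; it assumes each segment is an attribute name with an optional at-code in brackets, which is true of the paths in this post but not of every AQL path.

```python
import re

# One path segment: an attribute name, optionally followed by [node_id].
SEGMENT = re.compile(r"^(?P<attribute>[a-z_]+)(\[(?P<node_id>[^\]]+)\])?$")


def split_path(path: str):
    """Return the query variable and a list of (attribute, node_id) steps."""
    variable, *segments = path.split("/")
    steps = []
    for segment in segments:
        match = SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"cannot parse segment: {segment!r}")
        steps.append((match["attribute"], match["node_id"]))
    return variable, steps


variable, steps = split_path(
    "o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude"
)
print(variable)
for attribute, node_id in steps:
    print(attribute, node_id)
```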

Why this matters for you: AQL is what makes openEHR data useful. Without it, you have a repository of self-describing documents. With it, you can run analytics, build dashboards, feed ML pipelines, and generate reports — all using archetype-based paths that work across any openEHR implementation.

The analogy: AQL is like asking a librarian for books. In SQL, you say "go to shelf 3, row 7, position 12" (physical location). In AQL, you say "find me all mystery novels by Agatha Christie published after 1950" (semantic meaning). The library can reorganize its shelves and your query still works.
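
Here is a minimal Python sketch of sending that query to a CDR using only the standard library. The base URL is an assumption modeled on a local EHRbase install, so check your CDR's documentation for the real one. The sample result is made up to show the general columns-and-rows shape of a response. The request is built but not sent, so the sketch runs without a CDR.

```python
import json
import urllib.request

# Assumed base URL for a local CDR; adjust for your installation.
BASE_URL = "http://localhost:8080/ehrbase/rest/openehr/v1"

AQL = """
SELECT
  e/ehr_id/value as patient_id,
  o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude as systolic,
  o/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/magnitude as diastolic
FROM
  EHR e CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.blood_pressure.v2]
WHERE
  o/data[at0001]/events[at0006]/time/value > '2025-01-01'
"""


def build_query_request(aql: str) -> urllib.request.Request:
    """Wrap the AQL text in a JSON body and prepare a POST to the query endpoint."""
    body = json.dumps({"q": aql}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/query/aql",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def rows_as_dicts(result: dict) -> list:
    """Turn a columns-plus-rows result into one dictionary per row."""
    names = [column["name"] for column in result["columns"]]
    return [dict(zip(names, row)) for row in result["rows"]]


request = build_query_request(AQL)
# To actually run it against a CDR:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)

# Hypothetical result, for illustration only.
sample_result = {
    "columns": [{"name": "patient_id"}, {"name": "systolic"}, {"name": "diastolic"}],
    "rows": [["example-ehr-id", 120, 80]],
}
print(rows_as_dicts(sample_result))
```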

How It All Fits Together

Here is the complete picture of how data flows through an openEHR system:

Seven-step flowchart showing data flow from clinician form entry through composition creation, REST API, CDR validation against templates and archetypes, to storage and AQL querying

1. A clinician fills out a form in your application. The form was built from a template.
2. Your application creates a composition: a JSON document containing the patient's data, structured according to the template.
3. The composition is sent to the CDR via a standard REST API (POST /ehr/{ehr_id}/composition).
4. The CDR validates the composition against the template to ensure all required fields are present and all values are within allowed ranges.
5. The template references the underlying archetypes, which provide the full semantic definitions and terminology bindings.
6. The data is stored in the CDR, fully versioned and audited. Every change creates a new version while preserving the original.
7. Anyone can query the data using AQL at any time, regardless of which application created it or which template was used, because AQL queries against the archetype definitions, not the templates.

Step 7 is the magic. Data entered through Template A and data entered through Template B can both be queried with the same AQL query, because both templates are built from the same archetypes. The blood pressure systolic value is at the same archetype path regardless of which form captured it.
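
A small Python sketch makes the point. Two compositions come from two different (hypothetical) templates, yet one lookup keyed on the archetype identifier and the at0004 node finds systolic in both. The composition structure is heavily simplified and the template names are invented.

```python
BP_ARCHETYPE = "openEHR-EHR-OBSERVATION.blood_pressure.v2"


def element(at_code: str, magnitude: float) -> dict:
    """A simplified data element identified by its at-code."""
    return {"archetype_node_id": at_code,
            "value": {"magnitude": magnitude, "units": "mm[Hg]"}}


def bp_composition(template_id: str, items: list) -> dict:
    """A simplified composition; template_id is only a label here."""
    return {
        "template_id": template_id,
        "content": [
            {"archetype_node_id": BP_ARCHETYPE,
             "data": {"events": [{"data": {"items": items}}]}}
        ],
    }


def systolic_of(composition: dict):
    """Follow content -> data -> events -> data -> items and pick node at0004."""
    for entry in composition["content"]:
        if entry["archetype_node_id"] != BP_ARCHETYPE:
            continue
        for event in entry["data"]["events"]:
            for item in event["data"]["items"]:
                if item["archetype_node_id"] == "at0004":
                    return item["value"]["magnitude"]
    return None


# Different templates, different field order, same archetype underneath.
gp_record = bp_composition("gp_vitals", [element("at0004", 120), element("at0005", 80)])
icu_record = bp_composition("icu_monitoring", [element("at0005", 70), element("at0004", 135)])

print([systolic_of(c) for c in (gp_record, icu_record)])
```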

The Three Things That Confuse Everyone

Confusion 1: "Why are there archetypes AND templates? Why not just one?"

Because they serve different audiences and change at different speeds. Archetypes are designed by international clinical consensus — they change slowly (years between major versions) and are the same worldwide. Templates are designed by local implementers — they change frequently as forms are updated, new departments are added, and requirements evolve.

If you put local customization in the archetype, every hospital in the world would need their own version of "blood pressure." If you put the full clinical model in templates, every hospital would need to independently define what "systolic" means. The split gives you global consistency (archetypes) with local flexibility (templates).

Confusion 2: "What do at0001, at0004, etc. actually mean?"

They are just stable identifiers — think of them as permanent addresses within an archetype. at0004 in the blood pressure archetype will always mean "systolic." The number itself is meaningless — it was assigned sequentially when the archetype was first designed. What matters is that it never changes, so any data recorded using at0004 in 2020 means the same thing as data recorded using at0004 in 2030.

You do not need to memorize at-codes. Every archetype tool shows you the human-readable name next to the code. In practice, you look up the path once, paste it into your AQL query, and forget about it.
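
If you want your queries to be readable in code review, a tiny lookup table does the job. The labels below are the informal descriptions used in this post, not the official node names from the archetype, and the function name is hypothetical.

```python
import re

# Informal labels for the at-codes used in this post's blood pressure paths.
AT_CODES = {
    "at0001": "data section",
    "at0006": "measurement event",
    "at0003": "measurement data",
    "at0004": "systolic",
    "at0005": "diastolic",
}


def annotate(path: str) -> str:
    """Replace each known [atNNNN] with its label; unknown codes are left as-is."""
    return re.sub(
        r"\[(at\d+)\]",
        lambda match: f"[{AT_CODES.get(match.group(1), match.group(1))}]",
        path,
    )


print(annotate("o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude"))
```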

Confusion 3: "How is this different from FHIR?"

FHIR is designed for data exchange between systems. openEHR is designed for data storage and long-term clinical data management. They solve different problems and work well together.

A useful mental model: FHIR is like HTTP — a protocol for moving data between systems. openEHR is like a database schema — a structure for storing and querying data permanently. You might use FHIR to exchange data with external systems while using openEHR as your internal clinical data repository. Many production systems do exactly this.

The key technical difference: FHIR resources have a fixed structure defined by HL7 (with extension mechanisms for customization). openEHR archetypes have a clinically-defined structure that is separated from the underlying data model. This makes openEHR more flexible for deep clinical modeling but more complex to get started with. FHIR is easier to start but harder to extend cleanly. Neither is "better" — they are designed for different roles in a health data architecture.
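
To make the "work well together" claim concrete, here is a sketch of mapping a stored blood pressure reading to a FHIR Observation for exchange. It is a minimal mapping under stated assumptions: the function name and the patient reference are hypothetical, and a complete vital-signs Observation carries more fields (category and effective time, for example) than are shown here.

```python
def to_fhir_blood_pressure(patient_ref: str, systolic: float, diastolic: float) -> dict:
    """Build a minimal FHIR Observation with systolic and diastolic components."""

    def component(loinc_code: str, display: str, magnitude: float) -> dict:
        return {
            "code": {"coding": [
                {"system": "http://loinc.org", "code": loinc_code, "display": display}
            ]},
            "valueQuantity": {
                "value": magnitude,
                "unit": "mmHg",
                "system": "http://unitsofmeasure.org",
                "code": "mm[Hg]",
            },
        }

    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [
            {"system": "http://loinc.org", "code": "85354-9",
             "display": "Blood pressure panel"}
        ]},
        "subject": {"reference": patient_ref},
        "component": [
            component("8480-6", "Systolic blood pressure", systolic),
            component("8462-4", "Diastolic blood pressure", diastolic),
        ],
    }


observation = to_fhir_blood_pressure("Patient/example", systolic=120, diastolic=80)
print(observation["component"][0]["valueQuantity"])
```

The values come from openEHR storage and the envelope is FHIR, which matches the split described above: one model for keeping data, another for moving it.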

The Minimum You Need to Start Building

If you want to build something on openEHR today, here is your minimum viable knowledge:

Get a CDR running. EHRbase (open-source) with Docker: docker compose up and you have a CDR at localhost:8080. Or use a cloud CDR from Better or another vendor.
Find your archetypes. Go to the Clinical Knowledge Manager (CKM) at ckm.openehr.org. Search for the clinical concepts you need. Download the archetypes.
Build a template. Use a template designer to select your archetypes, exclude the fields you do not need, and combine them into a clinical document template. Upload the template to your CDR.
POST a composition. Build a JSON document structured according to your template and POST it to the CDR's REST API. The CDR validates it and stores it.
Query with AQL. Write AQL queries using archetype paths to retrieve data. The CDR returns typed, structured results.

That is the complete workflow. Everything else — terminology binding, versioning, access control, FHIR integration — is important but can be learned incrementally as your implementation matures.
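
As a starting point for the "POST a composition" step, here is a minimal standard-library sketch. The base URL assumes a local EHRbase-style install, the EHR identifier is a placeholder, and the composition is a stub rather than a valid document, so treat every value as something to replace. The request is built but not sent.

```python
import json
import urllib.request

# Assumed base URL for a local CDR; adjust for your installation.
BASE_URL = "http://localhost:8080/ehrbase/rest/openehr/v1"
# Placeholder; use the identifier your CDR returned when the EHR was created.
EHR_ID = "00000000-0000-0000-0000-000000000000"


def build_composition_request(ehr_id: str, composition: dict) -> urllib.request.Request:
    """Prepare a POST of a JSON composition to the EHR's composition endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/ehr/{ehr_id}/composition",
        data=json.dumps(composition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Stub only: a real composition must follow the template you uploaded.
composition = {"name": {"value": "Vital Signs"}, "content": []}

request = build_composition_request(EHR_ID, composition)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```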

Why This Matters Beyond the Technology

openEHR is not just a technical standard. It is a bet on a specific future: one where patient data belongs to the patient and the healthcare system, not to the software vendor.

When a hospital switches from Epic to Cerner today, it faces a multi-year data migration project. Decades of clinical records must be extracted, mapped, transformed, and loaded into a new proprietary format. Clinical nuances can be lost, audit trails can be broken, and the costs can be enormous.

With openEHR, the data is stored in a vendor-neutral, archetype-based format. Switching CDR vendors means changing the storage engine while keeping every patient record, every clinical composition, every audit trail intact. The data was never in a proprietary format to begin with.

That is the boring but useful truth of openEHR. It is not exciting technology. It is infrastructure — the kind that matters most when you need to change everything else around it.