Why You Need a FHIR Go-Live Checklist
Launching a FHIR server into production without a comprehensive checklist is like deploying a banking API without security testing. Healthcare data is regulated, clinical workflows depend on it, and interoperability means your server will interact with systems you have never tested against. Every missed item on this list is a potential compliance finding, production outage, or data integrity issue.
This checklist covers 47 items across 8 categories, organized by priority. Critical items will block certification or cause data loss. Important items affect performance and user experience. Nice-to-have items improve operational maturity. Use this as your go-live gate—do not launch until every critical item is verified.
Category 1: Server Setup (Items 1–6)
| # | Item | Priority | Verification |
|---|---|---|---|
| 1 | CapabilityStatement published at /metadata | Critical | GET /metadata returns valid JSON with all supported resources and operations |
| 2 | Base URL is stable and versioned | Critical | URL will not change; clients can hardcode it |
| 3 | TLS 1.2+ enforced on all endpoints | Critical | Non-TLS connections rejected; certificate valid and not self-signed |
| 4 | CORS configured for browser-based SMART apps | Important | Allowed origins, methods, and headers correctly specified |
| 5 | Content-Type negotiation works | Important | Server returns JSON for application/fhir+json, XML for application/fhir+xml |
| 6 | Server returns FHIR version in response headers | Nice-to-have | Content-Type carries the fhirVersion MIME parameter (e.g. fhirVersion=4.0), consistent with fhirVersion 4.0.1 (R4) in the CapabilityStatement |
The CapabilityStatement is your server's contract with the world. Every client will read it to discover what your server supports. Verify that it accurately reflects your implementation—do not list resources or operations you have not actually implemented. Inferno tests will catch discrepancies.
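One way to keep the CapabilityStatement honest is to compare it against the list of resource types you actually implement. The following is a minimal sketch, assuming the statement has already been fetched from /metadata and parsed into a dict; the function name `capability_gaps` and the `expected_types` input are illustrative, not part of any FHIR library.

```python
from typing import Dict, List, Set


def capability_gaps(capability: Dict, expected_types: Set[str]) -> List[str]:
    """Return human-readable problems found in a parsed CapabilityStatement."""
    problems: List[str] = []
    if capability.get("resourceType") != "CapabilityStatement":
        problems.append("resourceType is not CapabilityStatement")
    if capability.get("fhirVersion") != "4.0.1":
        problems.append(f"unexpected fhirVersion: {capability.get('fhirVersion')!r}")

    # Collect every resource type advertised under rest[].resource[].type.
    advertised = {
        resource["type"]
        for rest in capability.get("rest", [])
        for resource in rest.get("resource", [])
        if resource.get("type")
    }
    for missing in sorted(expected_types - advertised):
        problems.append(f"implemented but not advertised: {missing}")
    for extra in sorted(advertised - expected_types):
        problems.append(f"advertised but not implemented: {extra}")
    return problems
```

An empty result means the advertised resource types match the expected set. It does not check operations or search parameters, which Inferno covers far more thoroughly.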
Category 2: Authentication and Authorization (Items 7–14)
| # | Item | Priority | Verification |
|---|---|---|---|
| 7 | SMART on FHIR discovery at /.well-known/smart-configuration | Critical | Returns valid JSON with authorization and token endpoints |
| 8 | OAuth 2.0 authorization code flow works | Critical | Patient-facing apps can complete the full auth flow |
| 9 | SMART scopes enforced | Critical | Token with patient/Observation.read cannot access Conditions |
| 10 | Token expiration and refresh implemented | Critical | Expired tokens return 401; refresh tokens issue new access tokens |
| 11 | Backend services auth (client credentials) works | Important | System-to-system clients can authenticate with signed JWT assertion |
| 12 | Patient context in token enforced | Critical | Patient A's token cannot access Patient B's data |
| 13 | PKCE support for public clients | Important | Authorization server validates code_verifier against code_challenge |
| 14 | Token introspection or JWT validation on every request | Critical | No request bypasses authentication; invalid tokens return 401 |
Authentication failures are the most common source of Inferno test failures. Test every scope combination, verify that patient compartment isolation works, and confirm that expired tokens are properly rejected. For certified health IT, the ONC API certification criterion (45 CFR 170.315(g)(10)) requires SMART App Launch support for patient-facing APIs.
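The scope check in item 9 can be sketched as a small pure function. This is an illustration only, assuming SMART v1-style scopes of the form `patient/Observation.read`; it does not handle the SMART v2 `.cruds` syntax, and the function name `scope_permits` is hypothetical.

```python
from typing import Iterable


def scope_permits(granted_scopes: Iterable[str], resource_type: str, action: str) -> bool:
    """Return True if any SMART v1-style scope grants `action` on `resource_type`.

    `action` is "read" or "write". Wildcards ("*") are honored on both sides.
    """
    for scope in granted_scopes:
        context, _, rest = scope.partition("/")
        if context not in ("patient", "user", "system"):
            continue  # skips scopes such as "openid" or "launch/patient"
        scoped_type, _, permission = rest.partition(".")
        if scoped_type in (resource_type, "*") and permission in (action, "*"):
            return True
    return False
```

A token holding only `patient/Observation.read` yields `False` for `("Condition", "read")`, which is the behavior item 9 asks you to verify. Scope checks do not replace patient-context checks (item 12); both must pass on every request.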
Category 3: Resource Conformance (Items 15–22)
| # | Item | Priority | Verification |
|---|---|---|---|
| 15 | US Core profiles conformant | Critical | All required US Core profiles pass Inferno validation |
| 16 | Required elements populated | Critical | Must-support elements populated whenever the data exists; required (minimum cardinality 1) elements always present |
| 17 | CodeableConcept bindings correct | Critical | Coded elements use correct code systems (SNOMED, LOINC, RxNorm) |
| 18 | Reference integrity maintained | Critical | All references resolve to existing resources; no broken links |
| 19 | Patient resource complete | Critical | Name, gender, birthDate, identifier present per US Core Patient |
| 20 | USCDI v3 data classes supported | Important | All required USCDI data classes have corresponding resource support |
| 21 | Extensions properly namespaced | Important | Custom extensions use your organization's URL namespace, not FHIR core |
| 22 | Narrative generation works | Nice-to-have | Resources include human-readable text.div for display |
Run the Inferno US Core test suite early—ideally as part of your CI/CD pipeline. Each test failure identifies a specific conformance gap. Fixing these gaps iteratively is faster than trying to pass all tests at once.
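Between Inferno runs, a few cheap local checks can catch obvious gaps early. The sketch below covers items 18 and 19 only, assuming resources are available as parsed dicts; it is not a profile validator, and the helper names are illustrative.

```python
from typing import Dict, Iterable, Iterator, List

# Elements named in checklist item 19.
PATIENT_ELEMENTS = ("identifier", "name", "gender", "birthDate")


def missing_patient_elements(patient: Dict) -> List[str]:
    """Return the item-19 elements that are absent or empty on a Patient dict."""
    return [name for name in PATIENT_ELEMENTS if not patient.get(name)]


def _literal_references(node) -> Iterator[str]:
    """Yield every string stored under a "reference" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "reference" and isinstance(value, str):
                yield value
            else:
                yield from _literal_references(value)
    elif isinstance(node, list):
        for item in node:
            yield from _literal_references(item)


def dangling_references(resources: Iterable[Dict]) -> List[str]:
    """Return relative references (Type/id) that match no resource in `resources`."""
    resources = list(resources)
    known = {f"{r['resourceType']}/{r['id']}" for r in resources if "id" in r}
    found = set()
    for resource in resources:
        for ref in _literal_references(resource):
            # Skip contained (#id), urn:uuid, and absolute URL references.
            if ref.startswith("#") or ":" in ref:
                continue
            if ref not in known:
                found.add(ref)
    return sorted(found)
```

Version-specific references such as `Patient/p1/_history/2` would be reported as dangling by this simplified check, so treat its output as a list of candidates to investigate rather than confirmed defects.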
Category 4: Search (Items 23–30)
| # | Item | Priority | Verification |
|---|---|---|---|
| 23 | Required search parameters implemented | Critical | US Core specifies required search params per resource type |
| 24 | _include and _revinclude work | Important | Observation?_include=Observation:patient returns Patient resources |
| 25 | Pagination with _count and next links | Critical | Bundle.link with relation=next works for large result sets |
| 26 | _total parameter supported | Important | Returns total count in Bundle.total for search results |
| 27 | Date search works correctly | Critical | Date comparators (gt, lt, ge, le) work; timezone handling correct |
| 28 | Token search (code systems) works | Critical | Search by system\|code returns correct results |
| 29 | Chained search parameters work | Important | Observation?patient.name=Smith returns correct results |
| 30 | Search returns OperationOutcome for invalid params | Important | Invalid search parameters return 400 with descriptive OperationOutcome |
Search is where most real-world integrations break. Test with realistic data volumes—searching through 10 records works differently than searching through 100,000. Verify that pagination remains consistent even when new data arrives between page requests. Building these search capabilities correctly is essential for healthcare software that will interact with external systems.
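A pagination test can be sketched as a walker that follows `Bundle.link` entries with relation `next` until none remain. The `fetch` callable is an assumption: it stands in for whatever HTTP client you use and must return the parsed Bundle for a URL.

```python
from typing import Callable, Dict, Iterator, Optional


def next_link(bundle: Dict) -> Optional[str]:
    """Return the URL of the link with relation "next", or None on the last page."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None


def iter_entries(first_url: str, fetch: Callable[[str], Dict], max_pages: int = 1000) -> Iterator[Dict]:
    """Yield every entry.resource across pages, following next links."""
    url: Optional[str] = first_url
    seen = set()
    while url is not None:
        # Guard against servers that loop or never stop paging.
        if url in seen or len(seen) >= max_pages:
            raise RuntimeError(f"pagination did not terminate at {url}")
        seen.add(url)
        bundle = fetch(url)
        for entry in bundle.get("entry", []):
            yield entry["resource"]
        url = next_link(bundle)
```

To test consistency, collect the `id` of every yielded resource and compare the length of the list with the length of its set; duplicates suggest that results shifted between page requests.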
Testing Search at Scale
Search performance is where development environments mislead you the most. A search that returns in 50ms against 100 patients can take 5 seconds against 100,000 patients if your indexes are not properly configured. Before go-live, load your FHIR server with realistic production data volumes and run the following search performance tests:
- Patient search by name: Must return in under 500ms for the most common names in your population.
- Observation search by patient + date range: Must return in under 1 second for patients with 5+ years of lab history.
- Condition search with _include: Must handle the include expansion without timing out.
- Pagination through large result sets: Page 10 of 50 pages should take the same time as page 1.
Index the search parameters that US Core specifies as required. For PostgreSQL-backed FHIR servers, ensure that composite indexes exist for the most common search parameter combinations (e.g., patient + date + code for Observation).
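The latency targets listed above can be turned into a repeatable check. This is a rough sketch, assuming each search is wrapped in a zero-argument callable that performs the request; the names `BUDGET_MS` and `over_budget` are illustrative, and the `clock` parameter exists only so the harness can be tested without real delays.

```python
import time
from typing import Callable, Dict, List

# Budgets in milliseconds, taken from the targets listed above.
BUDGET_MS = {"patient_by_name": 500, "observation_by_patient_and_date": 1000}


def over_budget(
    searches: Dict[str, Callable[[], object]],
    budgets_ms: Dict[str, float],
    runs: int = 5,
    clock: Callable[[], float] = time.perf_counter,
) -> List[str]:
    """Run each search `runs` times; return names whose slowest run exceeded its budget."""
    failures = []
    for name, run_search in searches.items():
        worst_ms = 0.0
        for _ in range(runs):
            start = clock()
            run_search()
            worst_ms = max(worst_ms, (clock() - start) * 1000)
        if worst_ms > budgets_ms[name]:
            failures.append(name)
    return failures
```

Judging by the slowest run rather than the average is a deliberately strict choice; with a production-sized dataset it surfaces cold-cache and missing-index behavior that an average would hide.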
Category 5: Operations (Items 31–36)
| # | Item | Priority | Verification |
|---|---|---|---|
| 31 | $everything operation works for Patient | Important | Returns all resources in patient compartment |
| 32 | $export (Bulk Data) works | Critical (if required) | Asynchronous export produces valid NDJSON files |
| 33 | $member-match works (for payers) | Critical (if required) | Correctly identifies members across payer systems |
| 34 | $validate works | Nice-to-have | Validates resources against profiles and returns issues |
| 35 | Transaction bundles supported | Important | POST Bundle with type=transaction processes atomically |
| 36 | Batch bundles supported | Important | POST Bundle with type=batch processes independently |
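
For item 32, the exported files themselves are worth checking. The sketch below assumes one NDJSON file has been downloaded into a string and that, as in the Bulk Data specification, each file holds resources of a single type; `ndjson_problems` is a hypothetical helper name.

```python
import json
from typing import List


def ndjson_problems(text: str, expected_type: str) -> List[str]:
    """Check one NDJSON file: one JSON resource per line, all of `expected_type`."""
    problems = []
    # Split on "\n" only; str.splitlines() would also split on characters
    # that may legally appear inside JSON strings.
    for number, line in enumerate(text.split("\n"), start=1):
        line = line.rstrip("\r")
        if not line.strip():
            continue  # lenient: tolerate blank lines such as a trailing newline
        try:
            resource = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append(f"line {number}: invalid JSON ({error.msg})")
            continue
        if not isinstance(resource, dict) or resource.get("resourceType") != expected_type:
            problems.append(f"line {number}: expected resourceType {expected_type}")
    return problems
```

An empty list means every non-blank line parsed and carried the expected `resourceType`. Profile conformance of each resource still needs a real validator.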
Category 6: Error Handling (Items 37–41)
| # | Item | Priority | Verification |
|---|---|---|---|
| 37 | OperationOutcome returned for all errors | Critical | Every 4xx and 5xx response includes a FHIR OperationOutcome |
| 38 | HTTP status codes are correct | Critical | 404 for not found, 401 for missing or invalid credentials, 403 for insufficient permissions, 422 for validation failure |
| 39 | Retry-After header on 429 responses | Important | Rate-limited responses include when to retry |
| 40 | Graceful handling of malformed requests | Important | Invalid JSON/XML returns 400, not 500 |
| 41 | Error messages do not leak PHI | Critical | Error responses never include patient data in messages or stack traces |
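
Item 37 lends itself to an automated assertion in integration tests. The following sketch assumes the response status and parsed JSON body are already in hand; the function name is illustrative. It checks only the elements FHIR requires on each issue (`severity` and `code`).

```python
from typing import Dict, List, Optional

VALID_SEVERITIES = ("fatal", "error", "warning", "information")


def error_response_problems(status: int, body: Optional[Dict]) -> List[str]:
    """Check that a 4xx/5xx response body is a usable OperationOutcome."""
    if status < 400:
        return []
    if not isinstance(body, dict) or body.get("resourceType") != "OperationOutcome":
        return [f"HTTP {status} without an OperationOutcome body"]
    issues = body.get("issue") or []
    if not issues:
        return ["OperationOutcome has no issue entries"]
    problems = []
    for index, issue in enumerate(issues):
        if issue.get("severity") not in VALID_SEVERITIES:
            problems.append(f"issue[{index}] has invalid severity")
        if not issue.get("code"):
            problems.append(f"issue[{index}] is missing code")
    return problems
```

Running this against deliberately broken requests (bad JSON, unknown resource IDs, missing tokens) gives quick coverage of items 37 and 40. Checking for PHI leakage (item 41) still needs a manual or pattern-based review of `issue.diagnostics`.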
Category 7: Monitoring (Items 42–45)
| # | Item | Priority | Verification |
|---|---|---|---|
| 42 | Request latency tracked (p50, p95, p99) | Important | Dashboards show latency distribution; alerts on p99 > threshold |
| 43 | Error rate monitored | Critical | 5xx error rate alerts when exceeding 0.1% |
| 44 | Uptime SLO defined and measured | Important | 99.9% uptime target with automated monitoring |
| 45 | Audit logging enabled (FHIR AuditEvent) | Critical | Every data access logged with who, what, when, where |
Monitoring is not optional for healthcare APIs. The HIPAA Security Rule requires audit controls that record and examine activity in systems containing electronic protected health information. Build audit logging from day one, not as an afterthought. Use FHIR AuditEvent resources for structured logging that can be queried and analyzed. Track latency closely: if your clinical decision support systems depend on your FHIR server, slow responses directly impact clinical workflows.
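To make "who, what, when, where" concrete, here is a minimal sketch of an R4 AuditEvent for a successful read, built as a plain dict. The function name and its parameters are assumptions for illustration; a production server would typically add more detail (for example, the query string for searches and the patient as a separate entity).

```python
from datetime import datetime, timezone
from typing import Dict


def read_audit_event(user_ref: str, resource_ref: str, client_ip: str, observer_ref: str) -> Dict:
    """Build a minimal R4 AuditEvent dict for a successful RESTful read."""
    return {
        "resourceType": "AuditEvent",
        "type": {
            "system": "http://terminology.hl7.org/CodeSystem/audit-event-type",
            "code": "rest",
        },
        "subtype": [{"system": "http://hl7.org/fhir/restful-interaction", "code": "read"}],
        "action": "R",
        "recorded": datetime.now(timezone.utc).isoformat(),  # when
        "outcome": "0",  # 0 = success
        "agent": [{
            "who": {"reference": user_ref},  # who
            "requestor": True,
            "network": {"address": client_ip, "type": "2"},  # where; type 2 = IP address
        }],
        "source": {"observer": {"reference": observer_ref}},  # the system recording the event
        "entity": [{"what": {"reference": resource_ref}}],  # what
    }
```

For example, `read_audit_event("Practitioner/dr1", "Patient/p1", "203.0.113.7", "Device/fhir-server")` produces an event that can be stored alongside other resources and searched by agent, entity, or date.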
Category 8: Compliance (Items 46–47)
| # | Item | Priority | Verification |
|---|---|---|---|
| 46 | Inferno test suite passes | Critical | US Core, SMART App Launch, and applicable IG test suites pass |
| 47 | Information blocking provisions met | Critical | No practices that unreasonably prevent access, exchange, or use of EHI |
The 21st Century Cures Act prohibits information blocking: practices that prevent, materially discourage, or otherwise inhibit the access, exchange, or use of electronic health information (EHI). Your FHIR server must not impose unnecessary barriers to data access. Such barriers include excessive rate limiting without clinical justification, requiring manual approval for API access that should be automated, charging unreasonable fees for data access, and imposing technical requirements that are not necessary for security or privacy. ONC's Cures Act Final Rule defined eight exceptions to the information blocking provisions (preventing harm, privacy, security, infeasibility, health IT performance, content and manner, fees, and licensing), and later rulemaking has revised that list, so confirm the current set in the regulation. Each exception has strict conditions that must be documented. If you restrict access, you must be able to articulate which exception applies and why.
Testing Strategy
Do not wait until the week before go-live to run these checks. Integrate them into your development process:
- CI/CD integration: Run the Inferno US Core test suite against your FHIR server on every deployment to your staging environment. Treat test failures as build failures.
- Synthetic patient data: Load your test environment with Synthea-generated patient data that covers diverse demographics, conditions, and data volumes. Synthea generates realistic FHIR bundles that exercise most US Core profiles; confirm coverage for every profile your server must support.
- Contract testing: If external systems will connect to your FHIR server, establish contract tests that verify the expected request/response patterns. Use tools like Pact or custom FHIR test clients to verify compatibility.
- Chaos testing: Simulate failure modes—database connection loss, auth server downtime, network latency spikes. Verify that your FHIR server degrades gracefully rather than crashing or returning incorrect data.
- Security penetration testing: Hire an independent security firm to test your FHIR server for common vulnerabilities: SQL injection in search parameters, IDOR (insecure direct object reference) attacks on patient data, JWT manipulation, and scope bypass. Healthcare data breaches carry severe penalties under HIPAA—invest in security testing proportional to the risk.
Priority Summary
| Priority | Count | Action |
|---|---|---|
| Critical | 27 (including 2 that apply only if required) | Must pass before go-live. No exceptions. |
| Important | 17 | Should pass before go-live. Document exceptions with remediation plan. |
| Nice-to-have | 3 | Plan for post-launch iteration. |
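
The gate described by this table can be expressed as a small decision function. This is one possible reading, not a prescribed rule: it assumes a failed Important item blocks launch unless an exception has been documented, and the `ChecklistItem` type is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ChecklistItem:
    number: int
    priority: str  # "Critical", "Important", or "Nice-to-have"
    passed: bool
    exception_documented: bool = False


def go_live_decision(items: List[ChecklistItem]) -> Tuple[bool, List[str]]:
    """Return (ready, blockers) for a list of verified checklist items."""
    blockers = []
    for item in items:
        if item.passed:
            continue
        if item.priority == "Critical":
            blockers.append(f"item {item.number}: Critical item not verified")
        elif item.priority == "Important" and not item.exception_documented:
            blockers.append(f"item {item.number}: Important item failed without a documented exception")
        # Failed Nice-to-have items never block; they go to the post-launch backlog.
    return (not blockers, blockers)
```

Keeping the result as a list of blockers, rather than a bare boolean, makes the go/no-go meeting shorter: the output is the agenda.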
Go-Live Readiness Assessment
Before flipping the switch, run through this quick readiness assessment:
- Inferno passes: Run the full US Core and SMART App Launch test suites. All required tests must pass. Document any known failures with justification.
- Load test completed: Simulate expected production traffic for at least 2 hours. Verify that response times stay within SLO under load.
- Security review done: An independent security review has validated authentication, authorization, and data isolation.
- Incident response plan: The team knows how to handle a FHIR server outage. On-call rotation is staffed. Runbooks exist for common failure modes.
- Rollback plan: If the new FHIR server causes issues, you can revert to the previous system within your defined RTO.
For teams working with both HL7 and FHIR standards, ensure that the FHIR server does not disrupt existing HL7 v2 interfaces during the transition period.
Post-Launch Operations
Going live is not the finish line—it is the starting line. After launch, establish these operational practices:
- Weekly Inferno regression: Run the full test suite weekly to catch conformance drift as new features are deployed.
- Monthly security review: Audit access logs for unusual patterns—bulk data access from unexpected IPs, scope escalation attempts, or excessive failed authentication attempts.
- Quarterly performance baseline: Measure and record p50, p95, and p99 latencies for each major search operation. Compare against previous quarters to detect degradation before it becomes a problem.
- Annual compliance audit: Review your FHIR implementation against the latest USCDI version requirements. ONC updates USCDI annually, and new data classes may require new resource support.
- Client onboarding process: Document a clear process for external systems to register as SMART clients, obtain credentials, and test their integration. A frictionless onboarding process demonstrates compliance with information blocking provisions—making it difficult for external systems to connect is itself a form of information blocking.
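
The quarterly performance baseline can be computed from raw latency samples with a few lines of code. The sketch below uses the nearest-rank percentile method; the 20% regression tolerance is an arbitrary assumption to adjust for your own SLOs, and the function names are illustrative.

```python
import math
from typing import Dict, List, Sequence


def percentile(samples_ms: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with pct% of samples at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def baseline(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize one search operation as p50, p95, and p99 latencies."""
    return {f"p{pct}": percentile(samples_ms, pct) for pct in (50, 95, 99)}


def regressions(previous: Dict[str, float], current: Dict[str, float], tolerance: float = 0.20) -> List[str]:
    """Return percentile names that grew by more than `tolerance` since the previous baseline."""
    return [name for name in current if current[name] > previous[name] * (1 + tolerance)]
```

Store each quarter's `baseline` output per search operation and feed consecutive quarters into `regressions` to flag degradation before it reaches users.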
For organizations that also maintain HL7 interface engines, coordinate monitoring across both FHIR and HL7 v2 systems. A failure in the FHIR layer may cascade to downstream HL7 v2 consumers if data flows through both systems.
Frequently Asked Questions
What is the most common reason FHIR implementations fail go-live?
Authentication and authorization issues. SMART on FHIR has many moving parts—discovery documents, scope enforcement, token expiration, patient context isolation. Teams often get basic auth working but miss edge cases like expired tokens, scope-limited access, or PKCE for public clients.
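PKCE is one of the edge cases that is easy to verify in isolation. The sketch below shows the S256 transformation defined in RFC 7636, which the authorization server applies to the `code_verifier` presented at the token endpoint; the function names are illustrative, and storage of the challenge between the two requests is left out.

```python
import base64
import hashlib
import hmac


def s256_challenge(code_verifier: str) -> str:
    """Derive the S256 code_challenge: BASE64URL(SHA256(ASCII(code_verifier))), unpadded."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verifier_matches(code_verifier: str, stored_challenge: str) -> bool:
    """Token-endpoint check: does the verifier hash to the challenge stored at authorization time?"""
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(s256_challenge(code_verifier), stored_challenge)
```

A negative test matters as much as the positive one: a token request with a wrong or missing `code_verifier` must be rejected even when the authorization code itself is valid.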
How long does it take to pass the Inferno test suite?
For a well-built FHIR server with proper US Core conformance, initial Inferno testing takes 2-4 weeks of iteration. Most failures come from missing must-support elements, incorrect code system bindings, and search parameter gaps. Plan for this iteration time in your project schedule.
Is $export required for all FHIR servers?
No. Bulk Data Export ($export) is required for servers that support the SMART Backend Services IG and for payer APIs under CMS-0057-F. For a basic patient-facing FHIR server, it is important but not always required. Check your specific certification or regulatory requirements.
What uptime SLO should a FHIR server target?
Most healthcare organizations target 99.9% uptime (8.76 hours of downtime per year). For FHIR servers that support clinical decision support or real-time alerts, consider 99.95% (4.38 hours per year). Document planned maintenance windows separately from unplanned downtime.
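The downtime figures above follow directly from the SLO percentage. A short sketch of the arithmetic, assuming a 365-day year:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored


def allowed_downtime_hours(slo_percent: float) -> float:
    """Hours per year a service may be unavailable while still meeting the SLO."""
    return (100 - slo_percent) / 100 * HOURS_PER_YEAR
```

Rounded to two decimals, `allowed_downtime_hours(99.9)` gives 8.76 and `allowed_downtime_hours(99.95)` gives 4.38, matching the figures quoted above.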
How do I prevent information blocking?
Ensure your FHIR server does not impose unnecessary barriers to data access. Provide reasonable response times, do not require manual approval for automated API access, support standard FHIR search parameters, and do not charge unreasonable fees. Document any access restrictions with clinical or security justification.




