FHIR in Production Is Nothing Like FHIR in the Spec
After processing 50,000+ daily FHIR requests at Cerner, I can tell you: the spec describes an ideal world. Production describes the one where two hospitals encode blood pressure differently and DocumentReference has 47 optional fields.
The Gap Between Paper and Production
I spent two years at Cerner (now Oracle Health) building FHIR-compliant APIs and a clinical document retrieval system used by actual doctors in actual hospitals. FHIR R4 is an elegant specification. Clean resource types, RESTful endpoints, JSON payloads. On paper, it solves healthcare interoperability.
In production, it introduces a new class of problems that the spec authors — brilliant as they are — couldn’t predict. Because the spec describes syntax. Production requires semantics. And semantics are where healthcare data gets ugly.
Problem 1: Semantic Interoperability Is a Fantasy
FHIR defines an Observation resource for clinical measurements. Blood pressure is an Observation. Temperature is an Observation. “Patient seems anxious” is also an Observation.
The spec says you should use LOINC codes to identify what kind of observation it is. LOINC code 85354-9 is “Blood pressure panel.” Straightforward, right?
Hospital A sends blood pressure as a single Observation with two components (systolic and diastolic). Hospital B sends it as two separate Observations linked by a shared encounter reference. Hospital C sends it as a DiagnosticReport containing an Observation panel. All three are FHIR-compliant.
// Hospital A: Component-based (correct per spec)
{
  resourceType: "Observation",
  code: { coding: [{ system: "http://loinc.org", code: "85354-9" }] }, // blood pressure panel
  component: [
    // 8480-6 = systolic, 8462-4 = diastolic
    { code: { coding: [{ system: "http://loinc.org", code: "8480-6" }] }, valueQuantity: { value: 120, unit: "mmHg" } },
    { code: { coding: [{ system: "http://loinc.org", code: "8462-4" }] }, valueQuantity: { value: 80, unit: "mmHg" } }
  ]
}
// Hospital B: Separate observations (also valid)
// Two resources, each with code 8480-6 and 8462-4 respectively
// Linked only by encounter reference
// Hospital C: Wrapped in DiagnosticReport
// The blood pressure lives three levels deep
Our normalization layer had 2,400 lines of code just for vital signs. Not because the logic was complex — because the variation was infinite.
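To make the three shapes concrete, here is a heavily reduced sketch of what normalizing them can look like. It is not the production layer. The types are hand-rolled subsets of FHIR R4, the names `normalizeBloodPressure` and `BloodPressureReading` are hypothetical, and Hospital C's report is assumed to carry its Observations in `contained` (real reports often reference them through `result` instead).

```typescript
// Hand-rolled subsets of FHIR R4 shapes, just enough for this sketch.
interface Coding { system?: string; code?: string }
interface CodeableConcept { coding?: Coding[] }
interface Quantity { value?: number; unit?: string }
interface Component { code: CodeableConcept; valueQuantity?: Quantity }
interface Observation {
  resourceType: "Observation";
  code: CodeableConcept;
  encounter?: { reference?: string };
  valueQuantity?: Quantity;
  component?: Component[];
}
interface DiagnosticReport {
  resourceType: "DiagnosticReport";
  contained?: Observation[]; // assumption: panel observations are inlined here
}
type Incoming = Observation | DiagnosticReport;

interface BloodPressureReading { systolic: number; diastolic: number }

const SYSTOLIC = "8480-6";
const DIASTOLIC = "8462-4";

function hasCode(concept: CodeableConcept, code: string): boolean {
  return (concept.coding ?? []).some(c => c.code === code);
}

// Hospital C: unwrap DiagnosticReports so every shape becomes a flat Observation list.
function flattenObservations(resources: Incoming[]): Observation[] {
  const out: Observation[] = [];
  for (const r of resources) {
    if (r.resourceType === "Observation") out.push(r);
    else out.push(...(r.contained ?? []));
  }
  return out;
}

function normalizeBloodPressure(resources: Incoming[]): BloodPressureReading[] {
  const readings: BloodPressureReading[] = [];
  // Half-finished Hospital B pairs, keyed by encounter reference.
  const pending = new Map<string, Partial<BloodPressureReading>>();

  for (const obs of flattenObservations(resources)) {
    // Hospital A: both values live in components of one Observation.
    if (obs.component) {
      const sys = obs.component.find(c => hasCode(c.code, SYSTOLIC))?.valueQuantity?.value;
      const dia = obs.component.find(c => hasCode(c.code, DIASTOLIC))?.valueQuantity?.value;
      if (sys !== undefined && dia !== undefined) readings.push({ systolic: sys, diastolic: dia });
      continue;
    }
    // Hospital B: standalone Observations, paired through the shared encounter.
    const value = obs.valueQuantity?.value;
    const encounter = obs.encounter?.reference;
    if (value === undefined || encounter === undefined) continue;
    const partial = pending.get(encounter) ?? {};
    if (hasCode(obs.code, SYSTOLIC)) partial.systolic = value;
    else if (hasCode(obs.code, DIASTOLIC)) partial.diastolic = value;
    else continue;
    if (partial.systolic !== undefined && partial.diastolic !== undefined) {
      readings.push({ systolic: partial.systolic, diastolic: partial.diastolic });
      pending.delete(encounter);
    } else {
      pending.set(encounter, partial);
    }
  }
  return readings;
}
```

Pairing on the encounter alone is lossy: several readings taken during one encounter can be mismatched, so a real implementation would also compare timestamps. Edge cases like that are a large part of why the line count grows.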
Problem 2: DocumentReference Is a Nightmare
FHIR’s DocumentReference resource is how you exchange clinical documents — discharge summaries, lab reports, imaging results. It has 47 optional fields. In production, every vendor fills in a different subset.
The content.attachment field can contain:
- Inline base64-encoded data (small documents)
- A URL pointing to a FHIR Binary resource (standard approach)
- A URL pointing to a vendor-specific document API (common, non-standard)
- A URL pointing to a PDF on an S3 bucket (I’ve seen this)
- Nothing, with the actual content in an extension field (I’ve also seen this)
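We built a document retrieval service that handled all five patterns. The fallback chain below has four branches, because the vendor-API and S3 cases are both plain URL fetches and share one branch: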
// fetchFHIRBinary, fetchWithRetry, and DocumentResolutionError are defined elsewhere.
// Only the first content entry is considered.
async function resolveDocumentContent(ref: DocumentReference): Promise<Buffer> {
  const attachment = ref.content?.[0]?.attachment;

  // Pattern 1: Inline data
  if (attachment?.data) {
    return Buffer.from(attachment.data, 'base64');
  }

  // Pattern 2: FHIR Binary URL
  if (attachment?.url?.includes('/Binary/')) {
    return fetchFHIRBinary(attachment.url);
  }

  // Pattern 3: Extension-based content
  const extContent = ref.extension?.find(
    e => e.url === 'http://vendor-specific/document-content'
  );
  if (extContent?.valueAttachment?.data) {
    return Buffer.from(extContent.valueAttachment.data, 'base64');
  }

  // Pattern 4: Direct URL (vendor API or S3), tried last because it fails most often
  if (attachment?.url) {
    return fetchWithRetry(attachment.url, { timeout: 30000 });
  }

  throw new DocumentResolutionError(ref.id, 'No resolvable content path');
}
This function processed 50,000+ requests daily. The last pattern — the direct URL — accounted for 12% of requests and had a 3% failure rate due to expired presigned URLs and vendor API outages.
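The post does not show `fetchWithRetry`, so the following is only a guess at the general shape of such a helper: bounded retries with exponential backoff. The name `retryingFetch`, the `Fetcher` type, and the `retries` and `backoffMs` options are all hypothetical, and the actual HTTP call is injected so the sketch stays self-contained.

```typescript
// Hypothetical stand-in for fetchWithRetry. The HTTP call itself is injected.
type Fetcher = (url: string, timeoutMs: number) => Promise<Uint8Array>;

interface RetryOptions {
  timeout: number;     // per-attempt timeout in ms, passed through to the fetcher
  retries?: number;    // extra attempts after the first (assumed default: 2)
  backoffMs?: number;  // base delay, doubled after every failure (assumed default: 250)
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>(resolve => setTimeout(resolve, ms));

async function retryingFetch(
  url: string,
  options: RetryOptions,
  fetcher: Fetcher,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<Uint8Array> {
  const retries = options.retries ?? 2;
  const backoffMs = options.backoffMs ?? 250;
  let lastError: unknown;

  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fetcher(url, options.timeout);
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (attempt < retries) await sleep(backoffMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

Retrying helps with a transient vendor outage. It does nothing for an expired presigned URL, which keeps failing until the caller obtains a fresh URL, so the two failure causes need different handling.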
Problem 3: SMART on FHIR Authentication in Practice
The SMART on FHIR authorization framework is well-specified. In theory: OAuth 2.0 with scoped access tokens that limit what resources and operations a client can access.
In practice: every EHR vendor implements the token endpoint slightly differently. Epic’s token response includes custom claims. Cerner’s refresh token behavior differs from the spec in edge cases. Allscripts doesn’t support launch/patient scopes the way the spec describes.
We maintained a vendor-specific authentication adapter layer. Each adapter implemented the same interface but handled vendor quirks internally. The adapter for one vendor had a comment that read: “Their token endpoint returns 200 with an error body instead of 401. Yes, really.”
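The adapter idea can be sketched roughly like this. The interface and class names are hypothetical and no real vendor is modeled. The only quirk shown is the "200 with an error body" one from the comment above. The `error` and `error_description` fields follow the OAuth 2.0 error response format.

```typescript
interface TokenSet {
  accessToken: string;
  expiresInSeconds: number;
  refreshToken?: string;
  patient?: string; // SMART launch context, when the server returns one
}

class TokenError extends Error {
  readonly code: string;
  constructor(code: string, description?: string) {
    super(description ? `${code}: ${description}` : code);
    this.name = "TokenError";
    this.code = code;
  }
}

type TokenBody = Record<string, unknown>;

// Every adapter turns a raw HTTP result into a TokenSet or throws TokenError.
interface TokenAdapter {
  parseTokenResponse(status: number, body: TokenBody): TokenSet;
}

function readString(body: TokenBody, key: string): string | undefined {
  const value = body[key];
  return typeof value === "string" ? value : undefined;
}

// Spec-shaped behavior: errors arrive with a non-2xx status.
class StandardAdapter implements TokenAdapter {
  parseTokenResponse(status: number, body: TokenBody): TokenSet {
    if (status < 200 || status >= 300) {
      throw new TokenError(
        readString(body, "error") ?? `http_${status}`,
        readString(body, "error_description"),
      );
    }
    const accessToken = readString(body, "access_token");
    if (accessToken === undefined) {
      throw new TokenError("invalid_response", "missing access_token");
    }
    const expiresIn = body["expires_in"];
    const tokens: TokenSet = {
      accessToken,
      // A missing expires_in is treated as already expired (a choice made for this sketch).
      expiresInSeconds: typeof expiresIn === "number" ? expiresIn : 0,
    };
    const refreshToken = readString(body, "refresh_token");
    if (refreshToken !== undefined) tokens.refreshToken = refreshToken;
    const patient = readString(body, "patient");
    if (patient !== undefined) tokens.patient = patient;
    return tokens;
  }
}

// Hypothetical vendor quirk: HTTP 200 with an OAuth error body.
class ErrorIn200Adapter implements TokenAdapter {
  private readonly standard = new StandardAdapter();

  parseTokenResponse(status: number, body: TokenBody): TokenSet {
    const error = readString(body, "error");
    if (error !== undefined) {
      throw new TokenError(error, readString(body, "error_description"));
    }
    return this.standard.parseTokenResponse(status, body);
  }
}
```

Callers depend only on `TokenAdapter`, so a new vendor quirk means a new small class rather than another conditional in shared code.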
Problem 4: Performance at 50K Requests/Day
FHIR’s RESTful design means a lot of requests for a single clinical workflow. Displaying a patient’s chart requires:
- GET /Patient/{id}: demographics
- GET /Encounter?patient={id}&_sort=-date: recent encounters
- GET /Condition?patient={id}&clinical-status=active: active problems
- GET /MedicationRequest?patient={id}&status=active: current medications
- GET /DocumentReference?patient={id}&_sort=-date&_count=10: recent documents
Five requests for one page load. Multiply by 10,000 clinicians checking charts throughout the day.
We solved this with a pre-assembled patient summary that denormalized the most common query pattern into a single document, updated via FHIR Subscription notifications whenever a referenced resource changed. This cut each chart load from 5 API calls to 1, with a 99th-percentile latency of 180 ms.
The spec doesn’t describe this pattern. You discover it after your database connection pool exhausts for the third time in a month.
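A rough sketch of the summary pattern follows, under several assumptions: an in-memory `Map` stands in for the real document store, the `ChangeNotification` shape is a simplification and not the actual FHIR Subscription payload, and all names are hypothetical. Sorting and the ten-document cap from the original queries are left out.

```typescript
const SECTIONS = ["Encounter", "Condition", "MedicationRequest", "DocumentReference"] as const;
type Section = typeof SECTIONS[number];

interface Resource { resourceType: string; id: string }

// One denormalized document per patient: everything the chart view needs.
interface PatientSummary {
  patientId: string;
  demographics?: Resource;
  sections: Record<Section, Resource[]>;
  updatedAt: number;
}

// Simplified stand-in for what a subscription notification handler receives.
interface ChangeNotification {
  patientId: string;
  resource: Resource;
  deleted?: boolean;
}

function isSection(type: string): type is Section {
  return (SECTIONS as readonly string[]).indexOf(type) !== -1;
}

class PatientSummaryStore {
  private readonly summaries = new Map<string, PatientSummary>();

  // Write path: called once per notification, keeps the summary current.
  apply(change: ChangeNotification, now: number = Date.now()): void {
    const summary = this.summaries.get(change.patientId) ?? this.empty(change.patientId);
    const resource = change.resource;
    const type = resource.resourceType;

    if (type === "Patient") {
      summary.demographics = resource;
    } else if (isSection(type)) {
      // Upsert by id: drop any older copy, then re-add unless it was deleted.
      const others = summary.sections[type].filter(r => r.id !== resource.id);
      summary.sections[type] = change.deleted ? others : [resource, ...others];
    } else {
      return; // resource type is not part of the chart view
    }
    summary.updatedAt = now;
    this.summaries.set(change.patientId, summary);
  }

  // Read path: one lookup instead of five searches.
  get(patientId: string): PatientSummary | undefined {
    return this.summaries.get(patientId);
  }

  private empty(patientId: string): PatientSummary {
    return {
      patientId,
      sections: { Encounter: [], Condition: [], MedicationRequest: [], DocumentReference: [] },
      updatedAt: 0,
    };
  }
}
```

The trade is the usual one for denormalization: reads get cheap and predictable, while correctness now depends on never missing a notification, so a periodic rebuild from the source resources is a sensible safety net.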
What I’d Tell Someone Starting FHIR Work
- Don’t trust the spec examples. They show the happy path. Production data has null fields where the spec says required, extensions where the spec has standard fields, and coding systems you’ve never heard of.
- Build a normalization layer early. You will receive the same clinical concept encoded 5 different ways. Normalize on ingest, not on read.
- Test with real vendor data, not synthetics. Synthea generates beautiful, spec-compliant FHIR data. Real hospital data does not look like Synthea.
- Budget 3x the time you think for authentication. SMART on FHIR is straightforward in the spec. Every vendor implementation has quirks.
- Invest in observability. When a DocumentReference fails to resolve, you need to know which vendor, which pattern, and which specific URL timed out. We logged every resolution attempt with latency, source pattern, and outcome. This data drove our reliability improvements more than any code change.
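For the observability point, one possible shape of a per-attempt log record and a per-pattern rollup is sketched below. The field names, the `withAttemptLog` wrapper, and the convention that timeouts surface as errors named `TimeoutError` are assumptions made for illustration, not the schema described above.

```typescript
type ResolutionPattern = "inline" | "binary" | "extension" | "direct-url";
type Outcome = "ok" | "timeout" | "error";

// One record per resolution attempt: vendor, pattern, URL, latency, outcome.
interface ResolutionAttempt {
  documentId: string;
  vendor: string;
  pattern: ResolutionPattern;
  url?: string;
  latencyMs: number;
  outcome: Outcome;
}

type AttemptSink = (attempt: ResolutionAttempt) => void;

// Wraps one resolution step and reports it to the sink, success or failure.
async function withAttemptLog<T>(
  meta: Omit<ResolutionAttempt, "latencyMs" | "outcome">,
  sink: AttemptSink,
  work: () => Promise<T>,
): Promise<T> {
  const started = Date.now();
  try {
    const result = await work();
    sink({ ...meta, latencyMs: Date.now() - started, outcome: "ok" });
    return result;
  } catch (err) {
    // Assumption: the fetch layer signals timeouts with an error named "TimeoutError".
    const timedOut = err instanceof Error && err.name === "TimeoutError";
    sink({ ...meta, latencyMs: Date.now() - started, outcome: timedOut ? "timeout" : "error" });
    throw err;
  }
}

interface PatternStats { attempts: number; failures: number; failureRate: number }

// Rolls attempts up per pattern, which is how a figure like a failure rate falls out.
function summarizeByPattern(attempts: ResolutionAttempt[]): Map<ResolutionPattern, PatternStats> {
  const stats = new Map<ResolutionPattern, PatternStats>();
  for (const attempt of attempts) {
    const entry = stats.get(attempt.pattern) ?? { attempts: 0, failures: 0, failureRate: 0 };
    entry.attempts += 1;
    if (attempt.outcome !== "ok") entry.failures += 1;
    entry.failureRate = entry.failures / entry.attempts;
    stats.set(attempt.pattern, entry);
  }
  return stats;
}
```

Grouping by `vendor` instead of `pattern` is the same loop with a different key, and it is usually the first cut worth looking at during an outage.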
FHIR is the right approach to healthcare interoperability. It’s just not the simple approach.