Compliance and continuous delivery aren't in tension

There is a story regulated industries tell about themselves: software for medical devices is dangerous, dangerous things must be controlled, control means slowness, and slowness means quarterly releases at best. Continuous delivery, in this telling, is a Silicon Valley luxury that doesn't survive contact with a notified body. The story is wrong, but it's durable, because every team that has sat through a poorly-run audit has felt the gravity of it.

I have spent roughly a decade shipping regulated software — medical IoT at Medisanté, the Philips Clinical AI App Store, the BrightInsight digital health platform — under combinations of ISO 13485, IEC 62304, MDSAP, GDPR and HIPAA. In all three we shipped frequently. At BrightInsight we shipped most days, with the platform serving over 20 million API calls a day across AWS, Azure and GCP, inside ISO 13485 and IEC 62304. None of it required arguing with our quality and regulatory colleagues. It required treating compliance as a property of the pipeline rather than a phase.

That is the entire post. The rest is mechanics.

What the standards actually require

The first move, which I see teams skip with surprising regularity, is to read the standards instead of inheriting a rumour about them.

IEC 62304 — Medical device software — Software life cycle processes — is the headline standard. People who haven't read it imagine it mandates a waterfall lifecycle and a four-month release cadence. It mandates neither. It requires you to classify your software by safety class (A, B or C), define a documented development plan, manage requirements traceably, design with risk in mind, implement and verify with documented evidence at the unit and integration level, and run a software-problem-resolution process for anything in production. It is silent on whether you do those things on a Tuesday afternoon or once a quarter. The constraint is on evidence and repeatability, not tempo.

ISO 13485 — Medical devices — Quality management systems — sits one layer up. It is the QMS standard, the thing your design history file, your CAPA process and your supplier controls hang off. It is also method-agnostic on cadence. It cares that you can show, on demand, that any change has been planned, designed, verified, validated, reviewed and approved by the right people, with linked artefacts.

MDSAP (the Medical Device Single Audit Program) is the audit framework, not a separate standard. It is how regulators from the US, Canada, Brazil, Australia and Japan jointly inspect a manufacturer's QMS in a single audit pass. Its companion document, the MDSAP Audit Approach, spells out the tasks an auditor works through and what they will ask to see.

GDPR and HIPAA are different again. They govern personal data and protected health information respectively. Neither forbids continuous delivery. Between them they impose obligations around lawful basis, purpose limitation, breach notification timelines, audit logging and minimum-necessary access. Those are properties your platform either has or doesn't, not modes you enter once a quarter.

Read together, what these regimes forbid is not frequency. It is undocumented, unrepeatable, untestable change. Anything that moves through to a regulated product without traceable evidence of intent, design, verification and approval will fail an audit, whether you shipped it last night or last March. Anything that can produce that evidence on demand will pass, even if it shipped twenty times last week.

That's the gap continuous delivery either falls into or jumps cleanly.

Compliance as a property of the pipeline

The reframe that made everything else work, for me, was the one in my values list: pair compliance with continuous delivery. Not "balance" them. Not "trade" them. Pair them, the way you pair a unit test with the function it exercises. If a regulatory clause says a change must be reviewed by two named roles before reaching production, then the merge into the protected branch is the review gate. The names are in CODEOWNERS. The evidence is the GitHub PR record, archived. If a clause requires risk analysis updates when a software item changes, then a CI job that detects the change enforces an updated risk row before the build artefact can be signed.
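
To make the review-gate idea concrete, here is a minimal sketch of the check such a gate performs. Everything named in it is hypothetical: the role names, the people, the `Approval` record and the `missing_roles` function are illustrative, not any platform's API. In practice your Git host's branch protection would do this job; the sketch only shows the logic, including two rules an auditor tends to care about: approvals must be on the commit that actually merges, and authors cannot approve their own change.

```python
from dataclasses import dataclass

# Hypothetical mapping of regulatory roles to the people who hold them.
# In a real setup this would be derived from CODEOWNERS or a team directory.
ROLE_HOLDERS = {
    "software-lead": {"alice", "bob"},
    "quality-engineer": {"chandra"},
}


@dataclass(frozen=True)
class Approval:
    reviewer: str
    commit_sha: str  # the commit the reviewer actually approved


def missing_roles(approvals, head_sha, author, role_holders=ROLE_HOLDERS):
    """Return the sorted list of roles with no valid approval on head_sha.

    An approval counts only if it was given on the current head commit
    and the reviewer is not the author of the change.
    """
    valid_reviewers = {
        a.reviewer
        for a in approvals
        if a.commit_sha == head_sha and a.reviewer != author
    }
    return sorted(
        role
        for role, holders in role_holders.items()
        if not (holders & valid_reviewers)
    )


if __name__ == "__main__":
    approvals = [Approval("alice", "abc123"), Approval("chandra", "old999")]
    # chandra approved an older commit, so the quality role is still missing.
    print(missing_roles(approvals, head_sha="abc123", author="bob"))
```

An empty list means the gate passes. A non-empty list names exactly which role still has to look at the change, which is also the sentence you want in the archived record.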

The pipeline is the QMS, in the parts that touch software. Not a parallel system that quality runs alongside engineering — the same system, with regulatory and quality engineering colleagues as first-class authors. The QA, RA and quality engineers I have worked with at Medisanté, Philips and BrightInsight were not gatekeepers waiting to say no. They were colleagues who wanted to say yes and needed defensible evidence to do so. When the evidence is generated automatically as a by-product of how the team already works, "yes" becomes the cheap answer.

Here is what that looks like, concretely, across the parts of IEC 62304 and ISO 13485 that engineers actually touch.

Design history file as a generated artefact

A design history file (DHF), the FDA's term for what ISO 13485 calls the design and development file, is the record of how a product came to be. Auditors want to see, for any released version, the path from user needs to design inputs to design outputs to verification and validation. In a traditional waterfall shop, the DHF is a folder on SharePoint that the quality team curates by hand. It is always out of date, always painful at audit time.

In a pipeline-as-QMS world, the DHF is a build artefact. Requirements live in a structured store — Jira with mandatory fields, a markdown repo with schemas, or a dedicated ALM tool — with stable IDs. Design documents, ADRs, pull requests and test cases all cite those IDs. The CI job that produces the release candidate also produces, for that exact commit, a generated DHF snapshot: requirements, design references, test results, sign-offs, traceability matrix, all linked. At audit time you don't reconstruct anything. You point at the artefact for the version under inspection. It is the same pattern as a software bill of materials — generated, not curated — extended to design intent.
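
A generated DHF snapshot can be sketched in a few lines. This is a minimal illustration under stated assumptions: the input shapes, the field names and the `build_dhf_snapshot` function are all hypothetical, and a real pipeline would pull these inputs from the requirements store, the repository and the test reports rather than take them as arguments.

```python
import json


def build_dhf_snapshot(commit_sha, requirements, design_refs, test_results, approvals):
    """Assemble a DHF snapshot for one commit from already-structured inputs.

    requirements: {req_id: title}
    design_refs:  {req_id: [design document or ADR identifiers]}
    test_results: list of {"test": name, "req": req_id, "passed": bool}
    approvals:    list of {"role": ..., "reviewer": ...}
    """
    entries = []
    for req_id in sorted(requirements):
        tests = [t for t in test_results if t["req"] == req_id]
        entries.append({
            "requirement": req_id,
            "title": requirements[req_id],
            "design": sorted(design_refs.get(req_id, [])),
            "tests": sorted(t["test"] for t in tests),
            # Verified means at least one linked test exists and all linked tests passed.
            "verified": bool(tests) and all(t["passed"] for t in tests),
        })
    return {
        "commit": commit_sha,
        "entries": entries,
        "approvals": approvals,
        # Complete means every requirement has a design reference and is verified.
        "complete": all(bool(e["design"]) and e["verified"] for e in entries),
    }


if __name__ == "__main__":
    snapshot = build_dhf_snapshot(
        "abc123",
        {"REQ-1": "Clamp dose to configured maximum", "REQ-2": "Raise alarm on sensor loss"},
        {"REQ-1": ["ADR-7"]},
        [{"test": "test_req_1_clamp", "req": "REQ-1", "passed": True}],
        [{"role": "quality-engineer", "reviewer": "chandra"}],
    )
    print(json.dumps(snapshot, indent=2, sort_keys=True))
```

The useful property is the `complete` flag: the release job can refuse to proceed when it is false, so an incomplete design record is discovered at build time rather than at audit time.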

Traceability matrix from code, not from spreadsheets

IEC 62304 demands traceability: a requirement is implemented by a software item, the software item is verified by a test, and the test is executed against an identified build. A spreadsheet maintained by hand is the worst possible implementation of this. It rots within two sprints, and rot in your traceability matrix turns a routine audit into a long week.

The implementation that works is to make the matrix the by-product of conventions the team already follows. Requirement IDs in commit messages and PR titles, enforced by commitlint. Test file naming or annotations that bind a test to a requirement. A CI job that walks the repository, joins the data, and emits a JSON traceability matrix per build. Coverage gaps become CI failures, not audit findings six months later. The matrix cannot drift because nobody maintains it directly.
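
The join itself is small. The sketch below assumes a hypothetical `REQ-<number>` ID format and takes commits and test annotations as plain tuples; how you extract those (from `git log`, test decorators, file naming) depends on your stack, and the `build_matrix` function is illustrative rather than any tool's API.

```python
import json
import re

REQ_ID = re.compile(r"\bREQ-\d+\b")  # hypothetical requirement ID format


def build_matrix(known_requirements, commits, tests):
    """Join requirement IDs across commits and tests.

    known_requirements: iterable of requirement IDs from the requirements store
    commits: list of (sha, message) tuples
    tests:   list of (test_name, [requirement IDs]) tuples
    """
    matrix = {req: {"commits": [], "tests": []} for req in sorted(known_requirements)}
    unknown = set()
    for sha, message in commits:
        for req in REQ_ID.findall(message):
            if req in matrix:
                matrix[req]["commits"].append(sha)
            else:
                unknown.add(req)
    for name, reqs in tests:
        for req in reqs:
            if req in matrix:
                matrix[req]["tests"].append(name)
            else:
                unknown.add(req)
    # A gap is a requirement with no implementing commit or no verifying test.
    gaps = [req for req, row in matrix.items() if not row["commits"] or not row["tests"]]
    return {"matrix": matrix, "gaps": gaps, "unknown_ids": sorted(unknown)}


if __name__ == "__main__":
    result = build_matrix(
        {"REQ-1", "REQ-2"},
        [("a1b2c3", "REQ-1: clamp dose to configured maximum")],
        [("test_req_1_clamp", ["REQ-1"])],
    )
    print(json.dumps(result, indent=2))
    # Fail the build when any requirement is untraced or an ID is unrecognised.
    raise SystemExit(1 if result["gaps"] or result["unknown_ids"] else 0)
```

Treating unknown IDs as failures matters as much as treating gaps as failures: a typo in a commit message otherwise produces a link to a requirement that does not exist.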

Software unit verification, as IEC 62304 actually defines it

The phrase that wakes engineers up is software unit verification. IEC 62304 expects you to verify software units according to safety class — at Class B and Class C, with documented test cases, expected results, actual results, pass/fail, and the relationship to the detailed design. Engineers used to TDD often either panic ("we need a separate test suite for compliance?") or shrug ("we have unit tests, surely that's fine?"). Neither response is right.

The right response is to make your existing test suite regulator-grade. Deterministic test data, no time-bombs or flakiness, reproducible from a clean checkout. Tests named after the requirement or software item they exercise. Test reports emitted in a structured format (JUnit XML or richer), archived as build artefacts retained for the device's retention period. Coverage thresholds enforced in CI per safety class. The unit test you would already write for engineering reasons becomes the unit verification record for compliance reasons, provided you treat the test report as evidence rather than ephemera.
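
Treating the test report as evidence can start with something as simple as turning the JUnit XML into per-test verification records bound to a commit. This is a minimal sketch: the record fields and the `verification_records` and `unverified` functions are hypothetical, and it assumes the common JUnit XML shape of `testcase` elements with optional `failure`, `error` or `skipped` children.

```python
import xml.etree.ElementTree as ET


def verification_records(junit_xml, commit_sha):
    """Turn a JUnit XML report into one verification record per test case."""
    root = ET.fromstring(junit_xml)
    records = []
    # iter() finds testcase elements whether the root is <testsuite> or <testsuites>.
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            outcome = "fail"
        elif case.find("skipped") is not None:
            outcome = "not-run"
        else:
            outcome = "pass"
        records.append({
            "commit": commit_sha,
            "unit": case.get("classname", ""),
            "test": case.get("name", ""),
            "outcome": outcome,
        })
    return records


def unverified(records):
    """Names of tests that cannot count as evidence for this build."""
    return [r["test"] for r in records if r["outcome"] != "pass"]


if __name__ == "__main__":
    report = (
        '<testsuite>'
        '<testcase classname="dosing" name="test_req_12_clamp"/>'
        '<testcase classname="dosing" name="test_req_13_alarm"><skipped/></testcase>'
        '</testsuite>'
    )
    records = verification_records(report, "abc123")
    print(unverified(records))
```

Note the design choice that a skipped test is recorded as not run rather than silently dropped. A skip is a gap in the verification record, and it should be visible as one.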

The change at Medisanté, under IEC 62304 Class C — the highest safety class for medical device software — was largely cultural. We did not write more tests than a comparable non-regulated team. We wrote tests we were prepared to defend. Determinism was non-negotiable. A flaky test under Class C isn't a minor annoyance; it's a corruption of the evidence chain.

Risk management as a CI gate

ISO 14971 (medical device risk management) sits next to IEC 62304, and the link between them is where many teams quietly fail. A change to a software item can change the risk profile of the device. The standard expects that connection to be reasoned about, not assumed away.

Concretely: tag the software items that carry identified risks. The risk register is a structured store with stable IDs. A CI job on every PR checks whether any flagged item is touched, and if so, blocks the merge until the corresponding risk row has been reviewed and acceptance updated. Policy-as-code applied to clinical risk. The engineer doesn't need to remember; the pipeline refuses to let them skip it. The list of flagged items, the criteria for "update required", the right reviewer: those are quality decisions encoded in a config file that engineering and quality jointly own.
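
A minimal sketch of that gate, assuming the policy is a mapping from path patterns to risk IDs. The paths, the `RISK-` IDs and the function names are all hypothetical; in a real pipeline the changed files would come from the PR diff and the reviewed risks from the risk register.

```python
from fnmatch import fnmatchcase

# Hypothetical policy, jointly owned by engineering and quality:
# path patterns for software items that carry identified risks.
FLAGGED_ITEMS = {
    "src/dosing/*": ["RISK-12", "RISK-31"],
    "src/alarms/*": ["RISK-07"],
}


def risks_requiring_review(changed_files, flagged_items=FLAGGED_ITEMS):
    """Collect the risk IDs attached to any flagged item touched by the change."""
    risks = set()
    for path in changed_files:
        for pattern, risk_ids in flagged_items.items():
            # fnmatchcase is case-sensitive on every OS; "*" also matches "/".
            if fnmatchcase(path, pattern):
                risks.update(risk_ids)
    return risks


def gate(changed_files, reviewed_risks, flagged_items=FLAGGED_ITEMS):
    """Return the risk IDs still blocking the merge (an empty list means pass)."""
    required = risks_requiring_review(changed_files, flagged_items)
    return sorted(required - set(reviewed_risks))


if __name__ == "__main__":
    blocking = gate(["src/dosing/limit.py", "README.md"], reviewed_risks=["RISK-12"])
    print(blocking)
    raise SystemExit(1 if blocking else 0)
```

Changes that touch no flagged item pass without anyone being asked anything, which is what keeps the gate cheap enough to run on every PR.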

Audit-grade CI logs

The thing that most often goes wrong at audit isn't the technology. It's the chain of custody on the evidence. An auditor asks for the test report for version 4.7.2. You produce a PDF. The auditor asks how you know it was the test report for that exact build. The team looks at each other.

The solution is the one we already apply to artefacts: immutability and integrity. Build outputs — test reports, coverage reports, generated DHF, traceability matrix, SBOM, container images — are written to an immutable store keyed by commit SHA and version, with checksums recorded in the release record. CI logs are retained for the regulatory retention period (most platforms support this but it needs explicit configuration). Sign your container images. Sign your release artefacts. Make the chain from commit to evidence to deployed artefact cryptographically inspectable, not merely plausible.
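
The checksum half of that chain is easy to sketch. The manifest layout and the `build_manifest` and `verify` functions below are hypothetical, and the sketch deliberately leaves out signing and the immutable store: a checksum only answers the auditor's question if the manifest itself is protected from modification.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(commit_sha, version, artefacts):
    """Record a checksum for every evidence artefact of one release.

    artefacts: {name: bytes}
    """
    digests = {name: sha256_hex(content) for name, content in sorted(artefacts.items())}
    return {"commit": commit_sha, "version": version, "sha256": digests}


def verify(manifest, name, content):
    """True only if content is byte-for-byte the artefact recorded in the manifest."""
    expected = manifest["sha256"].get(name)
    return expected is not None and expected == sha256_hex(content)


if __name__ == "__main__":
    manifest = build_manifest("abc123", "4.7.2", {"test-report.xml": b"<testsuite/>"})
    print(verify(manifest, "test-report.xml", b"<testsuite/>"))
    print(verify(manifest, "test-report.xml", b"<testsuite>edited</testsuite>"))
```

With a manifest like this stored in the release record, "how do you know this is the report for that exact build?" has a mechanical answer: recompute the digest and compare.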

How the standards encode in the pipeline

Pulling this together, here is the rough shape of how the regulated standards I have worked with most often map onto pipeline mechanics:

Development plan and lifecycle (IEC 62304 §5.1, ISO 13485 §7.3): repo conventions and CODEOWNERS, plus an ALM/issue tracker with schema-enforced fields. The plan is the configuration, version-controlled.
Requirements analysis and architectural design (IEC 62304 §5.2–5.3): structured requirements store with stable IDs, ADRs in the repo, schema validation in CI.
Software unit implementation and verification (IEC 62304 §5.5): unit tests with deterministic data, tied by naming or annotation to requirement IDs, with coverage thresholds enforced per safety class.
Integration and system testing (IEC 62304 §5.6–5.7): contract testing with Pact across service boundaries, end-to-end test suites with regulator-acceptable reports, archived as build artefacts.
Software release (IEC 62304 §5.8): signed artefacts, generated DHF snapshot, traceability matrix, SBOM, all tied to the commit SHA and the release tag.
Software maintenance and problem resolution (IEC 62304 §6 and §9): issue tracker with mandatory fields linking defect to risk file, to affected software items, to verification of the fix.
Risk management (IEC 62304 §7, ISO 14971 linkage): policy-as-code gate that blocks merges touching flagged items without a corresponding risk review.
Configuration management (IEC 62304 §8): Git, plus immutable artefact storage, plus signed releases.
GDPR / HIPAA technical controls: audit logging as a platform property, access via short-lived credentials, encryption at rest and in transit as default — not features added late but properties enforced by base infrastructure modules.

None of this is exotic. It is the same continuous-delivery practice an unregulated team would benefit from, with the explicit decision to treat the evidence as a product.

Where the friction actually comes from

When a team cannot ship more often than once a quarter under these regimes, the cause is almost never "the standards forbid it". It is usually one of four things, all fixable.

Compliance is being treated as a phase. A "validation phase" or "regulatory sign-off phase" at the end of the cycle is a sure sign the evidence is being assembled by hand at the wrong moment. Fix: move the evidence generation into CI and treat it as a continuous activity.

The quality team is not in the room. If RA, QA and quality engineering colleagues see the pipeline for the first time at audit prep, the pipeline is missing their requirements. Fix: bring them into the design of the pipeline. Their constraints shape the gates the same way underwriters' constraints at BrightInsight shaped the rules DSL we built for them. Specialists, partnered with.

Test data is non-deterministic. Flaky tests, network-dependent fixtures, ambient state. Anything that makes the test suite non-reproducible is poison to a Class B or Class C evidence chain. Fix: invest in test data and isolation until the suite is deterministic. This is not optional under regulation; it just happens to also be excellent engineering.

Approvals are theatre. A change advisory board that meets weekly to rubber-stamp PRs the engineers have already merged on dev is theatre, and an auditor will notice. Fix: make the approvals real, make them in-band (CODEOWNERS, mandatory reviewers, signed approvals), and reduce the policy until the approvals you keep are ones a human actually adds value to.

The instinct, when audits hurt, is to add ceremony. Almost always the right move is the opposite: remove ceremony, and replace it with automation that produces the same evidence as a by-product of how the team already works.

What this unlocks

The teams I have seen do this well were not exotic: the Medisanté team that took an event-driven microservices refactor to production under IEC 62304 Class C, the Philips team that delivered an air-gapped Clinical AI App Store across hospitals in Germany, India and the UK under ISO 13485 and IEC 62304, the BrightInsight platform team serving over 20 million API calls a day across three clouds. They were teams that decided early that the QMS lived in the pipeline, partnered properly with regulatory and quality colleagues, and invested in evidence-generation the way an unregulated team would invest in observability.

The payoff is not just faster releases, though you do get those. It is that the audit becomes uninteresting. Regulators don't enjoy surprise any more than engineers do. When the evidence is identical for every release, the chain of custody is mechanical, and the traceability matrix is generated rather than curated, the conversation moves from "can you prove this?" to "we can see how you proved this; what did you learn?". That is the conversation worth having.

So no, compliance and continuous delivery are not in tension. They are in tension only when you let compliance live as a phase, outside the pipeline, owned by a separate team, assembled by hand at the end. Move it inside, share ownership with the specialists who care about it most, and the standards stop being a brake. They become, in the most literal sense, the spec.

— Madu