What Is FMEA? A Practical Guide for Quality Engineers and Compliance Teams

FMEA (Failure Mode and Effects Analysis) is a structured, team-based method for identifying ways a product, process, or system can fail, understanding the consequences of each failure, and deciding which failures warrant action before they reach customers or regulators.

The method originated in the U.S. military in 1949 under MIL-P-1629, moved into NASA in the 1960s during the Apollo program, entered automotive production lines in the 1970s through Ford, and now sits at the center of quality and risk management requirements across FDA-regulated industries, automotive supply chains, and IEC-governed electrical systems.

For quality engineers and compliance teams, FMEA answers a specific question: where in this design or process is the risk most concentrated, and what are we going to do about it?

The Three Types of FMEA

The FMEA framework applies to different points in the product and process lifecycle. Understanding which type you need matters before you start.

Design FMEA (DFMEA) examines a product design before it goes to manufacturing. The team asks what could go wrong with each component, subassembly, or system function, and whether those failures would harm the end user, cause the device to malfunction, or create a regulatory compliance issue.

Process FMEA (PFMEA) shifts the analysis to manufacturing and assembly operations. Here, the failure modes involve process steps — incorrect torque, contaminated materials, misconfigured equipment — rather than component failures. PFMEA is where most production quality teams spend their time.

System FMEA takes a higher-level view, examining how subsystems interact and where system-level failures emerge from combinations of individual components that each function normally in isolation but produce unexpected behavior together.

Medical device manufacturers typically run DFMEA during design and development, PFMEA during process validation, and sometimes both together when design and manufacturing choices interact closely.

How Severity, Occurrence, and Detection Ratings Work

Every FMEA assigns three scores to each failure mode.

Severity (S) rates how bad the consequence would be if the failure occurred, on a scale of 1 to 10. A severity of 1 means the failure is barely noticeable. A severity of 9 or 10 means the failure causes patient harm, a regulatory violation, or loss of life. Severity addresses effect — it says nothing about likelihood.

Occurrence (O) estimates how often the failure mode is expected to happen, also on a 1-to-10 scale. Teams base this on historical data, process capability indices, or engineering judgment. An occurrence of 1 means the failure is almost impossible. A 9 or 10 means failures are likely to occur repeatedly in production.

Detection (D) rates, on the same 1-to-10 scale, how likely it is that existing controls will catch the failure before it reaches the customer. The scale runs opposite to intuition: a low detection score (1-2) means your current controls will almost certainly catch this failure, and a high detection score (8-10) means the failure will likely go undetected.

In the classic approach, these three numbers multiply together to produce a Risk Priority Number (RPN): RPN = S × O × D. RPNs range from 1 to 1,000. Teams set a threshold value, often around 100-125, and assign corrective actions to any failure mode above it.
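The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the failure mode names, the scores, and the threshold of 100 are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    severity: int    # 1-10, higher = worse effect
    occurrence: int  # 1-10, higher = more frequent
    detection: int   # 1-10, higher = harder to detect

    def __post_init__(self) -> None:
        # Reject scores outside the 1-10 scale used in the text.
        for field_name in ("severity", "occurrence", "detection"):
            score = getattr(self, field_name)
            if not 1 <= score <= 10:
                raise ValueError(f"{field_name} must be 1-10, got {score}")

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


def above_threshold(modes, threshold=100):
    """Return failure modes whose RPN exceeds the threshold, highest first."""
    flagged = [m for m in modes if m.rpn > threshold]
    return sorted(flagged, key=lambda m: m.rpn, reverse=True)


# Hypothetical PFMEA rows, for illustration only.
modes = [
    FailureMode("Seal under-torqued", 7, 4, 5),  # RPN 140
    FailureMode("Label misprinted", 3, 5, 2),    # RPN 30
    FailureMode("Connector cracked", 8, 3, 6),   # RPN 144
]
for m in above_threshold(modes):
    print(f"{m.name}: RPN {m.rpn}")
```

With these example scores, the two failure modes above 100 are listed and the misprinted label is not.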

RPN vs. Action Priority: The AIAG-VDA 2019 Shift

The RPN method has a well-known flaw. A failure mode with a Severity of 10, an Occurrence of 1, and a Detection of 1 produces an RPN of just 10. Under classic FMEA, that failure mode might receive no action. But a severity of 10 means catastrophic harm. The RPN calculation can obscure the most dangerous failure modes by treating all three factors as equals.
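To make the flaw concrete, the short Python sketch below compares the catastrophic-but-rare case from the paragraph above with a middling failure mode scored 5, 5, 5. The second set of scores and the threshold of 100 are assumptions added for contrast.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Classic Risk Priority Number: the product of the three scores."""
    return severity * occurrence * detection


THRESHOLD = 100  # assumed action threshold

catastrophic_but_rare = rpn(10, 1, 1)  # 10
moderate_all_round = rpn(5, 5, 5)      # 125

print(catastrophic_but_rare > THRESHOLD)  # False: no action triggered
print(moderate_all_round > THRESHOLD)     # True: action triggered
```

A pure threshold rule acts on the moderate failure mode and passes over the one that could cause catastrophic harm.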

The 2019 AIAG-VDA FMEA Handbook, produced jointly by the Automotive Industry Action Group (AIAG) and the Verband der Automobilindustrie (VDA), addresses this directly by replacing RPN with Action Priority (AP). AP uses a structured lookup table that weights Severity first, then Occurrence, then Detection, and returns High, Medium, or Low for each combination of scores. A failure mode with a Severity of 9 or 10 reaches High priority at much lower Occurrence and Detection scores than a low-severity failure mode would, although the table can still rate a high-severity item Low when Occurrence sits at the very bottom of the scale. Consult the handbook's table for the exact boundaries.

IEC 60812:2018, the general-purpose FMEA standard published by the International Electrotechnical Commission, still includes RPN among the prioritization methods it describes, and it provides detailed guidance on how to score each dimension consistently and how to interpret RPN results in context. Teams working in electrical systems, medical devices outside the automotive supply chain, and general manufacturing most commonly reference IEC 60812:2018 for process structure.

For teams that still use RPN: the number alone never drives the decision. What matters is whether a failure mode warrants action, and a severity of 9-10 always does — regardless of what the multiplication produces.
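One way to encode that rule is a severity override that runs before the RPN comparison. The sketch below is an assumption about how a team might implement it. The function name, the threshold of 100, and the severity floor of 9 are illustrative, and this is not the AIAG-VDA Action Priority table.

```python
def needs_action(severity: int, occurrence: int, detection: int,
                 rpn_threshold: int = 100, severity_floor: int = 9) -> bool:
    """Decide whether a failure mode warrants action.

    Severity at or above the floor always warrants action; otherwise the
    classic RPN threshold applies.
    """
    if severity >= severity_floor:
        return True
    return severity * occurrence * detection > rpn_threshold


print(needs_action(10, 1, 1))  # True: severity override, despite RPN of 10
print(needs_action(5, 5, 5))   # True: RPN of 125 exceeds the threshold
print(needs_action(3, 5, 2))   # False: RPN of 30, low severity
```

Checking severity first means no combination of low Occurrence and Detection scores can hide a high-severity failure mode.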

ISO 14971 and What FDA Expects from Risk Management

ISO 14971:2019, "Medical devices - Application of risk management to medical devices," is the international standard that governs risk management for medical devices. FDA recognizes it as a consensus standard, and ISO 13485:2016, which the Quality Management System Regulation (QMSR) incorporates by reference, points to it for risk management.

FMEA fits within ISO 14971 as a risk analysis tool, but ISO 14971 asks for more than an FMEA spreadsheet. The standard requires a complete Risk Management File that covers:

  • A risk management plan defining scope, responsibilities, and review criteria
  • Risk analysis documenting identified hazards and their causes
  • Risk evaluation determining which risks require reduction
  • Risk controls with implementation evidence
  • Evaluation of residual risk after controls are applied
  • A benefit-risk determination for residual risks that cannot be reduced further
  • Post-market surveillance data feeding back into the risk file

An FMEA can satisfy the risk analysis requirement inside ISO 14971, but it cannot substitute for the full file. Teams that submit only an FMEA without a risk management plan and without post-control evaluation regularly generate FDA Form 483 observations during device inspections.
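The gap between an FMEA and a full Risk Management File can be checked mechanically. The following Python is a minimal sketch. The element keys mirror the list above but are hypothetical names, not identifiers from ISO 14971, and the document ID is made up.

```python
# Hypothetical keys mirroring the Risk Management File elements listed above.
REQUIRED_ELEMENTS = (
    "risk_management_plan",
    "risk_analysis",
    "risk_evaluation",
    "risk_controls",
    "residual_risk_evaluation",
    "benefit_risk_determination",
    "post_market_feedback",
)


def missing_elements(risk_file: dict) -> list:
    """Return required elements that are absent or empty in the file."""
    return [name for name in REQUIRED_ELEMENTS if not risk_file.get(name)]


# A file that contains only an FMEA covers risk analysis and nothing else.
fmea_only = {"risk_analysis": "FMEA-0042 rev C"}
for name in missing_elements(fmea_only):
    print("missing:", name)
```

A file containing only the FMEA reports six of the seven elements as missing.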

Under the QMSR, which replaced the Quality System Regulation in 21 CFR Part 820 when it took effect on February 2, 2026, FDA aligned its device quality requirements with ISO 13485:2016 by incorporating that standard by reference. This alignment brings ISO 14971's risk management approach more directly into the FDA regulatory framework. Under QMSR, risk management is expected to run throughout the product lifecycle (design, production, and post-market monitoring) rather than being treated as a pre-submission checklist item.

FDA inspection findings since the QMSR effective date have consistently flagged companies where risk management files are incomplete, where FMEAs have not been updated after design changes, and where post-market complaint data has not fed back into the risk register. The FDA's risk management guidance documents describe FMEA as appropriate for systematic failure analysis but emphasize that the output must connect to documented control decisions and post-market data loops.

Running an FMEA: The Six-Step Process

Most teams follow a structured process regardless of which standard they are working under.

Step 1: Define the scope. Determine whether this is a DFMEA, PFMEA, or System FMEA, and document what is in and out of scope. For a DFMEA, this typically means a functional block diagram of the device. For a PFMEA, it starts with a detailed process flow diagram.

Step 2: Identify failure modes. For each function or process step, ask what could go wrong. Document every potential failure mode — not just the ones the team considers likely. Teams that filter failure modes at this stage produce incomplete FMEAs that miss the failures that eventually reach customers.

Step 3: Analyze effects and causes. For each failure mode, document the effects on the user or the next process step, and identify the root cause or mechanism that could produce the failure. Vague cause statements like "operator error" or "material variation" do not support actionable controls.

Step 4: Rate Severity, Occurrence, and Detection. Apply the scoring criteria from your applicable standard consistently across the team. Rating calibration sessions at the start of a new FMEA reduce inter-rater variability and produce more defensible scores during audits.
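A calibration session can start from a simple measure of disagreement. The sketch below flags failure modes where independent raters differ by more than a chosen spread. The rater roles, scores, and the maximum spread of 2 are assumptions for illustration.

```python
from statistics import median


def calibration_flags(ratings: dict, max_spread: int = 2) -> dict:
    """Flag failure modes whose scores differ too much between raters.

    ratings maps failure mode -> {rater: score}. A mode is flagged when
    the gap between its highest and lowest score exceeds max_spread.
    """
    flags = {}
    for mode, by_rater in ratings.items():
        scores = list(by_rater.values())
        spread = max(scores) - min(scores)
        if spread > max_spread:
            flags[mode] = {"spread": spread, "median": median(scores)}
    return flags


# Hypothetical Severity ratings from three functions.
severity_ratings = {
    "Seal under-torqued": {"quality": 7, "manufacturing": 4, "design": 8},
    "Label misprinted": {"quality": 3, "manufacturing": 3, "design": 4},
}
print(calibration_flags(severity_ratings))
```

Flagged items are the ones to discuss in the session. The median is shown as a starting point, not as the agreed score.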

Step 5: Prioritize and assign actions. Use RPN thresholds or the Action Priority table to identify which failure modes need action. Document the specific action, owner, and due date. An action field that reads "monitor" is a deferral, not an action.
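The requirement for a specific action, owner, and due date lends itself to a simple completeness check. This Python sketch is one possible approach. The field names, the list of deferral words, and the sample values are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed placeholder words that signal a deferral rather than an action.
DEFERRAL_WORDS = {"monitor", "tbd", "none", "n/a"}


@dataclass
class Action:
    description: str
    owner: Optional[str]
    due: Optional[date]


def action_problems(action: Action) -> list:
    """List the reasons an action entry is not yet actionable."""
    problems = []
    if action.description.strip().lower() in DEFERRAL_WORDS:
        problems.append("description is a deferral, not an action")
    if not action.owner:
        problems.append("no owner")
    if action.due is None:
        problems.append("no due date")
    return problems


weak = Action("Monitor", None, None)
solid = Action("Add torque sensor interlock at station 4",
               "J. Rivera", date(2026, 6, 30))
print(action_problems(weak))
print(action_problems(solid))
```

An exact-match word list is deliberately crude. It catches the one-word placeholders and leaves judgment about longer descriptions to the reviewer.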

Step 6: Verify and update. After implementing controls, re-score Occurrence and Detection. Severity rarely changes after a design fix unless the failure mode itself changed. Document the revised scores. An FMEA that still shows pre-control scores is an incomplete document.
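Keeping pre-control and post-control scores side by side makes an incomplete entry easy to spot. The sketch below assumes a simple record layout. The class names and scores are illustrative, and Severity is carried over unchanged, as the step describes for the usual case.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Scores:
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


@dataclass
class Entry:
    failure_mode: str
    before: Scores
    after: Optional[Scores] = None  # None until controls are verified


def rescore(entry: Entry, occurrence: int, detection: int) -> Entry:
    """Record post-control scores, keeping Severity from the original rating."""
    entry.after = replace(entry.before, occurrence=occurrence, detection=detection)
    return entry


entry = Entry("Connector cracked", Scores(8, 4, 6))
print(entry.after is None)  # True: still shows only pre-control scores
rescore(entry, occurrence=2, detection=3)
print(entry.before.rpn, entry.after.rpn)  # 192 48
```

If the design fix changes the effect itself, Severity would need re-rating as well, which this sketch does not cover.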

Where FMEAs Break Down in Practice

Most FMEA problems in regulated environments come down to three patterns.

The first is treating the FMEA as a one-time submission artifact. Teams complete the analysis before design freeze, file it, and never return to it. When a change is made 18 months later, the FMEA does not reflect the current design. During a device inspection, investigators ask to see design change records alongside the FMEA. When the two do not match, that becomes an audit finding.
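A mismatch between change records and the FMEA can be found before an investigator finds it. The following sketch compares the FMEA's last review date with design change approval dates. The change IDs and dates are hypothetical.

```python
from datetime import date


def unreviewed_changes(fmea_last_reviewed: date, design_changes: list) -> list:
    """Return IDs of design changes approved after the last FMEA review.

    design_changes is a list of (change_id, approved_date) pairs.
    """
    return [change_id
            for change_id, approved in design_changes
            if approved > fmea_last_reviewed]


# Hypothetical records: one change predates the review, one follows it.
changes = [
    ("DCR-101", date(2023, 11, 15)),
    ("DCR-118", date(2025, 9, 2)),
]
print(unreviewed_changes(date(2024, 3, 1), changes))  # ['DCR-118']
```

Any ID in the result is a change the FMEA has not been reviewed against.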

The second is disconnecting the FMEA from deviation and CAPA records. When a failure mode identified in the FMEA actually occurs in production, the CAPA that follows should reference the FMEA entry. The root cause investigation should either confirm the FMEA's predicted cause or update it. FMEAs and CAPA systems maintained in separate, unlinked spreadsheets rarely stay synchronized.
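Where the two record sets live in separate exports, a periodic cross-check can surface broken links. The Python below is a sketch under assumed record shapes. The dictionary keys and the ID formats are hypothetical.

```python
def unlinked_capas(capas: list, fmea_entry_ids: set) -> dict:
    """Map CAPA ID -> problem for CAPAs without a valid FMEA reference.

    Each CAPA is a dict with an 'id' and an optional 'fmea_entry' key.
    """
    problems = {}
    for capa in capas:
        ref = capa.get("fmea_entry")
        if ref is None:
            problems[capa["id"]] = "no FMEA reference"
        elif ref not in fmea_entry_ids:
            problems[capa["id"]] = f"references unknown FMEA entry {ref}"
    return problems


# Hypothetical exports from the FMEA and CAPA spreadsheets.
fmea_ids = {"PFMEA-12", "PFMEA-13"}
capas = [
    {"id": "CAPA-301", "fmea_entry": "PFMEA-12"},
    {"id": "CAPA-302"},
    {"id": "CAPA-303", "fmea_entry": "PFMEA-99"},
]
print(unlinked_capas(capas, fmea_ids))
```

A CAPA with no reference may point to a failure mode the FMEA never anticipated, which is itself a reason to update the FMEA.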

The third is poor detection scoring. Teams routinely underestimate detection difficulty, assigning low scores to controls that are actually periodic spot checks or visual inspections. When a failure mode with a detection score of 2 escapes to the field and generates a complaint, investigators ask to see the detection control. What they often find is a quarterly audit that did not run in the quarter the failure occurred.
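One guard against optimistic scoring is to tie each control type to the lowest Detection score a team will accept for it. The sketch below illustrates the idea only. The control types and floor values are hypothetical placeholders, not figures from any standard, and a real team would substitute its own rating criteria.

```python
# Hypothetical floors: the lowest Detection score accepted per control type.
# Replace these with the criteria from your own rating tables.
MIN_PLAUSIBLE_DETECTION = {
    "automated_inline_check": 1,
    "manual_inspection_every_unit": 4,
    "sampling_inspection": 6,
    "periodic_audit": 8,
}


def optimistic_detection_scores(entries: list) -> list:
    """Return (failure_mode, claimed, floor) for scores below the floor.

    entries is a list of (failure_mode, control_type, claimed_score).
    """
    flagged = []
    for mode, control_type, claimed in entries:
        floor = MIN_PLAUSIBLE_DETECTION[control_type]
        if claimed < floor:
            flagged.append((mode, claimed, floor))
    return flagged


entries = [
    ("Seal under-torqued", "automated_inline_check", 2),
    ("Label misprinted", "periodic_audit", 2),
]
print(optimistic_detection_scores(entries))  # [('Label misprinted', 2, 8)]
```

The quarterly-audit case from the paragraph above is the one this check flags: a score of 2 claimed for a control that only runs periodically.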

Managing FMEA in a QMS

Maintaining FMEA documents across design revisions, process changes, and post-market data updates requires a quality system that treats the FMEA as a living document rather than a static attachment.

This means version control for every FMEA revision, documented approvals when scores or actions change, links between FMEA records and design change requests, and automated notifications when a related CAPA opens that matches a documented failure mode.

Teams working in spreadsheets handle this through manual file management, which produces version control gaps and broken links between documents. When the same FMEA analysis lives in an electronic QMS with direct connections to design controls, change management records, and CAPA workflows, the update burden drops and the audit trail is automatic.

Cloudtheapp's FMEA application connects risk analysis directly to design controls, change management, and CAPA records within a single platform. When a design change request opens, the linked FMEA receives a notification for review. When a CAPA references a failure mode from the FMEA, the connection is documented and searchable. To see how this works in a live system, request a demo at cloudtheapp.com/demo/.

The Document That Defines Your Risk Posture

The FMEA table is the output. The analysis is the work, and the analysis requires a cross-functional team, structured facilitation, honest scoring, and a commitment to updating the file when new information arrives.

For quality engineers building or auditing a risk management program, three questions matter. Does the FMEA reflect the current design? Do its high-priority failure modes have documented controls with implementation evidence? Does your post-market complaint data have an explicit pathway back into the risk file?

A well-maintained FMEA is among the most useful documents in a design file during a regulatory inspection. A stale one is among the most damaging.
