Engineering Reliable Autonomy with Howard Diesel & Dale Rutherford

Key Takeaways

Evolution of DMAIC for AI: Six Sigma and DMAIC require adaptation to address stochastic behaviours in agentic AI systems.
Continuous Model Decay: AI models degrade over time due to “model autophagy,” necessitating continuous behavioural assurance and governance.
Classical Statistical Failures: Traditional SPC fails with AI due to non-normal data distributions and evolving processes.
Innovative Metrics and Frameworks: New mathematical concepts are needed for AI governance, including ECPI and SPAR metrics.
High-Stakes Operational Risks: AI drift impacts industries differently, with serious consequences in patient triage and insurance claims.
Agentic Engineering over Harness Engineering: Multi-agent architectures require governance to ensure safe interactions and manage systemic risks effectively.
Implementation Strategy and Education: Over 90% of corporate AI initiatives fail due to complexity; start simple and prioritise education.

Webinar Details

Title: Engineering Reliable Autonomy with Howard Diesel & Dale Rutherford
Date: 2026-04-20
Presenters: Howard Diesel & Dale Rutherford
Meetup Group: DAMA SA Big Data
Write-up Author: Howard Diesel

How does AI Affect Traditional DMAIC Frameworks?

The transition to agentic artificial intelligence challenges established process frameworks such as DMAIC (Define, Measure, Analyse, Improve, Control). Opening the webinar, Howard Diesel discussed his analysis of Dr Dale Rutherford’s doctoral research, which applied a dedicated framework to the problem of “model autophagy”: the degradation of AI models over time as they are retrained on their own outputs, regardless of whether any new data is introduced.

To measure and mitigate this continuous model decay, organisations must implement robust testing platforms and protocols. Practitioners therefore need to adopt some classical DMAIC tools as they are, adapt others, and innovate new ones to establish behavioural assurance and comprehensive lifecycle governance. This evolution is particularly critical for managing complex agentic workflows in highly regulated environments.
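
The decay mechanism behind model autophagy can be illustrated with a toy simulation. The sketch below is not taken from the webinar or from Dr Rutherford’s research; it assumes the simplest possible “model” (a Gaussian fitted by maximum likelihood) and hypothetical parameter values, and shows how repeatedly refitting on the model’s own samples makes the learned spread collapse.

```python
import random
import statistics

def simulate_autophagy(generations=200, sample_size=20, seed=0):
    """Refit a Gaussian on its own samples each generation; return the std history."""
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0 stands in for the real data distribution
    history = [std]
    for _ in range(generations):
        # The next "model" sees only synthetic samples drawn from the current one.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)  # maximum-likelihood estimate of the spread
        history.append(std)
    return history

history = simulate_autophagy()
print(f"std at generation 0: {history[0]:.3f}")
print(f"std at generation {len(history) - 1}: {history[-1]:.6f}")
```

Each refit slightly underestimates the spread on average, and the errors accumulate across generations, which is why the diversity of the toy model shrinks even though nothing in the environment has changed.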

Figure 1 Governing the Snowball Effect

Figure 2 AI Behavioural Assurance

Figure 3 ‘Governing the Snowball Effect’

How does AI Challenge Traditional Six Sigma Methodologies?

Traditional Six Sigma methodologies were engineered for deterministic manufacturing environments, where stable inputs are maintained to ensure observable and repeatable outputs. In contrast, agentic AI introduces stochastic and probabilistic behaviours into operational processes. Rather than managing stationary processes characterised by normal data distributions, organisations must now govern continuously evolving models that produce heavy-tailed data distributions.

Furthermore, while classical manufacturing focused solely on the “voice of the customer,” AI deployments are subject to rigorous regulatory scrutiny and must systematically provide evidence of drift detection. Reversibility presents another challenge: although engineers can roll an AI model back to a prior state, the external operational environment and incoming data streams continue to change, so the restored model may no longer fit its context.

Figure 4 From Variation to Behaviour

Figure 5 The DNA of Classical Six Sigma

Figure 6 The Structural Rupture

How can we Effectively Improve AI Governance Frameworks?

The direct application of classical statistical tools to AI often misclassifies critical systemic degradation as mere operational noise. Consequently, effective AI governance necessitates the development of novel mathematical frameworks and conceptual innovations. A key adaptation is the transition from the traditional “Voice of Customer” to the “Voice of Governance,” recognising that regulatory mandates regarding data privacy and retention supersede individual user preferences.

Unlike manufactured parts, AI outputs lack objective physical attributes, so measurement frequently depends on human-in-the-loop scoring and subjective consensus. To address these complexities in non-stationary environments, the framework introduces new operational constructs, including governance trees and behavioural specification envelopes. It also shifts from standard root-cause analysis toward establishing “causal confidence” to manage the partial observability inherent in large language models.

Figure 7 The Core Mismatch

Figure 8 The DMAIC-AI Bridge

Figure 9 Define: from Voice of Customer to Voice of Governance

Figure 10 Measure: from Sigma to Behavioural Reality

Figure 11 Analyse: from Root Cause to Causal Confidence

Figure 12 Improve: from Permanent Fixes to Least-Invasive Control

Figure 13 Control: from Statistical Process Control to Active Governance

Figure 14 Synthesis: the DMAIC-AI Executive Crosswalk

What are the Limitations of Classical SPC Methods?

Classical Statistical Process Control (SPC) methods are inadequate for AI because of two fundamental statistical failures: “normality failure” and “stationarity failure”. AI systems generate heavy-tailed, non-normal data distributions, so classical capability metrics based on standard deviations no longer apply. To address this, the revised framework replaces traditional capability calculations with ECPI, which uses empirical quantiles derived directly from operational data. In addition, monitoring must be adaptive, re-baselining its statistics after authorised parameter changes.
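
The write-up does not give the ECPI formula, so the following is only a minimal sketch of the general idea of a quantile-based capability index. As a stand-in it assumes the widely used percentile form (specification width divided by the spread between the 0.135th and 99.865th empirical percentiles) and compares it with the classical standard-deviation form. The function names and sample scores are hypothetical.

```python
import statistics

def percentile(sorted_values, p):
    """Linearly interpolated percentile of pre-sorted data, with p in [0, 100]."""
    if not sorted_values:
        raise ValueError("need at least one observation")
    rank = (len(sorted_values) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    step = sorted_values[upper] - sorted_values[lower]
    return sorted_values[lower] + step * (rank - lower)

def quantile_capability(values, lower_spec, upper_spec):
    """Spec width divided by the empirical 0.135th-to-99.865th percentile spread."""
    ordered = sorted(values)
    spread = percentile(ordered, 99.865) - percentile(ordered, 0.135)
    if spread == 0:
        raise ValueError("observations show no spread")
    return (upper_spec - lower_spec) / spread

def classical_cp(values, lower_spec, upper_spec):
    """Classical Cp, which implicitly assumes normally distributed data."""
    return (upper_spec - lower_spec) / (6 * statistics.stdev(values))

# Hypothetical quality scores: a tight cluster plus a few extreme low values.
scores = [0.90 + 0.001 * i for i in range(50)] + [0.40, 0.35, 0.30]
print("quantile-based:", round(quantile_capability(scores, 0.5, 1.0), 3))
print("classical Cp:  ", round(classical_cp(scores, 0.5, 1.0), 3))
```

The quantile form makes no assumption about the shape of the distribution, so the low outliers widen the measured spread directly instead of being summarised through a standard deviation.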

A critical innovation is the introduction of SPAR, a process yield metric designed to monitor sequential AI operations. Because failure rates compound exponentially with each sequential processing step, SPAR is essential for tracking adherence and preventing autonomous systems from breaching strict regulatory thresholds.
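
The SPAR equation itself is not reproduced in this write-up. The compounding effect it is meant to track can still be sketched with a generic sequential yield calculation, assuming each step passes or fails independently. The function names and the 99% and 90% figures are illustrative assumptions only.

```python
import math

def sequential_yield(step_pass_rates):
    """Probability that every step in a chain passes, assuming independent steps."""
    return math.prod(step_pass_rates)

def max_steps_within_threshold(per_step_pass_rate, min_chain_yield):
    """Largest number of identical steps whose combined yield stays at or above the threshold."""
    if not 0 < per_step_pass_rate < 1:
        raise ValueError("per_step_pass_rate must be strictly between 0 and 1")
    if not 0 < min_chain_yield <= 1:
        raise ValueError("min_chain_yield must be in (0, 1]")
    steps, chain_yield = 0, 1.0
    while chain_yield * per_step_pass_rate >= min_chain_yield:
        chain_yield *= per_step_pass_rate
        steps += 1
    return steps

# A workflow whose individual steps each pass 99% of the time.
for n in (1, 5, 10, 20):
    print(n, "steps:", round(sequential_yield([0.99] * n), 4))

print("steps allowed at a 90% floor:", max_steps_within_threshold(0.99, 0.90))
```

Even a high per-step pass rate erodes quickly over a long chain, which is why a yield metric for sequential operations has to be monitored at the level of the whole workflow and not only per step.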

Figure 15 From Variation Control to Behavioural Governance

Figure 16 Monitoring the Unpredictable

Figure 17 SPC Failures and BME Remedy in LLMs

Figure 18 Adaptive CUSUM

Figure 19 The SPAR Equation

How can Organisations Ensure Equitable Treatment in Claims?

The practical application of this governance framework is illustrated through an insurance triage use case in which an enterprise processes large volumes of short-term claims daily. AI agents analyse unstructured data to determine coverage, categorise fraud risk, and recommend handling pathways. Without rigorous oversight, however, autonomous routing decisions create significant operational and litigation risks under frameworks such as the EU AI Act and “Treating Customers Fairly” mandates.

To ensure equitable treatment, organisations must implement an “input class registry” that categorises claims by complexity. The DMAIC-AI model monitors the processing of unstructured data and generates automated alerts when deviations occur. For example, the system can detect behavioural drift caused by an out-of-date corpus, enabling administrators to update the model and re-baseline performance metrics before regulatory violations occur.
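
One way to picture an input class registry with automated alerts is sketched below. The data structure, class names, baselines, and tolerances are all hypothetical illustrations and are not taken from the framework demonstrated in the webinar.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InputClass:
    name: str
    baseline_pass_rate: float  # assumed to be established when the process is baselined
    tolerance: float           # allowed drop below baseline before an alert fires

# Hypothetical registry keyed by claim complexity.
REGISTRY = {
    "simple": InputClass("simple", baseline_pass_rate=0.98, tolerance=0.02),
    "complex": InputClass("complex", baseline_pass_rate=0.90, tolerance=0.05),
}

def check_drift(outcomes_by_class, registry=REGISTRY):
    """Return alert messages for classes whose observed pass rate fell below their floor."""
    alerts = []
    for name, outcomes in outcomes_by_class.items():
        input_class = registry.get(name)
        if input_class is None:
            alerts.append(f"{name}: unregistered input class")
            continue
        if not outcomes:
            continue  # nothing observed for this class in the window
        observed = sum(outcomes) / len(outcomes)
        floor = input_class.baseline_pass_rate - input_class.tolerance
        if observed < floor:
            alerts.append(f"{name}: pass rate {observed:.2f} below floor {floor:.2f}")
    return alerts

# One monitoring window of pass/fail outcomes per class (illustrative data).
window = {
    "simple": [True] * 99 + [False],
    "complex": [True] * 80 + [False] * 20,
}
for alert in check_drift(window):
    print(alert)
```

Tracking each class against its own baseline keeps a decline on complex claims from being hidden by the much larger volume of simple ones.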

Figure 20 The Operational Reality of AI-Assisted FNOL

Figure 21 The Crucial Distinction

Figure 22 The Regulatory Crucible

Figure 23 The v1.1 Paradigm Shift

Figure 24 Innovation 1: the Input Class Registry

Figure 25 Insurance Claims DMAIC-AI Demo

Figure 26 Before DMAIC Begins

Figure 27 Establishing the Governance Charter

Figure 28 Validating the Measurement System

Figure 29 Diagnosing Root Causes

Figure 30 Validating Interventions

Figure 31 Live SPC Monitoring Simulation

Figure 32 SPAR by Input Class

How does AI Governance Prevent Healthcare Failures?

In high-stakes healthcare environments, such as an Emergency Department that treats many patients each day, the consequences of AI failures are greatly magnified. Unlike financial errors in insurance, which can be administratively reversed, an AI “under-triage” error in a clinical setting can directly result in patient mortality.

To mitigate these risks, the DMAIC-AI framework implements a three-tier governance hierarchy with explicit specification limits for critical characteristics. During a clinical simulation, the statistical tracking system identified a slow, gradual performance drift approaching the lower control limit, which automatically triggered a timeout advisory. This early escalation enabled administrators to identify data accuracy as the root cause and implement corrective measures before a sentinel event occurred.
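
Slow downward drift of this kind is the sort of signal a cumulative-sum (CUSUM) chart is designed to catch. The sketch below is a generic one-sided tabular CUSUM with an advisory level and a re-baseline step. It is an assumption-laden illustration, not the monitoring logic used in the simulation, and the class name, thresholds, and accuracy values are hypothetical.

```python
class LowerCusum:
    """One-sided tabular CUSUM for downward drift, with advisory and alarm levels."""

    def __init__(self, target, slack, alarm_limit, advisory_fraction=0.5):
        self.target = target          # expected in-control value of the metric
        self.slack = slack            # shortfall that is tolerated as ordinary noise
        self.alarm_limit = alarm_limit
        self.advisory_limit = alarm_limit * advisory_fraction
        self.cusum = 0.0

    def update(self, value):
        """Fold in one observation and return 'ok', 'advisory', or 'alarm'."""
        # Only shortfalls larger than the slack accumulate; good readings drain the sum.
        self.cusum = max(0.0, self.cusum + (self.target - value - self.slack))
        if self.cusum >= self.alarm_limit:
            return "alarm"
        if self.cusum >= self.advisory_limit:
            return "advisory"
        return "ok"

    def rebaseline(self, new_target):
        """Reset after an authorised change so old evidence does not carry over."""
        self.target = new_target
        self.cusum = 0.0

# Hypothetical triage accuracy readings that slip from 0.95 to 0.92.
monitor = LowerCusum(target=0.95, slack=0.01, alarm_limit=0.11)
readings = [0.95, 0.95, 0.92, 0.92, 0.92, 0.92, 0.92, 0.92]
for reading in readings:
    print(reading, monitor.update(reading))
```

The advisory level gives operators time to investigate while the accumulated evidence is still below the alarm limit, and the re-baseline step corresponds to resetting the statistics after an authorised change.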

Figure 33 Diagnosing the Patient Safety Problem

Figure 34 Governance Charter and Behavioural Specifications

Figure 35 Baseline Statistics

Figure 36 AI-FMEA: Risk Register

Figure 37 Phase Four: Improve

Figure 38 Live SPC Monitoring Simulation

Figure 39 SPAR by Input Class

Is Agentic Engineering Necessary for Managing Autonomous Agents?

As organisations transition toward workflows involving multiple autonomous agents, the systemic risk of cascading errors increases exponentially. Relying solely on “harness engineering” (the optimisation of individual, isolated models) is therefore insufficient for enterprise deployments. Effective governance demands comprehensive “agentic engineering” to mathematically bound and systematically control autonomous behaviour over time.

This is operationalised through a specialised architecture that incorporates components such as “Symprompt+” for operational voice and governed interventions, alongside a metric suite that manages performance memory. Because engineers cannot rely on traditional mechanical constraints, they must use techniques such as prompt engineering, context constraint injection, and Failure Mode and Effects Analysis (FMEA) to address risks pre-emptively before resorting to intensive fine-tuning.

Figure 40 The AI Agent Benchmark Lie

Figure 41 The Snowball Effect in Agentic Workflows

Figure 42 Harness Vs. Agentic Engineering

Figure 43 The ALAGF Architecture (Adaptive Lifecycle Agentic Governance Framework)

Figure 44 Redefining Quality: the BME Metric Suite

Figure 45 Define: The Behavioural Specification Envelope

Figure 46 Analyse: AI-FMEA and Amplified Risk

Figure 47 Improve: The Intervention Taxonomy

Figure 48 The Economics of Governance: AI COPQ

How can Organisations Ensure Successful AI Implementation?

The strategic objective of enterprise artificial intelligence is the successful orchestration of multi-agent environments, which introduces unprecedented operational complexity. Governing these systems requires robust solutions for persistent episodic memory management, cross-agent authority escalations, and context rollbacks to maintain functional state across multiple interacting models. However, the current implementation landscape is highly challenging: over 90% of corporate AI initiatives fail, because organisations attempt to deploy complex, cross-departmental solutions without adequate foundational readiness.

To ensure sustainable success, practitioners must establish basic minimum viable products and systematically mature their models through iterative enhancements. Concurrently, significant investments must be made in practitioner education to ensure that human-in-the-loop operators can effectively comprehend and manage sophisticated anomaly detections.

Figure 49 The Benchmark Lie: Why Behavioural Assurance is Required

Figure 50 The Paradigm Shift: Defining Agentic Engineering

Figure 51 The ALAGF v1 Architectural Triad

Figure 52 The Boundary: The Multi-Agent Orchestration Gap

Figure 53 Where ALAGF Goes Next

Figure 54 Executive Takeaway: The Trajectory of Governance

Figure 55 ‘AI Behavioural Assurance’ by Dale Rutherford

What are Effective Frameworks for Scaling Agentic AI?

Howard then concluded the webinar by underscoring the vital, ongoing academic and industrial research necessary to perfect these advanced governance frameworks. Collaborative researchers, such as Dr Wu, are actively recruiting academic talent to address the unresolved challenges inherent in continuous multi-agent orchestration environments.

For experienced data management professionals, the rapid and complex adoption of agentic AI evokes a distinct sense of historical repetition, bearing strong similarities to the explosive growth of Big Data initiatives in 2009. Ultimately, to scale agentic workflows safely and effectively across enterprise operations, organisations must implement robust structural frameworks, such as DMAIC-AI, to ensure safety, behavioural fairness, and rigorous regulatory compliance.

An edited video of the webinar is available on our YouTube channel.
