Introduction to Practical Data Quality Techniques with Dan Myers

Key Takeaways

  • Standardisation of Data Quality: The webinar introduces the “Conformed Dimensions of Data Quality,” a framework by DQ Matters for universally measuring data fitness across all industries.
  • Hierarchical Framework Structure: The methodology features a strict hierarchy of 11 core dimensions, 37 underlying concepts, and over 70 specific metrics for measurement.
  • Levels of Measurement: The framework categorises metrics into inherent (observable data structure), testable (needing external context), and subjective (relying on human intuition).
  • Resolving Ambiguity: The framework aims to clarify conflicting industry definitions of foundational data concepts, distinguishing between terms like “integrity” and “validity.”
  • The Impact of Artificial Intelligence: AI is shifting data quality priorities by emphasising the importance of dimensions like “existence” and “real-world agreement” to prevent hallucinations, while security and privacy remain crucial.
  • Probabilistic vs. Deterministic Data: Traditional database design demands absolute accuracy, while AI operates probabilistically, prompting practitioners to assess whether complete certainty or “mostly correct” is adequate for their use case.
  • Practical Application Tools: The upcoming book offers practical tools such as the “ABC Company” mock enterprise database for running SQL queries and the Open Data Quality Repository for monitoring data quality over time.

Webinar Details

Title: Introduction to Practical Data Quality Techniques with Dan Myers
Date: 2026-03-04
Presenter: Howard Diesel
Meetup Group: INs and OUTs of Data Modelling
Write-up Author: Howard Diesel

What are the Conformed Dimensions of Data Quality?

The webinar opens with Howard Diesel introducing Dan Myers and asking him about his use of the Conformed Dimensions of Data Quality, developed by DQ Matters. These dimensions were previously featured in the DMBOK 2 but were omitted in a later edition, prompting a renewed initiative to educate data practitioners. Dan Myers notes that his foundational work on the dimensions began in 2013, with formal publication in 2016.

To monitor industry trends, a dimensional data quality survey has been conducted over the past decade. The upcoming 2026 survey will focus specifically on the intersection of artificial intelligence and data quality. Furthermore, Dan announces an upcoming book and comprehensive training sessions designed for data quality professionals.

Figure 1 ‘Introduction to Data Quality Techniques Using the Conformed Dimensions of Data Quality’

What is the Upcoming Data Quality Training?

Dan is the principal founder of DQ Matters, an organisation specialising in data strategy, training, and consulting. He currently serves on the DAMA committee tasked with rewriting the DMBOK, aiming to better integrate data quality and governance with other core knowledge areas. Dan outlines plans for highly practical, customisable data quality training.

This training will utilise the “ABC Company” data model, a hypothetical enterprise database populated with real-world data, allowing participants to perform practical profiling, data model reviews, and SQL queries to identify real-world data quality issues. This hands-on approach aims to elevate professional capabilities and broaden CDMP profiles. Finally, the agenda outlines the core dimensions, the upcoming book, and an open data quality repository.
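To give a sense of the kind of hands-on profiling the training describes, below is a minimal Python sketch against a hypothetical local copy of the “ABC Company” database. The file name, table, and column names (customer, email) are illustrative assumptions, not taken from the actual model.

```python
import sqlite3

# Connect to a local copy of the (hypothetical) ABC Company database.
conn = sqlite3.connect("abc_company.db")
cur = conn.cursor()

# Simple profiling question: how many customer rows are missing an email address?
cur.execute("SELECT COUNT(*) FROM customer")
total_rows = cur.fetchone()[0]

cur.execute("SELECT COUNT(*) FROM customer WHERE email IS NULL OR email = ''")
missing_email = cur.fetchone()[0]

print(f"{missing_email} of {total_rows} customer rows have no email "
      f"({missing_email / total_rows:.1%} incomplete)")
```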

Figure 2 Speakers Bio

Figure 3 Presentation’s Agenda

What are the Dimensions of Data Quality?

The Conformed Dimensions of Data Quality are defined as categories utilised to characterise data and evaluate its overall fitness for use. These dimensions serve as objective characteristics applicable across industries for assessing, measuring, tracking, and communicating data quality.

A frequent misconception is that data quality dimensions must be industry-specific; however, the dimensions inherently describe the data itself, not the business domain. While reporting dashboards may be domain-specific, the underlying data characteristics remain universal. Furthermore, poor data quality is typically a multifaceted issue stemming from flawed business processes, necessitating a diverse, multidimensional approach to effectively resolve the root causes.

Figure 4 What are the Dimensions of Data Quality?

What is the Problem Statement, and can you map the Concepts?

A significant challenge within the data profession is the lack of standardisation regarding fundamental terms such as “integrity” and “validity”. Dan initiated the mapping of these concepts while studying for the IQCP certification, noting that various authors provided conflicting definitions. To address this, he developed a matrix to align these key concepts.

For example, while some associate “integrity” with unique identifiers and referential integrity, others categorise a “domain of values” under integrity or accuracy. When logging data quality issues in IT systems, practitioners must categorise them by dimension to provide structured metadata. Understanding the distinct role of each concept is critical; referential integrity, for instance, is merely the mechanical methodology employed to enforce data validity.
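As a simple illustration of that distinction, the sketch below counts “orphaned” foreign keys in hypothetical sales_order and customer tables. The orphan count is a measure of validity; the foreign-key constraint that would prevent orphans at write time is the referential-integrity mechanism. Table and column names are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect("abc_company.db")
cur = conn.cursor()

# Orphaned references: sales_order rows whose customer_id has no matching customer.
# The resulting count measures validity; a FOREIGN KEY constraint enforcing the
# relationship is the referential-integrity mechanism that keeps it at zero.
cur.execute("""
    SELECT COUNT(*)
    FROM sales_order o
    LEFT JOIN customer c ON o.customer_id = c.customer_id
    WHERE c.customer_id IS NULL
""")
orphans = cur.fetchone()[0]
print(f"{orphans} orders reference a non-existent customer")
```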

Figure 5 The Problem Statement

What are the Origins and Types of Dimensions?

The Conformed Dimensions framework is a meticulously researched synthesis derived from six leading authors and standards, including Tom Redman, David Loshin, and MIT’s Wang and Strong. The primary objective is to facilitate objective, quantifiable measurement of data quality rather than subjective evaluation. The framework categorises metrics into three distinct levels.

The most desirable is “inherent” data quality, which can be measured through logical observations of the data structure, such as nullability. The second level is “testable” data quality, requiring external context or baselines, such as an expected record count, to verify completeness. The final level consists of “subjective” metrics, such as reasonability, which necessitate human intuition and are challenging to automate.
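The difference between the first two levels can be shown in a short, hedged Python sketch: an inherent metric is computable from the data alone, while a testable metric needs an external baseline such as an expected record count. Function names and thresholds are illustrative, not part of the framework itself.

```python
def inherent_nullability(values):
    """Inherent metric: measurable from the data alone (share of non-null values)."""
    non_null = sum(v is not None for v in values)
    return non_null / len(values)

def testable_completeness(actual_count, expected_count):
    """Testable metric: needs an external baseline, e.g. an expected record count."""
    return actual_count / expected_count

# Subjective metrics such as reasonability still require human judgement
# and are not easily automated, so none is shown here.

emails = ["a@x.com", None, "b@y.com", None]
print(inherent_nullability(emails))        # 0.5
print(testable_completeness(980, 1000))    # 0.98
```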

Figure 6 Comparison of Six plus Author’s Set of Dimensions

Figure 7 Desirability of Methods of Measurement

Figure 8 Principles of the Conformed Dimensions

What are Framework Structure, Metrics, and Rules?

The efficacy of the Conformed Dimensions framework resides in its hierarchical structure. At the apex are 11 scientifically determined core dimensions. Beneath these sit 37 atomic “underlying concepts” that serve as the fundamental building blocks of data quality. These concepts are measured using over 70 out-of-the-box metrics, which can be assembled to formulate specific data quality rules.

For example, a rule to verify source-to-target accuracy might employ literal row counts or percentage-based counts. The decision to utilise a raw count versus a percentage depends heavily on the data’s financial or business materiality. This comprehensive standard empowers organisations to objectively compare data quality across departments or entire industries.
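A minimal sketch of such a rule, assuming hypothetical row counts and thresholds, shows how the same source-to-target check can be expressed either as a literal row-count tolerance or as a percentage threshold, with the choice driven by materiality:

```python
def source_to_target_check(source_rows, target_rows,
                           max_missing_rows=0, min_match_pct=1.0):
    """Evaluate one source-to-target rule two ways:
    a literal row-count tolerance and a percentage-based threshold."""
    missing = source_rows - target_rows
    match_pct = target_rows / source_rows
    return {
        "passes_raw_count": missing <= max_missing_rows,
        "passes_percentage": match_pct >= min_match_pct,
        "missing_rows": missing,
        "match_pct": match_pct,
    }

# High-materiality financial data: even one missing row fails the rule.
print(source_to_target_check(1_000_000, 999_998, max_missing_rows=0))
# Lower-materiality data: 99.9% of rows loaded is acceptable.
print(source_to_target_check(1_000_000, 999_998, min_match_pct=0.999))
```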

Figure 9 The Power of the Conformed Dimensions Framework

Figure 10 Components of Data Quality

What is the Impact of AI on Data Quality?

Artificial Intelligence fundamentally impacts data quality assessment. Because AI excels at pattern analysis across vast datasets, traditional dimensions like completeness and consistency may become less critical, as large language models (LLMs) can extrapolate missing data computationally. Conversely, dimensions such as “existence” and “agree with the real world” are increasingly vital.

Because LLMs lack physical sensors to verify real-world conditions, unverified data can lead to unchecked hallucinations. Security and privacy dimensions also remain essential for establishing strict access boundaries for AI agents. The relevance of timeliness and metadata availability remains a subject of debate; however, robust metadata is often necessary for effective prompt engineering to yield high-value outputs.

Figure 11 What does it look like with AI?

How will AI affect Data Quality?

Historically, database design relied on deterministic, Boolean rules (such as the ACID principles) that require absolute correctness. In contrast, AI evaluates data quality probabilistically, treating accuracy as a trend or an “eventual consistency”.

This introduces a critical consideration for practitioners: whether data must be absolutely correct or if being “mostly correct” suffices. The required level of strictness depends on the specific AI technique deployed; for instance, strict deterministic checks are necessary when filtering toxicity in LLM responses before they reach users.
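The contrast can be sketched in a few lines of Python. The toxicity scores and thresholds are illustrative assumptions (the score would come from some upstream classifier, not shown): a deterministic gate blocks or passes each individual response, while the probabilistic view tracks quality as a rate over many responses.

```python
def deterministic_gate(toxicity_score, threshold=0.2):
    """Deterministic, Boolean check: a single response either passes or is blocked."""
    return toxicity_score < threshold

def probabilistic_quality(scores, target=0.95):
    """Probabilistic view: quality is a trend measured across many responses."""
    pass_rate = sum(s < 0.2 for s in scores) / len(scores)
    return pass_rate >= target, pass_rate

# Every individual response is gated before reaching the user...
print(deterministic_gate(0.05))                          # True: safe to show
# ...while overall quality is monitored as a rate over time.
print(probabilistic_quality([0.05, 0.10, 0.30, 0.02]))   # (False, 0.75)
```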

Figure 12 What does it look like with AI? pt.2

How does Data Quality affect Daily Activities, and how can a Mock Company Help?

Dan then provides an overview of his upcoming book, which functions as a practical guide for applying the Conformed Dimensions. He examines how data quality affects daily activities, detailing all 11 dimensions and 37 underlying concepts, and provides practical techniques for improvement. It aims to serve both corporate professionals and academic institutions.

‘Data Quality Techniques’ includes the “ABC Company” mock enterprise database, enabling readers to execute SQL queries and profile relational data practically. Additionally, Myers introduces the Open Data Quality Repository, a tool for consolidating and tracking data quality reporting over time, available for free educational use.
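The repository’s schema is not described in the webinar; as a loose illustration of the idea of trending data quality over time, a measurement log could be as simple as appending dated metric results (file name, dimension labels, and values below are purely hypothetical):

```python
import csv
from datetime import date

def log_measurement(path, dimension, metric, value):
    """Append one dated data quality measurement so results can be trended over time."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([date.today().isoformat(), dimension, metric, value])

log_measurement("dq_history.csv", "Completeness", "customer.email non-null %", 0.87)
log_measurement("dq_history.csv", "Integrity", "orphaned order count", 42)
```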

Figure 13 What’s new with the DQ Techniques Book?

Figure 14 What’s new with the DQ Techniques Book? pt.2

If you would like to join the discussion, please visit our community platform, the Data Professional Expedition.

Additionally, if you would like to watch the edited video on our YouTube channel, please click here.

If you would like to be a guest speaker on a future webinar, kindly contact Debbie (social@modelwaresystems.com)

Don’t forget to join our exciting LinkedIn and Meetup data communities so you don’t miss out!
