Bridge the DQ Dimensions Theory-Practice Gap Using the Open DQ Repository with Dan Myers

Key Takeaways

  • Standardised Data Quality Framework: The conformed dimensions framework creates a shared vocabulary for technical teams and business stakeholders.
  • The ABC Company Model: A free synthetic relational database for education, with 37 intentional data quality errors included.
  • Open Data Quality Repository: This SQL database standardises how organisations store data quality assessment results and metrics.
  • Vendor Tool Consolidation: The repository aggregates data quality metadata from various tools into a unified enterprise system.
  • Cost-Effective ‘Bring Your Own BI’ Architecture: The repository excludes a user interface, relying on existing BI tools for reporting and visualisation.
  • Contextual Application via Custom Thresholds: Departments can set custom thresholds within universal data quality metrics for specific business contexts.
  • Root Cause Remediation: Resolve data defects by fixing business processes, not just through downstream data cleansing.
  • Three-Part Educational Structure: Dan's upcoming book moves from foundational principles, through the 11 conformed dimensions and their 37 underlying concepts, to remediation and prevention techniques grounded in the ABC Company model.

Webinar Details

Title: Bridge the DQ Dimensions Theory-Practice Gap Using the Open DQ Repository with Dan Myers
Date: 2026-04-14
Presenter: Dan Myers
Meetup Group: Book Launch with Technics Pub x MWS
Write-up Author: Howard Diesel

What are the Conformed Dimensions in Data Quality?

At its core, data quality work is about finding patterns that solve recurring business problems, rather than repeatedly putting out the same fires. By utilising conformed dimensions, organisations establish a consistent, normalised vocabulary to bridge the gap between technical teams and non-technical business stakeholders. Interestingly, while these dimensions were initially designed for relational databases, their core principles—such as representation, interpretability, and metadata availability—can also be effectively applied to semi-structured data, such as documents, in the age of AI.

Figure 1 What Are the Dimensions of Data Quality?

Figure 2 The Problem Statement

Figure 3 What Does the Conformed Dimensions Framework Look Like?

What are the Innovations in the Data Framework?

A common critique of the conformed dimensions framework is that it can feel too academic. To make it highly practical, Dan introduces two major innovations: the Open Data Quality Repository and the ABC Company data model. The ABC Company is a free-to-use, synthetic relational schema—think of it as the modern equivalent of the classic Northwind database.

It features realistic products, services, customers, and invoices, complete with at least 37 intentional data quality errors. This provides an excellent sandbox for practitioners to test their skills. To complement this, the Open Data Quality Repository acts as the physical database where professionals can actually store the results of their data quality assessments using standard SQL.
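To illustrate the kind of exercise the sandbox enables, here is a minimal sketch of profiling a table for seeded defects. The table and column names are illustrative assumptions, not the actual ABC Company schema.

```python
import sqlite3

# Build a tiny stand-in table with two seeded defects, in the spirit of
# the ABC Company sandbox (names here are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "ann@example.com"),
     (2, "bob@example"),   # seeded defect: malformed email
     (3, None)],           # seeded defect: missing email
)

# Completeness + validity check: count rows whose email is missing
# or lacks a domain suffix.
defects = conn.execute(
    "SELECT COUNT(*) FROM customer "
    "WHERE email IS NULL OR email NOT LIKE '%@%.%'"
).fetchone()[0]
print(defects)  # 2
```

Finding all 37 intentional errors in the real model requires a battery of such checks, one per underlying concept.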

Figure 4 Explanation of Fitness for Use in Your Organisation

Figure 5 Open Data Quality Repository Core Concept

What are the Sections of Dan’s Upcoming Book?

These innovations tie perfectly into Dan’s upcoming book, which is divided into three comprehensive sections. The first section provides a foundational introduction to data quality and compares various industry frameworks. The second section dives deep into the 11 conformed dimensions and their 37 underlying concepts, grounding each in real-world examples using the ABC Company model.

Finally, the third section explores practical techniques for remediation and prevention of data quality issues. These techniques range from proactive measures, like using drop-down menus to ensure validity, to reactive profiling. Ultimately, learning the conceptual framework is just the beginning; the true value comes from physically implementing these metrics into the Open Data Quality Repository.

Why adopt the Open Data Quality Repository?

Why should your organisation adopt the Open Data Quality Repository? This framework is comprehensive, defensible, and thoroughly documented. It codifies years of data quality research into a ready-to-use SQL database structure. Out of the box, it provides a denormalised table that links dimensions to their metrics, which you can integrate directly into your own schemas.

The repository aims to eliminate the classic “fist fights” over data definitions by giving teams a concrete starting point. Economically, the repository is designed with low barriers to entry: it is completely free for individual practitioners to use for educational purposes, while requiring a licensing fee for corporate, enterprise deployments.

Can AI Automatically Cleanse Bad Data from Repositories?

The introduction of the repository sparked lively audience discussion, particularly regarding AI and data cleansing. One attendee wondered if an AI agent could connect to the repository to automatically cleanse bad data. While Dan acknowledges this as a likely future use case, his current philosophy heavily favours fixing problems at their source rather than relying solely on downstream cleansing. He argues that true data quality requires understanding and correcting the flawed business processes that generate bad data in the first place. However, audience members noted that in certain scenarios—such as migrating from legacy systems lacking data entry controls—downstream cleansing remains an absolute necessity.

Figure 6 The Problem – the Opportunity

How can Organisations Measure Data Quality Consistently?

A major hurdle for organisations is measuring data quality consistently without relying on overly complex, proprietary tools. The Open Data Quality Repository solves this by providing a standardised hierarchy: Dimensions -> Underlying Concepts -> Metrics. For example, the dimension of “consistency” includes underlying concepts such as “equivalence of data,” which in turn generate specific metrics calculated as counts or percentages. Because a single data defect (such as a support ticket) often affects multiple dimensions and concepts simultaneously, this clear hierarchy allows teams to precisely define issues and prioritise remediation efforts based on business needs.
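The hierarchy above can be sketched as two lookup tables plus a computed percentage metric. The table and column names below are assumptions for illustration, not the repository's published schema.

```python
import sqlite3

# Dimensions -> Underlying Concepts as reference tables (illustrative).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dimension (dim_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE concept   (con_id INTEGER PRIMARY KEY, dim_id INTEGER, name TEXT);
INSERT INTO dimension VALUES (1, 'Consistency');
INSERT INTO concept   VALUES (10, 1, 'Equivalence of data');
""")

# A metric under 'Equivalence of data': the percentage of records where
# two systems agree on the same attribute.
rows = [("US", "US"), ("US", "USA"), ("DE", "DE"), ("FR", "FR")]
agree = sum(1 for a, b in rows if a == b)
pct = 100.0 * agree / len(rows)
print(pct)  # 75.0
```

The same defect could also be logged under a validity concept, which is why the hierarchy matters: it lets teams name precisely which aspect of quality a metric measures.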

Figure 7 The Opportunity

Figure 8 1. Standardisation

How does the Repository Break Down Vendor Silos?

One of the most powerful features of the Open Data Quality Repository is its ability to break down vendor silos. Often, different departments use proprietary profiling tools—such as Informatica, Collibra, or dbt—that don’t communicate with one another. The repository acts as an open, centralised consolidation layer, allowing organisations to feed data quality results from any tool into a single location for enterprise-wide reporting. It provides pre-built dimensions, concepts, and rules out of the box that users can apply their own custom business thresholds to. Rather than replacing existing profiling tools, it serves as the ultimate system of record for an organisation’s data quality metadata.
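The consolidation idea can be sketched as a single shared results table that measurements from any profiling tool land in. The schema below is an assumption for illustration; the actual repository defines its own tables.

```python
import sqlite3

# One shared results table for the whole enterprise (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE dq_result (
    tool        TEXT,   -- which profiler produced the measurement
    metric      TEXT,   -- standardised metric name
    measured_on TEXT,   -- table or attribute assessed
    value       REAL
)""")

# Results arriving from different vendor tools, normalised to one shape.
conn.executemany(
    "INSERT INTO dq_result VALUES (?, ?, ?, ?)",
    [
        ("Informatica", "completeness_pct", "customer.email", 97.5),
        ("dbt",         "completeness_pct", "customer.email", 97.1),
        ("Collibra",    "validity_pct",     "invoice.total",  99.2),
    ],
)

# Enterprise-wide view across tools for a single attribute.
avg = conn.execute(
    "SELECT AVG(value) FROM dq_result "
    "WHERE metric = 'completeness_pct' AND measured_on = 'customer.email'"
).fetchone()[0]
print(round(avg, 2))  # 97.3
```

Because every tool writes to the same shape, an enterprise-wide dashboard needs only one query rather than one connector per vendor.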

Figure 9 2. Pre-built

Figure 10 3. Open Architecture

What Architecture Supports Custom Error Types for Organisations?

Diving into the architecture, the repository is built on standard ANSI SQL and consists of several key table groups. It includes core tables for dimensions and concepts, allowing users to create their own custom error types tailored to their organisation. There are tables for data quality rules, metrics, cross-references, and test results, which store the transactional outputs of your profiling tools.

The ABC Company model serves as the Operational Data Store (ODS), featuring realistic inventory, invoice, and customer data. Crucially, the repository operates on a “bring your own BI” model; by omitting a proprietary UI, it significantly reduces costs, allowing you to use tools like SAP or Qlik for visualisation.
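The table groups described above can be sketched as a small relational schema: reference tables for dimensions and concepts, rule definitions, and a transactional test-results table, joined into the kind of denormalised view a BI tool would query. All names are assumptions, not the repository's published DDL.

```python
import sqlite3

# Illustrative versions of the repository's table groups.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dimension   (dim_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE concept     (con_id INTEGER PRIMARY KEY,
                          dim_id INTEGER REFERENCES dimension, name TEXT);
CREATE TABLE dq_rule     (rule_id INTEGER PRIMARY KEY,
                          con_id INTEGER REFERENCES concept, description TEXT);
CREATE TABLE test_result (result_id INTEGER PRIMARY KEY,
                          rule_id INTEGER REFERENCES dq_rule,
                          run_at TEXT, value REAL);
INSERT INTO dimension   VALUES (1, 'Completeness');
INSERT INTO concept     VALUES (10, 1, 'Record population');
INSERT INTO dq_rule     VALUES (100, 10, 'customer.email must be populated');
INSERT INTO test_result VALUES (1, 100, '2026-04-14', 97.5);
""")

# Denormalised view joining a result back to its dimension, ready for
# any BI tool (SAP, Qlik, etc.) to report on.
row = conn.execute("""
    SELECT d.name, r.description, t.value
    FROM test_result t
    JOIN dq_rule   r ON r.rule_id = t.rule_id
    JOIN concept   c ON c.con_id  = r.con_id
    JOIN dimension d ON d.dim_id  = c.dim_id
""").fetchone()
print(row)  # ('Completeness', 'customer.email must be populated', 97.5)
```

Because the repository exposes plain tables like these rather than a proprietary UI, any existing BI tool can sit on top of them directly.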

Figure 11 The DQ Repo Data Model

Figure 12 The DQ Repo Data Model – Focus Areas

Figure 13 The ABC Company Data Model for Training

Figure 14 The ABC Company Data Model for Training – Focus Areas

Figure 15 Lightweight Architecture

Figure 16 Scalable Approach

How does the Framework Handle Varying Data Requirements?

Dan concluded with an insightful Q&A that touched on how the repository handles context. When asked how the framework accommodates “fit for purpose” requirements—such as varying precision needs between marketing and healthcare—Dan explained that context is applied through custom thresholds on the data quality rules. The baseline metrics remain universal, but different departments can set different acceptable limits for the exact same attribute.
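The threshold idea can be sketched in a few lines: one universal metric, evaluated against different acceptable limits per department. The departments and values are illustrative.

```python
# One universal metric for an attribute (hypothetical value).
measured_completeness = 96.0  # e.g. completeness of customer.phone

# Department-specific thresholds layered on the same rule (illustrative).
thresholds = {"marketing": 90.0, "healthcare": 99.0}

status = {dept: ("pass" if measured_completeness >= limit else "fail")
          for dept, limit in thresholds.items()}
print(status)  # {'marketing': 'pass', 'healthcare': 'fail'}
```

The measurement itself is computed once; only the judgement of "fit for purpose" varies by context.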

Looking forward, if the industry widely adopts this conformant schema, it could eventually enable cross-company benchmarking to determine realistic standards for specific data elements. The webinar wrapped up with promises of a deeper dive in a future session.

Figure 17 Why Data Quality Matters

Figure 18 Call to Action

If you would like to join the discussion, please visit our community platform, the Data Professional Expedition.

Additionally, if you would like to watch the edited video on our YouTube channel, please click here.

If you would like to be a guest speaker on a future webinar, kindly contact Debbie (social@modelwaresystems.com)

Don’t forget to join our exciting LinkedIn and Meetup data communities so you don’t miss out!
