How to Prepare for the CDMP Data Quality Specialist Exam

CDMP DQ Specialist Exam

Webinar Details

Title: How to Prepare for the CDMP® Data Quality Specialist Exam
Date: 23 August 2023
Presenter: Howard Diesel
Meetup Group:
Write-up Author: Howard Diesel

Data Quality and Business Strategies

During the presentation, Howard emphasised the importance of accurately mapping dimensions and understanding different perspectives. He also highlights the need to understand management operations and statistical approaches thoroughly. Excel was recommended for exam calculations. Data quality activities were discussed, starting with defining high-quality data and developing a business case or strategy to identify business needs, requirements, and data quality expectations. No further details were provided on tools and templates.

CDMP DQ (Data Quality) Exam Structure and Module breakdown

Figure 1 CDMP DQ (Data Quality) Exam Structure and Module breakdown

Data Quality Activities

Figure 2 Data Quality Activities

Importance of Business Case and Data Quality Program in Project Management

Make a business case that outlines the benefits and value to justify pursuing projects. To address data quality problems, first understand the cost. The CIO’s views are crucial. Financial metrics like cash flow and return on investment are important. A business case must address critical questions about the data quality problem, the organisation’s strengths and weaknesses, and the resources and training needed.

DQ Business Case: Strategy

Figure 3 DQ Business Case: Strategy

Business Case Definition

Figure 4 Business Case Definition

Business Case Methodology

Figure 5 Business Case Methodology

Business Case Overview

Figure 6 Business Case Overview

Business Case Cost/Benefit

Figure 7 Business Case Cost/Benefit

The Role of Data Stewards in Defining Data Quality Expectations and Connecting Data Lineage to Business Processes

As a data steward, you are responsible for establishing data quality expectations and identifying issues that may arise in business operations. It is important to connect various elements, such as BI reports, application processes, and data lineage, to ensure accurate and reliable reporting. Tracing data lineage back to its source is crucial to this role.

To support your work, it is essential to clearly understand the data lineage from the report back to the dataset and table. The success of a data quality program can be measured by its return on investment, which should exceed the cost of quality improvement. If you’re interested in exploring data quality in greater depth, we recommend checking out Tom Redman’s book “People and Data,” which delves into the concept of “Friday afternoon measurement.”

Critical Business Case Questions

Figure 8 Critical Business Case Questions

The Importance of Data Quality for Cost Impact on Business Reports

As part of his job, Howard works with various business departments to analyse and measure important data elements from top records. These attributes, such as customer name, size, colour, and transaction amount, are displayed on the screen and are related to retail transactions. To determine the quality of each record, the writer evaluates it and assigns a score out of 100. Business personnel also evaluate these records and assign a score of 1 for perfect and 0 for imperfect.

To ensure data accuracy, the writer applies the “rule of 10,” under which defective data can cost up to 10 times as much as perfect data. The writer calculates the cost impact of mistakes on reports, which can be significant if there are many imperfect records. Based on the findings, you may extrapolate the data and provide an estimated cost impact for the entire dataset.

Using Force Field Analysis and SWOT Analysis for Business Impact

During the presentation, Howard recommended using the “rule of 10” to estimate the impact of various factors on the business. He supported this approach by referring to credible papers found online. Howard also introduced force field analysis as a tool to understand the organisation’s position, similar to a SWOT analysis but with a numeric allocation to quantify forces.

The driving and restraining forces were discussed, including strengths, opportunities, external threats, weaknesses, and capabilities. Examples of technology failures and their impact on the business were provided. Howard explained how restraining forces push the problem down while driving forces alleviate it. He also referenced the Ishikawa diagram as a model for ordering and positioning challenges, highlighting external issues, internal threats and opportunities, and the role of data and technology. The potential benefits of using AI for quality enhancement were mentioned, and the concept of calculating benefits based on uplift from a baseline was explained.

DQ Initial Assessment with Cost Impact

Figure 9 DQ Initial Assessment with Cost Impact

Change Assessment Force-Field Analysis

Figure 10 Change Assessment Force-Field Analysis

High-quality data, improvement cycles, business impact, SWOT analysis, and tiles analysis

Our calculations show that a significant amount of money has been saved, which is a great benefit. The Juran Trilogy emphasises the importance of using high-quality data and continuously improving processes to achieve desired outcomes. To address our chronic waste of 33, we plan to improve by using the burrito principle and select the next target. In the business case, we prioritise data quality over data science.

The fishbone diagram shows how incorrect technology can negatively affect agility. Through tile analysis, we can identify how to leverage our strengths to capitalise on opportunities, such as expanding internationally or launching new products. Our strengths, brand, and reputation can also help us address threats, such as negative online reviews and declining revenue. To determine potential risks, we evaluate our weaknesses and threats.

Example: Business Impact of Data Science

Figure 11 Example: Business Impact of Data Science

Issues Caused by Lack of Leadership

Figure 12 Issues Caused by Lack of Leadership

Threats and Opportunities arising from DQ Challenges

To mitigate weaknesses and drive business growth, data architects should act as agents of transformation and utilise their strengths to seize opportunities and address potential dangers posed by emerging technologies and data quality issues.

One approach is to use Excel spreadsheets to quantify the opportunities and threats the business faces, applying a plus-one multiplier to opportunities, resulting in a positive value. Similarly, the threats and weaknesses can be quantified using appropriate multipliers. Overall, the force field analysis indicates a favourable market position, with a score of seven.

SWOT Analysis and TOWS Analysis

Figure 13 SWOT Analysis and TOWS Analysis

Change Assessment Force-Field Analysis (FFA) for ChatGPT

Figure 14 Change Assessment Force-Field Analysis (FFA) for ChatGPT

Scoring the FFA (Template)

Figure 15 Scoring the FFA (Template)

Establishing a Common Language and Building a Business Case

Establishing common terms and definitions is important to avoid quality issues before selecting a Common Data Environment (CDE). The six friends classification (why, who, what, when, where and how) can be used to define a business definition. For instance, a learner is a person or group who seeks to enhance their professional capabilities by using learning resources. Defining what a CDE means requires time and agreement from multiple stakeholders.

By analysing the people in an organisation, the transition from regular employees to data managers and, eventually, data provocateurs or champions can be determined. The growth path for data management entails moving from regular employees to data managers to data champions. The DQ superhero team includes a data entrepreneur, a data steward, and other team members. When making decisions in the business case, it is important to consider and work through various factors.

FFA: ChatGPT Modelware Systems

Figure 16 FFA: ChatGPT Modelware Systems

Establish a Common Language Approach

Figure 17 Establish a Common Language Approach

The Work Team

Figure 18 The Work Team

Understanding Business Case and Data Quality Dimensions

The topic of the business case is intricate and not specific to any data quality area. When it comes to Master Level questions, this is an area that is often emphasised. During a discussion, one participant inquired if anyone had created a business case for data quality. Henry clarifies that they utilised the 1-10 theory to represent the business case visually and focused on analysing the benefits and costs.

The 1-10 theory helps incorporate data, and tools like Information Steward can display values in the dataset. This section primarily highlights the justification for the improvement, the cost analysis, and the benefits. The DMBOK recognises that there is no single definition of dimensions and that thought leaders have diverse perspectives. This statement alludes to Dan Myers’ idea of conformed dimensions.

DQ and Data Team "Superhero's"

Figure 19 DQ and Data Team “Superhero’s”

Different Perspectives on Data Quality Dimensions

When evaluating data quality, it is important to consider completeness on both a column level and across interdependent fields in a table. The process of populating tables and schemas is crucial to achieving data completeness. Frameworks for assessing data quality, such as Strong Wang’s, categorise dimensions into intrinsic factors like data accuracy, objectivity, believability, and reputation, as well as textual, symbolic, and accessibility dimensions.

Howard notes that Tom Redmond takes a different approach, categorising dimensions into data model, data value, and data representation. Meanwhile, Larry English discusses both inherent and pragmatic dimensions of data quality. Dami UK provides definitions for these dimensions, but it is important to note that there are multiple dimensions to consider. When evaluating data quality, several key factors include having enough data, ensuring accuracy, consistency, and integrity, and ensuring timeliness.

Data Quality Dimensions: 11%

Figure 20 Data Quality Dimensions: 11%

Strong-Wang Framework

Figure 21 Strong-Wang Framework

Ted Redman: Data Model, Data Values and Representation

Figure 22 Ted Redman: Data Model, Data Values and Representation

Larry English, Inherent and Pragmatic

Figure 23 Larry English, Inherent and Pragmatic

Data Quality Dimensions

Figure 24 Data Quality Dimensions

Importance of Dimensions in Data Quality

Businesses can select the key elements that meet their current needs using a set of Dimensions. According to Dan Myers, data accuracy is about agreeing with the real world and the precision of data values. It is important to consider the interconnectivity among different dimensions of data quality. ISO 25 000 is a standard for products, and it includes various levels of data product quality. Ensuring data quality is fit for purpose is crucial and involves considering different perspectives and passing all necessary checks for the chosen dimensions.

Data Quality Dimension Focus

Figure 25 Data Quality Dimension Focus

Data Quality Dimension Relationships

Figure 26 Data Quality Dimension Relationships

ISO 25000 Data Product DQ Dimension, Quality of Data Product

Figure 27 ISO 25000 Data Product DQ Dimension, Quality of Data Product

DQ Fit for Purpose: 12%

Figure 28 DQ Fit for Purpose: 12%

Understanding the Importance of Data Quality for Specific Purposes

When conducting a study, it’s important to determine the specific features and population to be measured, as in the case of measuring diabetes. This is where a cohort comes in handy, as it helps identify the specific patient population to include in the study. By defining the purpose, the data collection requirements and measurement value set can be determined, which ensures that data quality expectations are met. The cohort serves as a system of records, similar to a master data situation, providing necessary information.

It’s crucial to avoid overtraining or overengineering data in machine learning, as this can lead to biased results and poor data quality. Instead, data that’s fit for purpose should follow the Goldilocks principle – neither too much nor too little. Poor data quality can have negative impacts, including incorrect decision-making, regulatory submission issues, revenue loss, increased costs, and reputational damage. Data quality management is critical in addressing these issues and ensuring data reliability. Working with data requires recognising its value and trustworthiness.

Fit for Purposes in Populations Analytics

Figure 29 Fit for Purposes in Populations Analytics

Issues caused by fixing issues.

Figure 30 Issues caused by fixing issues.

How Poor Data Quality Impacts Organisations & Individuals

Figure 31 How Poor Data Quality Impacts Organisations & Individuals

Understanding and Managing Data Quality

It is a common misconception that data is always accurate, which can make it difficult to maintain data quality. Since no organisational business processes are perfect, data quality must be continuously aligned with these processes. It is crucial to convince the CIO that data quality is an ongoing program, not a one-time project. To effectively manage data quality, updating the culture and adopting a quality mindset is essential.

A comprehensive methodology, such as ISO’s data quality framework, can be chosen to ensure comprehensive data quality management. The impact of data architecture, integration, operations, security, and resources on data quality should be understood. 11 common causes of data quality problems include leadership gaps and system design issues. Prevention, correction, automation, and statistical approaches can improve data quality. To understand and analyse data quality, one must familiarise oneself with data profiling tools and their purpose.

DQ Introduction: 10%

Figure 32 DQ Introduction: 10%

DQ Management: 11%

Figure 33 DQ Management: 11%

DQ Management Detailed Structure

Figure 34 DQ Management Detailed Structure

ISO 8000 Parts

Figure 35 ISO 8000 Parts

DQ Operations: 11%, Common Causes of DQ Issues

Figure 36 DQ Operations: 11%, Common Causes of DQ Issues

DQ Techniques

Figure 37 DQ Techniques

DQ Statistical Approaches: 11%

Figure 38 DQ Statistical Approaches: 11%

DQ Tools: 11%

Figure 39 DQ Tools: 11%

Understanding the Importance of Data Profiling and Data Quality Operations

Data quality operations begin with profiling, which involves statistically assessing existing data structures to create effective data quality rules and identify problem areas. Data querying tools can be used to identify issues such as null counts and their locations in the database. Readiness is crucial for successful data quality initiatives, which involve convincing people to shift from an application-centric to a data-centric mindset and assessing the actual state of data.

Both culture and technical readiness play important roles in implementation. Examples of potential exam questions may include analysing the validity of sand composition, understanding census data, assessing data age and currency, addressing field overloading issues, and measuring stakeholder accountability. Continuous improvement and goal setting are essential for data quality operations to ensure the benefits outweigh the costs.

Readiness Assessment

Figure 40 Readiness Assessment

DQ Questions: Application

Figure 41 DQ Questions: Application

DQ Specialist Exam

Figure 42 DQ Specialist Exam

If you would like to join the discussion, please visit our community platform, the Data Professional Expedition.

Additionally, if you would like to watch the edited video on our YouTube please click here.

If you would like to be a guest speaker on a future webinar, kindly contact Debbie (social@modelwaresystems.com)

Don’t forget to join our exciting LinkedIn and Meetup data communities not to miss out!

Scroll to Top