Metadata Management for Data Citizens
Executive Summary
This webinar provides an overview of key topics in Metadata Management, focusing on the Data Citizen, Data Stewardship, and Data Analysis. Drew Kennedy covers the importance of Metadata in understanding data origins and quality, and how to implement effective Metadata Management in business. He highlights the role and responsibilities of Data Stewardship in ensuring data authenticity and quality. The webinar touches on the evolution and impact of Data Analysis tools and outlines strategies and challenges in building a Data Management office. Overall, ‘Metadata Management for Data Citizens’ underscores the significance of collaboration and effective Data Management practices in driving business success.
Webinar Details
Title: Metadata Management for Data Citizens
Date: 16 October 2024
Presenter: Drew Kennedy
Meetup Group: Data Citizens
Write-up Author: Howard Diesel
Contents
Managing Data: Metadata and Data Catalogues
Understanding the Origins and Importance of Metadata
Understanding the Importance of Data Sets in Business
Data Quality Rules and Results in Business Metadata
Data Authenticity, Quality, and Metadata in AI Models
Implementing Effective Metadata Management in Business
Technical Metadata and Data Management
Operational Metadata and Data Management
“Stewardship” in History and Data Management
Understanding the Role and Responsibilities of Data Stewardship
Metadata Management in Business
Role and Responsibilities of Data Stewardship in Business
Operational Data Modelling and Metadata
The Importance of Collaboration: Data Stewardship in Business
The Role and Collaboration in Data Stewardship
Data Stewardship in Business Environments
Dynamics of Data Stewardship in Business
Evolution and Impact of Data Analysis Tools
Building a Data Management Office: Strategies and Challenges
Managing Data: Metadata and Data Catalogues
During the webinar, Howard Diesel asked Drew Kennedy who is responsible for the data catalogue and how it resembles data lineage in consolidating data assets and their locations. Drew starts by emphasising the importance of business Data Stewards contributing to the catalogue, and notes the capabilities of newer tools for scanning databases. He then touches on best practices in managing Metadata, such as avoiding data overload and integrating Metadata Management with input from areas such as quality, BI, integration, and Architecture. Drew highlights the business glossary as the front end to Metadata, and its benefits for staff induction and as a living document for business terms and data-sharing agreements.
Drew refers to Metadata as a "Rockstar," emphasising its role in impact analysis and avoiding data duplication. Metadata plays a significant role in understanding data lineage, especially in regulatory compliance, such as BCBS 239, where it is necessary to demonstrate the journey of data from its source to its utilisation. Drew also emphasises involving business stakeholders in understanding Metadata, noting the dividends this can pay.
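The impact analysis and lineage described above can be illustrated with a small sketch. This is not from the webinar: the asset names and the lineage edges are hypothetical, and a real catalogue tool would hold this graph in its own repository. It only shows how recorded lineage answers "what is affected if this source changes?" and "where did this report's data come from?".

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the data assets listed in its value.
LINEAGE = {
    "crm.customer": ["staging.customer"],
    "staging.customer": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["report.risk_exposure", "report.campaign_list"],
    "pos.sales": ["warehouse.fact_sales"],
    "warehouse.fact_sales": ["report.risk_exposure"],
}


def downstream(asset, lineage):
    """Impact analysis: every asset reachable from `asset`, breadth-first."""
    seen = []
    queue = deque(lineage.get(asset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.append(current)
        queue.extend(lineage.get(current, []))
    return seen


def upstream(asset, lineage):
    """Lineage: every recorded source that feeds `asset`, directly or indirectly."""
    return sorted(src for src in lineage if asset in downstream(src, lineage))
```

Calling downstream("crm.customer", LINEAGE) lists the staging table, the warehouse dimension, and both reports, which is the set of stakeholders to warn before changing the CRM source. Calling upstream("report.risk_exposure", LINEAGE) traces the report back to its sources, the kind of journey a regulator may ask to see.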
Figure 1: Table of Content
Understanding the Origins and Importance of Metadata
The previous webinar, ‘Metadata Management for Data Managers,’ covered the origins of Metadata, and Drew recalls the Library of Alexandria and the Clementinum library in Prague. He then reminds the attendees of how Metadata provides specific information about data, and of the different types of Metadata, including business terms and synonyms. The importance of capturing business rules in Metadata is highlighted, along with examples such as age restrictions for new customers and specific risk calculations.
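As a rough illustration of a business rule captured as Metadata, the sketch below records a term, its synonyms, and an age restriction for new customers, then evaluates the rule. The field names, the synonyms, and the threshold of 18 are assumptions made for the example; the webinar did not state a specific age.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BusinessRule:
    term: str          # business term the rule belongs to
    synonyms: tuple    # alternative names used in the business (hypothetical)
    description: str   # plain-language statement of the rule
    minimum_age: int   # assumed threshold, not a value from the webinar


NEW_CUSTOMER_AGE = BusinessRule(
    term="New Customer",
    synonyms=("Prospect", "Applicant"),
    description="A new customer must be at least the minimum age on the application date.",
    minimum_age=18,
)


def age_on(birth_date, as_of):
    """Whole years between birth_date and as_of."""
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    return as_of.year - birth_date.year - (0 if had_birthday else 1)


def satisfies(rule, birth_date, as_of):
    """True when the person meets the rule's minimum age on `as_of`."""
    return age_on(birth_date, as_of) >= rule.minimum_age
```

Keeping the threshold in the rule record rather than in application code means the glossary entry and the check cannot drift apart.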
Figure 2: The Burden of the Metadata Manager
Figure 3: The Practice of Metadata in Librarianship
Figure 4: Who, What, Where, Why, When and How of Metadata
Figure 5: Business Metadata
Understanding the Importance of Data Sets in Business
The definition of data sets, tables, and columns is crucial in understanding the structure and purpose of data. When a data set is stored in a data lake, such as "weekly sales by customer, by store, by product," it's essential to know the columns and other details, including stakeholder information. This knowledge is vital for ownership, change control, and ensuring downstream stakeholders know the data set's attributes and usage. Whether the data is used for marketing campaigns or creating data scores, it's important for those engineering the data to be aware of the downstream stakeholders for effective management and collaboration.
Data Quality Rules and Results in Business Metadata
Business Metadata includes data quality rules and results, providing insight into the accuracy and standards of datasets over time. Technical Metadata offers a closer look at the actual results while documenting data standards and transformations, such as rolling up daily sales to weekly and monthly sales. This documentation ensures alignment with data quality rules and defines transformation rules for business terms like "weekly sale" and "monthly sale," specifying conditions for their calculation and processing.
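The roll-up from daily to weekly and monthly sales can be sketched as follows. The sample rows are invented, and the choice of ISO weeks (Monday to Sunday) as the definition of "weekly sale" is an assumption; the point is that the definition must be documented somewhere, because a different week boundary gives different totals. The reconciliation check at the end is one simple example of a data quality rule whose result could be recorded over time.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales rows: (sale_date, amount).
DAILY_SALES = [
    (date(2024, 10, 14), 120.0),  # Monday, ISO week 42
    (date(2024, 10, 15), 80.0),
    (date(2024, 10, 20), 50.0),   # Sunday, still ISO week 42
    (date(2024, 10, 21), 200.0),  # Monday, ISO week 43
]


def weekly_sales(rows):
    """Transformation rule (assumed): a 'weekly sale' is the total per ISO week."""
    totals = defaultdict(float)
    for sale_date, amount in rows:
        iso = sale_date.isocalendar()
        totals[(iso[0], iso[1])] += amount  # (ISO year, ISO week number)
    return dict(totals)


def monthly_sales(rows):
    """Transformation rule (assumed): a 'monthly sale' is the total per calendar month."""
    totals = defaultdict(float)
    for sale_date, amount in rows:
        totals[(sale_date.year, sale_date.month)] += amount
    return dict(totals)


def reconciles(rows, rolled_up, tolerance=0.005):
    """Data quality rule: the rolled-up totals must add up to the daily total."""
    daily_total = sum(amount for _, amount in rows)
    return abs(daily_total - sum(rolled_up.values())) <= tolerance
```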
Data Authenticity, Quality, and Metadata in AI Models
Data lineage and quality assurance are paramount, especially when dealing with AI models and monetising data. Documenting data-sharing agreements and business Metadata can help prevent ambiguity, duplication, and costly errors. Organisations can ensure that sensitive information is properly guarded and archived by categorising data based on security and privacy classifications. Implementing Metadata-driven rules can streamline Data Management processes while addressing retention and privacy recommendations. This structured approach to Metadata complements the need for clear business terms and attributes, providing a robust framework for Data Governance and compliance.
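A Metadata-driven rule of the kind described above might look like the sketch below. The classifications, the retention periods, and the attribute names are all hypothetical. What it illustrates is the pattern: handling is looked up from the classification recorded against each attribute, so changing a classification changes masking and retention without changing code.

```python
from datetime import date

# Hypothetical policy table keyed by security/privacy classification.
POLICY = {
    "public": {"mask": False, "retention_years": 1},
    "internal": {"mask": False, "retention_years": 3},
    "confidential": {"mask": True, "retention_years": 7},
}

# Hypothetical attribute-level Metadata: attribute -> classification.
ATTRIBUTES = {
    "customer.email": "confidential",
    "customer.segment": "internal",
    "store.address": "public",
}


def handling_for(attribute):
    """Look up the handling rules implied by an attribute's classification."""
    classification = ATTRIBUTES[attribute]
    return {"classification": classification, **POLICY[classification]}


def apply_masking(record):
    """Mask every value whose attribute is classified as requiring it."""
    return {
        attribute: "*" * len(str(value)) if handling_for(attribute)["mask"] else value
        for attribute, value in record.items()
    }


def retention_ends(attribute, created):
    """Date after which the value may be purged or archived."""
    years = handling_for(attribute)["retention_years"]
    try:
        return created.replace(year=created.year + years)
    except ValueError:  # 29 February landing in a non-leap year
        return created.replace(year=created.year + years, day=28)
```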
Implementing Effective Metadata Management in Business
A discussion starts with the classification of Operational Metadata defined by the DMBoK. It then delves into the distinction between Business and Operational Metadata, particularly in the context of data-sharing rules, agreements, archive and retention, and security education. Emphasis is placed on understanding the data life cycle and the challenges of classifying Metadata at an attribute level across various systems. The importance of referencing the DMBoK for accurate answers is stressed, and Drew advises double-checking responses based on the data life cycle.
Technical Metadata and Data Management
Technical Metadata encompasses various data properties from dictionaries, including database properties, column names, physical tables, access permissions, and CRUD (Create, Read, Update, Delete) rules. It also involves E (Execute) rules for executing stored procedures and functions, ETL rules, source-to-target mapping, data lineage documentation, API rules, data quality reports, data access rights, and data groups. Understanding Technical Metadata is crucial for maintaining consistency and facilitating debugging, especially when sharing data via APIs. This information is typically stored in repositories and data dictionaries and is essential for anyone working with data at a technical level.
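To make a few of these items concrete, here is a minimal sketch of Technical Metadata records for a table and a stored procedure, with CRUD and E permissions per role and a source-to-target mapping. The class names, roles, and column names are hypothetical and not drawn from any particular tool. The final function shows the kind of consistency check this Metadata allows: finding mappings that point at columns the data dictionary does not know about.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseObject:
    name: str
    kind: str                                        # "table" or "procedure"
    columns: dict = field(default_factory=dict)      # column name -> data type
    permissions: dict = field(default_factory=dict)  # role -> set of letters from "CRUDE"

    def allowed(self, role, action):
        """True when `role` holds the C, R, U, D, or E right on this object."""
        return action in self.permissions.get(role, set())


@dataclass
class Mapping:
    source: str  # "table.column" in the source system
    target: str  # "table.column" in the target
    rule: str    # transformation applied on the way


def unknown_targets(mappings, objects):
    """Mappings whose target column is missing from the data dictionary."""
    known = {f"{obj.name}.{column}" for obj in objects for column in obj.columns}
    return [m.target for m in mappings if m.target not in known]


FACT_SALES = DatabaseObject(
    name="fact_sales",
    kind="table",
    columns={"sale_date": "DATE", "store_id": "INTEGER", "amount": "DECIMAL(12,2)"},
    permissions={"analyst": {"R"}, "etl_service": {"C", "R", "U", "D"}},
)
REFRESH_SALES = DatabaseObject(
    name="refresh_sales",
    kind="procedure",
    permissions={"etl_service": {"E"}},
)
MAPPINGS = [
    Mapping("pos.txn_date", "fact_sales.sale_date", "cast to DATE"),
    Mapping("pos.txn_total", "fact_sales.amount", "round to 2 decimals"),
    Mapping("pos.till_id", "fact_sales.till_id", "copy"),
]
```

Here unknown_targets(MAPPINGS, [FACT_SALES, REFRESH_SALES]) flags fact_sales.till_id, a mapping that would fail at load time and is far cheaper to catch in the Metadata than in a debugging session.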
Figure 6: Technical Metadata
Operational Metadata and Data Management
Operational Metadata is found in archiving and retention records, error logs, job schedules, batch logs, replication status, audit and control measures, data movement details, and data processing and access information. The focus must be on understanding data movement and what happens during it, as well as volumetrics and usage patterns, to anticipate and address potential performance issues. Operational Metadata aids in monitoring data processing and access, distinguishing data at rest from data in motion, and understanding usage patterns to optimise performance and set appropriate SLAs.
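As an illustration of putting Operational Metadata to work, the sketch below reads a batch log, flags runs that failed or overran their SLA, and derives a throughput figure for volumetric trending. The job names, log fields, and SLA values are invented for the example; a real scheduler or ETL tool would expose its own log format.

```python
from datetime import datetime, timedelta

# Hypothetical batch log entries, one per job run.
BATCH_LOG = [
    {"job": "load_sales", "start": datetime(2024, 10, 16, 1, 0),
     "end": datetime(2024, 10, 16, 1, 40), "rows": 1_200_000, "status": "ok"},
    {"job": "load_sales", "start": datetime(2024, 10, 17, 1, 0),
     "end": datetime(2024, 10, 17, 2, 15), "rows": 2_500_000, "status": "ok"},
    {"job": "load_customers", "start": datetime(2024, 10, 17, 3, 0),
     "end": datetime(2024, 10, 17, 3, 10), "rows": 0, "status": "failed"},
]

# Hypothetical SLAs: maximum allowed run time per job.
SLA = {"load_sales": timedelta(hours=1), "load_customers": timedelta(minutes=30)}


def sla_breaches(log, sla):
    """Runs that failed or took longer than the job's SLA allows."""
    breaches = []
    for run in log:
        duration = run["end"] - run["start"]
        if run["status"] != "ok" or duration > sla[run["job"]]:
            breaches.append(
                (run["job"], run["start"].date().isoformat(), run["status"], duration)
            )
    return breaches


def rows_per_minute(run):
    """Throughput of a single run, useful for spotting volume growth."""
    minutes = (run["end"] - run["start"]).total_seconds() / 60
    return run["rows"] / minutes
```

In this sample the second load_sales run handled roughly twice the rows and overran its one-hour SLA, which is the sort of volumetric signal that suggests either tuning the job or renegotiating the SLA.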
Figure 7: Operational Metadata
“Stewardship” in History and Data Management
The concept of Stewardship involves the responsibility to care for something that is not one's own and implies duty, trust, and accountability. Stewardship has been evident throughout human history in various cultures and societies, with examples ranging from historical practices to modern-day Data Management. Specifically, Data Stewardship involves the ongoing responsibility of ensuring that data remains valuable to the enterprise, with Data Stewards managing data assets in the organisation's best interests.
Figure 8: Definitions of "Stewardship"
Understanding the Role and Responsibilities of Data Stewardship
Drew talks about the roles and responsibilities of Data Stewards, focusing on business Data Stewards and technical Data Stewards. Business Data Stewards are typically individuals involved in defining requirements, setting standards, and managing system enhancements, often with a deep understanding of the business domain. In specific industries, such as retail and insurance, business Stewards may include buyers, buyer's assistants, or sales personnel. On the other hand, technical Data Stewards focus on technical aspects like data quality, security, and integration, with expertise in modelling, warehousing, and governance tools. There is also mention of Metadata Stewards who collaborate with business and technical Data Stewards to maintain Metadata and refine data onboarding processes. Drew also mentions the involvement of Data Owners and the need for collaboration between different types of Stewards to ensure effective Data Management.
Figure 9: Types of Data Stewards
Metadata Management in Business
A discussion starts on managing business Metadata and the need to tailor information for specific stakeholder groups. Drew mentions that Metadata tools allow for the creation of tailored views for different audiences, ensuring that individuals receive the necessary level of detail without being overwhelmed. He highlights the importance of organising Metadata by domain and by the critical data elements within each domain, and the need to align with business and data strategies. The discussion also covers the approach of starting with a high-level overview and drilling down into deeper detail for technical users or analysts.
Role and Responsibilities of Data Stewardship in Business
Data Management involves various aspects, such as Operational Metadata, Technical Metadata, Business Metadata, and unstructured data. Data Stewards, including business Data Stewards and technical Data Stewards, play crucial roles in capturing and managing information. Business Data Stewards are responsible for understanding the business definition of terms, capturing information on data creation and usage, and ensuring compliance with rules and regulations. Technical Stewards focus on the technical aspects of data, including storage, usage, and backup. They also manage data flow, ownership, and technical naming standards. Additionally, Drew notes that Data Stewards need to be aware of security and privacy requirements, regional data regulations, and business drivers for the data. They should also track data creation dates, update frequencies, purging needs, formatting, and storage locations.
Figure 10: "It's easy to be a Data Steward"
Operational Data Modelling and Metadata
Drew moves on to the concept of Operational Metadata and its importance in understanding technical Metadata. A discussion then leads to the role of a Metadata Steward in managing the storage, origin, usage, and quality of Metadata, including its sources, such as Oracle or data modelling tools. There is an emphasis on incorporating Metadata considerations into the overall meta model, as well as the need to understand the Architecture and stakeholders involved in managing Metadata.
The Importance of Collaboration: Data Stewardship in Business
Gathering Metadata involves extracting information from various sources such as physical databases, ERP systems, CRM systems, BI tools, integration tools, quality tools, APIs, data models, data-sharing agreements, program specs, and dictionaries. Collaboration is essential for a Data Steward because data is often not black and white, so the role requires persuasion and cooperation with others. Different types of customers, such as loyalty customers and pharmacy patients, have unique attributes and data requirements, necessitating collaboration with different departments to ensure consistency and quality. Where collaboration breaks down, the issue is escalated to the Data Governance tiers for resolution.
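Extracting Metadata from a physical database, the first source in the list above, can be sketched with Python's built-in sqlite3 module. The customer table is a made-up example and SQLite stands in for whatever database an organisation actually runs. Commercial catalogue scanners do far more, but the principle of reading the database's own dictionary is the same.

```python
import sqlite3


def harvest_columns(connection):
    """Read table and column Metadata from SQLite's own catalogue."""
    catalogue = {}
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info returns one row per column:
        # (cid, name, type, notnull, dflt_value, pk)
        rows = connection.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalogue[table] = [
            {
                "column": name,
                "type": col_type,
                "nullable": not notnull,
                "primary_key": bool(pk),
            }
            for _cid, name, col_type, notnull, _default, pk in rows
        ]
    return catalogue


def demo_connection():
    """In-memory database with one hypothetical table, for illustration only."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE customer ("
        "customer_id INTEGER PRIMARY KEY, email TEXT NOT NULL, segment TEXT)"
    )
    return connection
```

The harvested entries give a Technical Data Steward a starting point; business definitions, ownership, and classifications still have to be added by people who know the data.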
Figure 11: Metadata is the "Who, Where, What, Why, When and How" of Data
Figure 12: Some Sources of Metadata
The Role and Collaboration in Data Stewardship
Collaboration between the Business Data Steward and various other teams is crucial for ensuring accurate and valid Metadata. These teams include Data Architects, Data Engineers, Data Scientists, Data Analysts, the users, and the risk, compliance, and privacy teams. Additionally, Metadata Stewards and the Data Governance team play a key role in this process.
The Technical Data Steward collaborates extensively with Data Architects, Data Engineers, Data Analysts, Data Scientists, Data Quality teams, and Database Administrators, as well as the Metadata, business, Data Governance, and integration teams. This collaboration involves capturing and validating Metadata, enhancing data, and ensuring its accuracy and value.
The Business and Technical Data Stewards play a vital role in understanding user and business requirements, contributing to data models, and driving Metadata standards and delivery. Their collaboration leads to improved teamwork, efficiency, and successful implementation of Data Strategies.
Figure 13: "Where Collaboration Starts"
Figure 14: Business Data Steward Collaboration
Figure 15: Technical Data Steward Collaboration
Figure 16: Sourced from DAMA DMBoK V2
Data Stewardship in Business Environments
Data Stewardship responsibilities can be divided among various individuals and teams in a banking environment. Subject matter experts and professionals such as engineers are often chosen as Data Stewards due to their deep understanding of how the business operates and their ability to collaborate effectively. Different departments within the organisation may have their own Data Stewards and, in some cases, associate Data Stewards report to the primary Data Steward. Data Stewardship responsibility may also shift based on the specific data being handled, such as customer data related to sales, credit, insurance, and geographic information. In some cases, data may be owned by the Enterprise, requiring an Enterprise Data Steward to ensure consistency and context across the organisation. Additionally, in one example, the responsibility for maintaining a customer 360 view fell to the person in charge of customer analytics, highlighting the importance of passion and expertise in managing specific data sets.
Drew highlights the complexity of maintaining quality, accuracy, and trustworthiness in a business setting, emphasising the importance of dedicated individuals. He recounts an experience with an insurance company and a shared service, explaining the challenges encountered with different product lines and customer ownership. Drew then humorously recalls a broker’s resistance to allowing customer interaction for quality control purposes.
Dynamics of Data Stewardship in Business
The trend of Data Governance specialists transitioning into Business Data Steward roles is mentioned. Drew notes a specific individual's move from a governance position to a business unit. He then expresses surprise at this shift, speculating that the individual may have benefited from implementing governance requirements within a business unit. A discussion then focuses on the evolving roles within Data Management, including the distinction between technical custodians responsible for data integrity and DBAs ensuring data availability. Overall, Drew appreciates the complexities of Operational, Business and Technical Metadata.
Evolution and Impact of Data Analysis Tools
Drew touches on the growing trend among analytical tools to focus on becoming the primary repository for Metadata and data cataloguing. Many organisations purchase governance, quality, and modelling tools with integrated Metadata capabilities. For example, IBM's Information Governance Catalogue (IGC) is built on a banking data model and offers extensive intelligence for reporting and lineage at a significant price point. This development suggests that other tools may follow a similar path, leading to an enhanced landscape for Metadata Management and Data Governance across various platforms.
Building a Data Management Office: Strategies and Challenges
When building a Data Management office from scratch, it is recommended to start with one of the primary data domains, such as customer or product, and to identify the critical data elements within that domain. This approach allows for focused alignment with the business strategy, such as customer centricity, and for incremental development of Data Management capabilities. Demonstrating results through small, value-delivering use cases and focusing on Critical Data and Metadata elements, Data Quality, and privacy considerations are essential for success. Additionally, considering the interconnectedness of various domains and the impact on overall business goals is crucial when demonstrating results based on key performance indicators (KPIs).
Drew emphasises the importance of understanding business needs and objectives before implementing policies and procedures. He talks about the significance of conducting a maturity assessment for specific domains and spending time in the business to identify challenges and opportunities related to cross-selling and up-selling. Drew then shares his frustration with business analysts who lack business acumen and calls for better communication between data professionals and business stakeholders. Lastly, he mentions the value of learning from experienced business leaders and gaining insight into business operations in order to translate requirements into actionable data strategies.
If you would like to join the discussion, please visit our community platform, the Data Professional Expedition.
Additionally, if you would like to be a guest speaker on a future webinar, kindly contact Debbie (social@modelwaresystems.com).
Don’t forget to join our exciting LinkedIn and Meetup data communities so that you don’t miss out!