Records Management – An Integral Part of Data Governance with Dr. Paul Mullon
Executive Summary
This webinar outlines key themes in records management, as presented by Dr. Paul Mullon, highlighting the distinctions between information governance and data governance, and emphasising the critical role of records management within these frameworks. It explores essential concepts, including records management standards, the evolution of artificial intelligence, and the intersection of knowledge management with information governance.
Dr. Paul Mullon addresses practical challenges, including balancing terminology in interdisciplinary environments, long-term preservation strategies, and classification related to information security. Additionally, he considers the implications of AI on master data systems, file plans in government settings, and the need to address churn rates across various departments while aligning data management practices with organisational objectives.
Webinar Details
Title: Records Management – An Integral Part of Data Governance with Dr. Paul Mullon
Date: 2025-07-16
Presenter: Dr. Paul Mullon
Meetup Group: DAMA SA User Group Meeting
Write-up Author: Howard Diesel
Contents
Records Management with Dr. Paul Mullon
Distinguishing between Information Governance and Data Governance
Understanding the Importance of Records Management
Understanding Records Management and Data Governance
Records Management Standards and Concepts
The Concept of Information Governance and Data Management
Balancing Practicality and Terminology in Interdisciplinary Environments
Evolution and Challenges of Artificial Intelligence
Intersection of Knowledge Management, Information Governance, and AI
Understanding the Frameworks in Records Management
Long-Term Preservation and Records Management
Classification in Records Management and Information Security
AI, Master Data Systems, and File Plans in Government Environments
Addressing Churn Rates in Different Departments
Data Management and Organisational Alignment
Records Management with Dr. Paul Mullon
Veronica opened the webinar and expressed her excitement about Dr. Paul Mullon's return, recalling their last interaction at a pre-COVID chapter meeting hosted by Standard Bank in Rosebank. She emphasised the significance of Paul's insights on records management, noting that while not everyone in attendance specialises in the field, the topic remains relevant.
Distinguishing between Information Governance and Data Governance
Dr. Paul Mullon, who has a background in IT and enterprise content management (ECM), shared insights from over 45 years in the industry, particularly in the realm of unstructured content. He highlighted his early 2000s contribution to defining "information governance," a concept that has since undergone significant evolution.
Initially involved in image and document management, Paul noted the integration of records management with emerging concerns, such as privacy and information security. He emphasised the disconnect between structured and unstructured data, drawing on his background in database management during the early days of databases at IBM and his subsequent transition into the unstructured content space.
Paul shared that he has extensive experience in ISO standards development, particularly in enterprise content management and records management, and was part of the team that authored the standard on information governance. He emphasised a distinction between records managers and data management practitioners, noting that records managers often lack an understanding of databases and modern digital formats, focusing instead on tangible artefacts that resemble traditional paper documents. Paul is currently involved in creating an ISO standard aimed at managing data as a record, which gives him a unique perspective on the intersection of information governance and data management.
Identifying as an IT professional and acknowledging involvement in the field of information management, Paul emphasised the differences in terminology between IT and records management. He is part of a drafting team focused on electronic discovery (E-discovery) and bridging the gap between the IT world and records management through the development of a terminology crosswalk. Lastly, Paul highlighted that information governance and data governance are distinct concepts, noting the existence of a separate profession—comprising librarians, archivists, and scanning experts—that operates alongside IT professionals, further complicating discussions about information management.
An attendee asked about the distinction between information governance, data governance, and IT governance. The attendee noted that the King 5 draft report positions information governance as a key focus, which may indicate that the report carries more weight for information governance than for data governance. In response, Paul reflected on his long-standing advocacy for this differentiation, noting that while these terms are often used interchangeably, they represent distinct areas of governance with unique implications and responsibilities.
Paul then moved on to the interconnected yet distinct nature of IT governance and information governance, highlighting a notable shift in King 4 towards recognising this difference by addressing both information and technology governance. He touched on his previous interactions with King, acknowledging a certain level of arrogance regarding the importance of encompassing information governance in the dialogue. Additionally, Paul noted the emergence of various professional disciplines within this field, each valid and robust, but expressed concern about the overlaps that may exist between them.
Figure 1 Understanding Records Management
Understanding the Importance of Records Management
The importance of proper records management, data governance, and data management in ensuring privacy was emphasised during presentations by the information regulator, particularly in government settings. While records management is a recognised and important discipline with established professionals, it is not the sole focus; understanding the nuances between various frameworks is crucial.
Paul then highlighted definitions from the National Archives and Records Service of South Africa Act, which governs records management in the public sector, alongside ISO definitions relevant to the private sector, showcasing both the similarities and the distinct approaches of each. Attendees were encouraged to familiarise themselves with the terminology and frameworks so that they can better navigate the interconnectedness of these fields.
Figure 2 Agenda
Figure 3 A Few Definitions (NARSSA)
Understanding Records Management and Data Governance
Records management encompasses the entire life cycle of records, including their creation, maintenance, use, and disposal. It is vital for proper governance, as a record is defined as recorded information in any form or medium, which includes paper documents, electronic records, scanned images, photographs, videos, and audio files.
These documents serve as reliable evidence of an organisation’s activities, allowing individuals to assert definitively what actions were taken by a person or organisation. Understanding these definitions and concepts is essential for professionals across various fields.
The distinction between how database specialists and records managers view data and records is significant, leading to challenges in establishing a unified standard for managing data as a record. While both involve recorded information, their interpretations diverge, reflecting a broader disconnect between IT governance and records management.
Over the past six years, efforts to create a standard have not progressed, highlighting the confusion around key definitions, such as the relationship between data and information. Debates on whether data encompasses information or vice versa continue, as illustrated by discussions among professionals in the field.
The key takeaway from the discussion is the importance of recognising and protecting data or information assets within an organisation, regardless of their format—be it digital, paper, audio, or visual. Professionals are encouraged to move beyond the debate of terminology and focus instead on the effective management of all information assets. It’s vital to safeguard and treat these assets as valuable resources that require protection and responsible handling.
Figure 4 A Few Definitions (ISO)
Records Management Standards and Concepts
The organisation adheres to two key standards: the worldwide records management standard (ISO 15489) and the management system standard for records (ISO 30300 series), specifically ISO 30301, which aligns with other well-known management system standards, such as ISO 9001 and ISO 27001. These standards emphasise the criteria for a trustworthy record, focusing on attributes such as authenticity, reliability, integrity, and usability.
Introduced during a revision in 2016, these terms resonate with the principles of information security (confidentiality, integrity, and availability, or CIA), reflecting a shared terminology across different fields. This approach simplifies the discussion by consolidating these critical concepts into a cohesive framework.
The term "authoritative" refers to information, data, or records that embody key characteristics such as authenticity, reliability, integrity, and usability, making them trustworthy. This concept highlights the importance of ensuring that the sources we rely on are credible and can be fully trusted in various contexts.
Figure 5 A Few Definitions (ISO) Pt.2
The Concept of Information Governance and Data Management
The distinction between controlled information and controlled documents in the context of the ISO 9001 quality management system has historically caused confusion. Practitioners often grappled with whether something was a document, a record, or both. To address this, a broad definition was proposed, encompassing any information or data that an organisation needs to manage, regardless of its medium.
This approach aims to eliminate artificial distinctions and emphasises that organisations are fundamentally concerned with similar concepts. Additionally, it's essential to recognise the established disciplines within information governance, including IT governance, master data management, information risk, and information security, which are crucial for effective corporate governance.
Paul emphasised the importance of recognising the equal value and roles of various governance frameworks, such as IT governance, within the broader context of integrated information governance. He then referenced principles from the Information Governance Body of Knowledge (IGBOK) and acknowledged the similarities and differences with the Data Management Body of Knowledge (DMBOK), particularly regarding the emphasis on retention and disposition.
Effective governance requires the establishment of clear strategies, policies, and procedures. When each specialist team creates its own standards, duplication is likely. To avoid this redundancy, an integrated approach is crucial for aligning policies across information governance, records management, and risk management.
Data management serves as an overarching category that encompasses various disciplines, including master data management, data quality, and data security. Considering this breadth, it may be beneficial to adopt "data management" as a catch-all term, similar to the concept of knowledge management.
Figure 6 Integrated Information Governance
Balancing Practicality and Terminology in Interdisciplinary Environments
Paul emphasised the importance of using precise terminology in different disciplines, acknowledging that inconsistencies can lead to confusion. While he would advocate for strict adherence to terminology, Paul shared that he also recognises the necessity of flexibility based on the specific environment.
In certain contexts, terms such as "master data management," "data management," "IT governance," and "data governance" may be used interchangeably. Paul encouraged a pragmatic approach, suggesting that while it's essential to stand firm on critical terminology, there are times when one should adapt to what works best in a given situation.
Evolution and Challenges of Artificial Intelligence
Paul highlighted the importance of having a long-term roadmap, such as a 5- to 10-year plan, to align and guide change effectively, as progress tends to be gradual and incremental. Reflecting on his personal experience, he shared that developing a seven-year roadmap during his PhD was aimed at formalising his work to make it applicable in industry.
Paul shared his recent experiences delving into the field of artificial intelligence (AI), highlighting the contrast between his prior programming knowledge, which included languages like COBOL and RPG, and the newer technologies he has been exploring, such as Python. Having attended an in-depth AI course in Dubai, he noted the challenge of assimilating numerous new terms and concepts. Additionally, Paul emphasised that a comprehensive understanding of AI also requires knowledge of mathematics and data science, suggesting that the complexity of these subjects necessitates efforts to simplify them for better comprehension.
Figure 7 Positioning All the Pieces
Intersection of Knowledge Management, Information Governance, and AI
The effectiveness of knowledge management relies on having proper infrastructure to capture, store, protect, preserve, and share this information. Ultimately, the framework is divided into two key areas: information management and information governance, underscoring the necessity of systems to support knowledge management efforts.
Paul highlighted the significance of data governance in both structured and unstructured data environments, emphasising that a substantial majority of information, estimated at 80 to 90%, is unstructured. Traditional frameworks, such as DMBOK2, give only limited coverage to this unstructured volume. It is also essential to recognise the evolving relationship between data privacy, protection, and artificial intelligence (AI).
As AI continues to advance, the distinctions between structured and unstructured data are becoming increasingly blurred. This underlines the need for a comprehensive approach to data management that considers these interconnected aspects. Additionally, the structure within content needs to be made evident, since unindexed information has little utility unless the raw data can be accessed directly.
While Optical Character Recognition (OCR) allows for the extraction of data from unstructured documents, it primarily serves as user input into AI systems, rather than producing definitive AI-generated outputs. Understanding the interrelationship between unstructured documents and AI-generated data is increasingly critical. For instance, recent work with AI applications has demonstrated the effective interrogation of architectural diagrams, identifying discrepancies and compliance issues, highlighting that AI can operate successfully in both graphical and text-based environments.
Paul then highlighted the evolving nature of data extraction from various sources, including images, technical drawings, graphs, and photographs, emphasising that this process extends beyond just text. As visual AI continues to advance, the distinction between what constitutes a record and what does not may become increasingly ambiguous. Attention was drawn to the management of document versions, where typically only the final signed version is considered a record, while earlier iterations remain duplicates that require handling. This leads to the importance of data governance in effectively organising and controlling the multitude of documents and their versions within an organisation.
Data governance encompasses a comprehensive approach that includes various components of records management. This involves establishing records management policies, retention schedules, procedures, and file plans. Lastly, Paul mentioned that in the context of government, this also refers to schedules for records beyond just correspondence, highlighting the importance of classification schemes for information.
Understanding the Frameworks in Records Management
The framework for effective records management emphasises several key success factors: management commitment; well-defined policies, procedures, and controls; and robust change management to oversee programs and projects. It must align with legal, evidential, regulatory, and business requirements, reflecting the organisation's mandates and strategic objectives while encompassing all types of information across various media.
Paul shared that there is a notable disconnect in the records management field concerning the definition of records, as some professionals struggle to differentiate between traditional records and data stored in databases. This highlights the need for a comprehensive understanding of all record types, extending beyond HR and finance-related documents.
An attendee then asked a question on classification, to which Paul emphasised the importance of a structured approach to managing records, specifically referring to a business classification scheme or file plan in this context. This involves organising records, excluding correspondence, to facilitate easy searching, retrieval, and management.
Key elements include the application of metadata related to retention, permissions, and access controls, as well as the establishment of disposal authorities that govern approval for the destruction or transfer of records. Additionally, the term "archive" has a specific meaning in records management: in government practice, records are transferred to the National Archives after 20 years.
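As a rough illustration of how retention metadata and a disposal authority might hang together, the sketch below computes the earliest date on which a disposal action could be carried out. The class, field names, reference number, and five-year period are all hypothetical and are not taken from any real file plan or retention schedule.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FilePlanEntry:
    """One line of a hypothetical file plan (all field names are illustrative)."""
    reference: str           # classification reference, e.g. "HR/2/1"
    description: str
    retention_years: int     # how long to keep the record after it is closed
    disposal_action: str     # "destroy" or "transfer" (e.g. to an archive)
    disposal_authority: str  # identifier of the approval that permits disposal


def disposal_due(entry: FilePlanEntry, closed_on: date) -> date:
    """Return the earliest date on which the disposal action may be carried out."""
    target_year = closed_on.year + entry.retention_years
    try:
        return closed_on.replace(year=target_year)
    except ValueError:
        # closed_on was 29 February and the target year is not a leap year
        return closed_on.replace(year=target_year, day=28)


# Illustrative entry; the retention period is an assumption, not legal guidance.
leave_forms = FilePlanEntry("HR/2/1", "Leave applications", 5, "destroy", "DA-0001")
print(disposal_due(leave_forms, date(2020, 3, 31)))  # 2025-03-31
```

In practice the date alone would not trigger destruction or transfer; the disposal authority still has to approve the action.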
In the field of IT, data management involves identifying when to move less frequently accessed information to secondary storage, a process often referred to as archiving. This differs from the records management definition of archives. This process may involve using slower disks or other storage locations and encompasses concepts such as caching and distributed repositories.
Effective data management also requires addressing decongestion, which entails either transferring data to an archive or securely destroying it once it no longer holds value. This is particularly challenging with structured data, especially concerning privacy issues, where parts of a dataset may need to be deleted while retaining others. Additionally, managing emails as records presents unique challenges that records managers have traditionally struggled with. Vital records, which are essential for an organisation's operation and difficult to recreate if lost, necessitate extra diligence and protection.
In the context of identifying vital records necessary for long-term preservation, Paul noted that while many associate disaster preparedness with backups, it is important to recognise that vital records are those without which an organisation cannot function. Examples include medical patient records and student records. He then highlighted the difficulty in pinpointing specific examples, as many may claim that all records are vital; however, not all records hold the same level of criticality, as some can be recreated or sourced from other parties.
Figure 8 Records Management Framework
Long-Term Preservation and Records Management
Long-term preservation of records, whether in digital formats such as Word or PDF or on paper, requires proactive measures to ensure their integrity and accessibility over time. The two primary methods involve converting electronic documents to PDF/A, a format specifically designed for archiving, and conducting regular industry scans to identify technologies and formats at risk. This is crucial for effective records management, especially for those records that must be retained for extended periods.
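A regular scan for at-risk formats could start with something as simple as the sketch below, which groups an inventory of filenames by any extension that is not on an accepted preservation list. The accepted list and the sample inventory are assumptions made for illustration. A file extension also cannot distinguish ordinary PDF from PDF/A, so a real check would need a dedicated validator.

```python
from pathlib import PurePath

# Assumption: an organisation-specific list of formats accepted for long-term
# preservation. The entries here are illustrative only.
PRESERVATION_FORMATS = {".pdf", ".tif", ".tiff", ".txt", ".xml"}


def formats_at_risk(filenames):
    """Group filenames by extension for every extension not on the accepted list."""
    at_risk = {}
    for name in filenames:
        ext = PurePath(name).suffix.lower()
        if ext not in PRESERVATION_FORMATS:
            at_risk.setdefault(ext, []).append(name)
    return at_risk


# Hypothetical inventory of stored documents.
inventory = ["minutes_2009.doc", "contract.PDF", "plan.dwg", "notes.doc"]
print(formats_at_risk(inventory))
# {'.doc': ['minutes_2009.doc', 'notes.doc'], '.dwg': ['plan.dwg']}
```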
Paul noted that records managers are often deeply passionate about their records, sometimes viewing them as akin to children, which contrasts with the more impersonal perspective commonly held by data professionals. Ultimately, these records and data belong to the organisation, highlighting the importance of a collective approach to record preservation.
The importance of a tangible connection to records was discussed with Paul, who compared records management to librarianship, where individuals are drawn to handle and preserve physical materials. He then emphasised that reliable and trustworthy archives cannot exist without dependable records, which in turn depend on the integrity of the processes that generate them. Records managers therefore need to focus on ensuring the reliability of documents and the effectiveness of business processes, creating a solid foundation for trustworthy archival practices.
Paul then highlighted a disconnect between data management and records management, emphasising that the National Archives focuses on managing provided materials, which is inherently flawed. He pointed out that data acquires context only when it is integrated into business processes, suggesting that metadata and related information are deeply integrated into their workflow. This process ultimately leads to the records team, indicating that the current handling of records and metadata does not occur in real time within the electronic environment, where such data should ideally be generated.
The process of information management involves several critical steps: understanding the origins and creation of information, including who creates it, when, how, and what system is utilised, as well as the necessary metadata to capture. Once this information is generated, it must be officially declared a record and placed into a formal repository for either archiving or destruction.
Key considerations include identifying the sources of information entering the organisation, the formats in which it is presented, any conversions that occur, and establishing responsibility for classification against a file plan or business classification scheme. This discussion is particularly significant from a privacy perspective, as it encompasses various formats, including paper documentation.
Records managers often insist on retaining the physical paper as the official record, even after digitisation through scanning, which creates images along with metadata and incorporates electronic documents from emails. According to the POPIA framework, data gathered for a specific purpose must be destroyed once that purpose is fulfilled, necessitating the identification and cohesive management of all related artefacts across different locations. This complexity emphasises the importance of understanding that managing data goes beyond mere records retention, raising crucial considerations for effective information governance.
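The need to find every related artefact, wherever it sits, can be sketched as a lookup over an inventory of holdings. Everything below is hypothetical: the field names, the sample data, and the simplification that a fulfilled purpose is the only test. A real decision would also have to consider any other legal retention requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artefact:
    """A copy of personal information held somewhere (illustrative fields)."""
    subject_id: str  # the person the information relates to
    purpose: str     # the purpose the information was collected for
    location: str    # where this copy lives
    form: str        # "paper", "scanned image", "email", ...


def artefacts_to_destroy(artefacts, fulfilled_purposes):
    """Map each location to the artefacts whose collection purpose is fulfilled."""
    by_location = {}
    for artefact in artefacts:
        if artefact.purpose in fulfilled_purposes:
            by_location.setdefault(artefact.location, []).append(artefact)
    return by_location


# Hypothetical holdings for one data subject across three locations.
holdings = [
    Artefact("S-001", "job application", "HR filing cabinet", "paper"),
    Artefact("S-001", "job application", "scan repository", "scanned image"),
    Artefact("S-001", "payroll", "ERP", "database row"),
]

for location, items in artefacts_to_destroy(holdings, {"job application"}).items():
    print(location, [a.form for a in items])
```

The point of the grouping is that the paper original and its scanned image are handled together, rather than one being destroyed while the other lingers.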
Figure 9 Interrelationships
Figure 10 Information Processes
Figure 11 Forms of Information POPIA Considerations
Classification in Records Management and Information Security
Paul highlighted the need for adherence to specific terminology while also emphasising the importance of flexibility based on context, particularly in fields such as "master data management," "data management," and "governance." He advocated for a pragmatic approach that allows for adaptation to suit particular situations.
As a member of the International Council on Archives (ICA), Paul shared that he is part of an expert group focused on harmonising terminology between the DMBOK and IGBOK, specifically in the context of information security. He added that the aim is to develop a common understanding that enables clear communication among different professionals, particularly in the fields of records management and related areas.
The challenge of holistically classifying data, both structured and unstructured, is crucial for effective risk management and privacy governance. Currently, discrepancies arise between data classifications used by data governance teams and those employed by data protection officers (DPOs), which complicates retrieval during data subject requests.
To streamline this process, it is essential to align and converge the underlying metadata across systems to ensure consistent labelling of the same data throughout its lifecycle. This alignment will enhance business processes by providing uniformity and clarity at every stage of data management.
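One way to picture this convergence is a crosswalk that maps each team's own label onto a single shared label, so that the same data is tagged consistently wherever it travels. The team names, labels, and mappings below are invented for illustration and do not come from any standard or from the webinar.

```python
# Hypothetical crosswalk: (team, team-specific label) -> shared label.
CROSSWALK = {
    ("data_governance", "Restricted"): "sensitive",
    ("data_governance", "Internal"): "internal",
    ("privacy", "Special personal information"): "sensitive",
    ("privacy", "Personal information"): "personal",
}


def shared_label(team, label):
    """Translate a team-specific classification label into the shared vocabulary."""
    try:
        return CROSSWALK[(team, label)]
    except KeyError:
        # Fail loudly so that unmapped labels are noticed and added deliberately.
        raise KeyError(f"No crosswalk entry for {team!r} label {label!r}") from None


print(shared_label("data_governance", "Restricted"))       # sensitive
print(shared_label("privacy", "Special personal information"))  # sensitive
```

Raising an error for an unmapped label is a deliberate choice in this sketch: a silent default would hide exactly the misalignment the crosswalk is meant to expose.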
Figure 12 Classification-Centric Approach
AI, Master Data Systems, and File Plans in Government Environments
Paul shared that in 2006, while working with a large bank, he unknowingly developed a concept for a master metadata system, which he now refers to as an extended file plan. This approach is especially useful in government contexts, where the traditional file plan serves as a starting point but often proves to be limited. Paul noted that his extended file plan functions like a highly enhanced spreadsheet that incorporates a comprehensive inventory of information types and classification areas, including permissions, access controls, storage systems, and sensitivity levels (such as secret or top secret). Additionally, it categorises the risks associated with different data types, and Paul offered to share an example for further clarity.
The process involves utilising a spreadsheet to systematically assess key factors related to risk, security permissions, privacy, and records retention. This approach serves as a business classification scheme, often referred to as a master file plan or extended file plan in the government context, enabling a structured way to categorise and manage information effectively.
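To make the spreadsheet idea concrete, here is a minimal sketch of an extended file plan held as CSV and queried in code. The column names are assumptions drawn from the attributes listed above, and the two rows are invented; this is not Paul's actual layout.

```python
import csv
import io

# Illustrative rows only; columns and values are assumptions, not a real plan.
EXTENDED_FILE_PLAN_CSV = """\
reference,information_type,owner,storage_system,sensitivity,contains_personal_info,retention_years,risk
FIN/1,Supplier invoices,Finance,ERP,internal,no,5,medium
HR/2,Employee medical certificates,HR,Document repository,confidential,yes,3,high
"""


def load_plan(text):
    """Parse the CSV text into a dict keyed by classification reference."""
    return {row["reference"]: row for row in csv.DictReader(io.StringIO(text))}


def references_with_personal_info(plan):
    """List the references whose information type contains personal information."""
    return sorted(
        ref for ref, row in plan.items() if row["contains_personal_info"] == "yes"
    )


plan = load_plan(EXTENDED_FILE_PLAN_CSV)
print(references_with_personal_info(plan))  # ['HR/2']
print(plan["FIN/1"]["storage_system"])      # ERP
```

Keeping risk, security, privacy, and retention in one row per information type is what lets a single lookup answer questions that would otherwise need several separate registers.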
Addressing Churn Rates in Different Departments
An attendee then shared an experience with the marketing and sales departments of an accounting firm that were struggling to define and report on churn, resulting in a conflict over a single churn metric on the CEO's dashboard. Both departments had different definitions and formulas for churn due to their distinct roles within the organisation.
To resolve the conflict, the issue was brought to the CEO's attention and explained using Porter's value chain. This analysis highlighted that each department represents a distinct segment with unique churn metrics that should be tracked separately. This approach clarified the need for two distinct fields in their system, resulting in an agreement to change the terminology: "sales churn" and "marketing churn" would be used to avoid confusion going forward.
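The outcome can be sketched as two separately named fields instead of one contested number. The write-up does not give either department's formula, so the definitions below (customers lost for sales, subscribers lost for marketing, each divided by the count at the start of the period) and all the figures are assumptions for illustration only.

```python
def churn_rate(lost, at_start):
    """Fraction of a population lost during a period."""
    if at_start <= 0:
        raise ValueError("at_start must be positive")
    return lost / at_start


# Assumed definitions and invented figures:
# sales churn counts paying customers who left,
# marketing churn counts subscribers who opted out.
dashboard = {
    "sales_churn": churn_rate(lost=12, at_start=300),
    "marketing_churn": churn_rate(lost=45, at_start=1500),
}

for name, value in dashboard.items():
    print(f"{name}: {value:.1%}")
# sales_churn: 4.0%
# marketing_churn: 3.0%
```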
Paul then addressed the issue of inconsistent terminology. He noted that it was essential to establish a data dictionary or glossary of terms to clarify the meanings and contexts of various classifications, particularly in the areas of security and privacy. This approach will help identify how terms are used and prevent the conflation of similar concepts. Additionally, by focusing on specific classifications rather than generalising, we can ensure more accurate communication and understanding in various areas.
Figure 13 Sensitive Misalignment
Data Management and Organisational Alignment
The often-neglected topic of organisational efficiency was highlighted. Paul emphasised the importance of aligning diverse perspectives towards achieving business objectives, rather than becoming overly specialised in individual roles. It was noted that organisations exist to fulfil specific goals, and thus, a collaborative approach is crucial for achieving these objectives.
The challenge lies in distilling complex business knowledge into a clear, consistent data format that maintains integrity throughout the data lineage and aggregation processes. This requires both attention to detail and an understanding of the broader organisational context to ensure effective reporting and alignment with business outcomes.
Paul then highlighted the importance of properly categorising and aligning data within an organisation, especially after mergers and acquisitions that introduce new cultures, terminologies, and data definitions. While it's necessary to maintain distinctions when data represents fundamentally different concepts, alignment is crucial in areas where similarities exist, for example to avoid the redundancy of multiple versions of the same data that do not serve the business effectively. Ultimately, the focus should be on the organisation's objectives, understanding that effective data management is driven by the need to support business goals and mandates.
Figure 14 "Must Haves"
Figure 15 Questions