EconData for Data Managers

Executive Summary

This webinar explains how professionals in various fields can understand and make effective use of economic data. The Statistical Data and Metadata Exchange (SDMX) information model has emerged as an international standard for data management and definitions in economics, offering benefits such as definitional consistency, automated data cataloguing, and simplified access to public domain data. Data democratisation has become increasingly important, and South Africa presents particular challenges in this regard. EconData is an economic data provider that centralises access to data flows and to different versions of data sets, and offers data as a service through the SDMX data model. Metadata and the registry structure are central components of the SDMX data model and support effective analysis and reporting. By using these features, professionals can gain insight from economic data and improve decision-making.

Webinar Detail

Title: EconData for Data Managers

Date: 02 June 2022

Presenters: Daan Steenkamp and Byron Botha

Meetup Group: Data Managers

Write-up Author: Howard Diesel

Contents

The Economic Value Proposition of EconData

Importance of Understanding Economic Data

Importance of Data Management and Definitions in Economics

Features and Benefits of the SDMX Information Model

Benefits of Definitional Consistency and Automation in Data Cataloguing

Benefits of the SDMX Information Model for Reporting and Publishing

Data Democratisation and Challenges in South Africa

EconData: Centralising and Simplifying Access to Public Domain Data

Overview of Data Access and User Interaction

Analysis of Different Versions of Data Sets and Data Flow

Data Providers and Registry Structure in a High-Level Overview

Key Value Proposition of EconData and Its Benefits

The Benefits and Features of Data as a Service and the SDMX Data Model

Introducing EconData as a Data Provider for Balance Sheets

SDMX and Metadata in Data Sets

The Economic Value Proposition of EconData

EconData is a centralised platform for managing economic data from various sources. It is built on the widely adopted SDMX information model and is available as open-source software. By centralising internal and external economic data, EconData improves access to South African public domain data, making it discoverable and easy to use. The platform also provides valuable data management and governance attributes, enabling automation and updates. EconData is designed to ensure definitional consistency, adhere to SDMX standards, and promote data democratisation. Decision-making becomes more efficient and effective with easy access to the right information.

Figure 1 Value propositions for different personas

Figure 2 EconData Value Proposition

Figure 3 Essential Concepts

Importance of Understanding Economic Data

The management of high volumes of data from various sources can be challenging. In regulated firms and industries, it is crucial to identify inconsistencies between data and address them. It is also important to know where specific data is stored and how it is used. As reporting requirements for economic data increase, it becomes necessary to understand different high-level economic concepts, which may have different definitions and relationships. Although authoritative sources exist for some economic concepts, supplementary estimates from other sources may be available. Important aspects of economic data include frequency, transformations, and security classifications. Economic time series often undergo revisions, so storing different data vintages is essential. Finally, assessing the quality of economic time series is critical for data management systems.
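The point about revisions and data vintages can be illustrated with a minimal sketch. It assumes a simple in-memory store keyed by publication date; the class name `VintageStore`, its methods, and the figures are hypothetical and invented for illustration, not part of EconData or SDMX.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class VintageStore:
    """Keeps every published version (vintage) of one time series."""

    # Maps a vintage's publication date to {period: value}.
    vintages: dict = field(default_factory=dict)

    def publish(self, published: date, observations: dict) -> None:
        # Store a copy so later edits by the caller cannot alter history.
        self.vintages[published] = dict(observations)

    def as_of(self, when: date) -> dict:
        """Return the series as it was known on a given date."""
        known = [d for d in self.vintages if d <= when]
        if not known:
            return {}
        return self.vintages[max(known)]

    def revisions(self, period: str) -> list:
        """List how the value for one period changed across vintages."""
        return [
            (d, obs[period])
            for d, obs in sorted(self.vintages.items())
            if period in obs
        ]


# Hypothetical quarterly growth figures, invented for illustration only.
store = VintageStore()
store.publish(date(2022, 3, 8), {"2021-Q4": 1.2})
store.publish(date(2022, 6, 7), {"2021-Q4": 1.4, "2022-Q1": 1.9})

first_view = store.as_of(date(2022, 4, 1))   # what an analyst saw in April
history = store.revisions("2021-Q4")         # how 2021-Q4 was revised
```

Keeping each vintage intact, rather than overwriting values in place, is what makes it possible to reproduce an analysis exactly as it looked on an earlier date.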

Figure 4 Definitional consistency of economic concepts

Importance of Data Management and Definitions in Economics

This part of the webinar discusses how balance sheet definitions differ across periods and their corresponding versions, and how factors of production are captured by separate concepts and measures. Daan Steenkamp highlights the limitations of official measures of economic activity, such as GDP, and the importance of drawing on other, higher-frequency data sources to build a more complete picture. He also emphasises that data management practices and definitions differ across economies and over time, and that understanding these context-specific differences is essential for effective data management and economic analysis.

Features and Benefits of the SDMX Information Model

Statistical agencies, institutions, and data providers like Eurostat, central banks, multilateral institutions, and the World Bank widely use the SDMX information model. It provides a standardised structure for data and metadata, making it easier for machines to read and update databases. The SDMX model allows the combination of information from different data sets within and across organisations, simplifying data integration. It also enables transparency and governance by documenting data construction and management processes. The SDMX data registry provides an efficient way to store metadata artefacts, allowing for easy mapping and querying of data from different sources. Additionally, detailed metadata can be attached to data, including information on content, methodology, and data quality. The metadata structure definition helps manage and present concepts, dimensions, and code lists, providing valuable information for data users. Finally, the SDMX information model enables the classification of data for reporting and publication purposes and the construction of taxonomies for different groups of data flows and metadata.
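As a rough illustration of how a standardised structure makes data machine-checkable, the sketch below models a data structure as an ordered set of dimensions, each tied to a code list, and builds a dot-separated series key from validated values. The class names and the dimension and code identifiers are assumptions chosen for the example, not an actual EconData structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    concept_id: str        # identifier of the concept, e.g. "FREQ"
    codelist: frozenset    # the enumerated values this dimension may take


@dataclass(frozen=True)
class DataStructure:
    dimensions: tuple      # ordered; the order defines the series key

    def series_key(self, values: dict) -> str:
        """Validate dimension values and join them into a dot-separated key."""
        parts = []
        for dim in self.dimensions:
            if dim.concept_id not in values:
                raise ValueError(f"missing dimension {dim.concept_id}")
            code = values[dim.concept_id]
            if code not in dim.codelist:
                raise ValueError(
                    f"{code!r} is not in the code list for {dim.concept_id}"
                )
            parts.append(code)
        return ".".join(parts)


# Hypothetical structure with three dimensions.
dsd = DataStructure(
    dimensions=(
        Dimension("FREQ", frozenset({"A", "Q", "M"})),
        Dimension("REF_AREA", frozenset({"ZA"})),
        Dimension("INDICATOR", frozenset({"GDP", "CPI"})),
    )
)

key = dsd.series_key({"FREQ": "Q", "REF_AREA": "ZA", "INDICATOR": "GDP"})
```

Because every value must come from a code list, a mistyped or undefined code is rejected when the data is loaded rather than discovered later during analysis.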

Figure 5 Why SDMX (Statistical Data and Metadata Exchange)?

Figure 6 SDMX in practice

Figure 7 SDMX metadata

Benefits of Definitional Consistency and Automation in Data Cataloguing

Definitional consistency and mapping to concept schemes make it possible to automate data cataloguing. This removes the need for manual intervention and helps keep data entries consistent and accurate. By defining concepts and metadata structures, users can streamline the retrieval and analysis of data. Sharing metadata structures between organisations promotes collaboration and enhances global consistency. The upcoming version, SDMX 3.0, extends mapping capabilities and introduces features for new data types, such as geolocation data, and for data transformations. Additionally, integrating SDMX namespace references in value taxonomies allows for labelling the value of time series.
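The idea of cataloguing by mapping to a shared concept scheme can be sketched as follows. The provider names, labels, and concept identifiers are hypothetical; a real registry would hold these mappings as maintained structures rather than as a Python dictionary.

```python
# Hypothetical mapping from provider-specific labels to shared concept IDs.
CONCEPT_MAP = {
    "provider_a": {"Gross domestic product": "GDP", "Consumer prices": "CPI"},
    "provider_b": {"GDP at market prices": "GDP", "Headline CPI": "CPI"},
}


def catalogue(provider: str, labels: list) -> dict:
    """Map a provider's series labels to shared concept IDs.

    Unmapped labels raise an error instead of being guessed, so gaps in
    the mapping surface immediately rather than polluting the catalogue.
    """
    mapping = CONCEPT_MAP[provider]
    unmapped = [label for label in labels if label not in mapping]
    if unmapped:
        raise KeyError(f"no concept mapping for: {unmapped}")
    return {label: mapping[label] for label in labels}


a = catalogue("provider_a", ["Gross domestic product"])
b = catalogue("provider_b", ["GDP at market prices"])

# Two differently labelled series resolve to the same shared concept.
same_concept = set(a.values()) == set(b.values())
```

Once the mapping exists, new releases from either provider can be catalogued without manual intervention, which is the automation benefit described above.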

Benefits of the SDMX Information Model for Reporting and Publishing

The SDMX information model is a powerful tool for controlling reporting and publishing. It defines what data is allowable and allows specific users to be given access to specific subsets of the data. It also identifies who is responsible for providing the data and allows links to specific data provision agreements. This supports the governance process around data management and enables responsible data sharing.

The SDMX model also constrains the scope of data or metadata that can be supplied to specific users while enabling organisation-wide access. In contrast, legacy data management approaches struggle to achieve such controlled and responsible data sharing.

A central bank's experience highlights the challenge of controlling access to data before it is sent to external parties. However, with the SDMX model, different data flows can be constrained in different ways and published at different times to allow for differentiated access. The SDMX information model is a powerful and flexible solution for effective data management and sharing.
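Differentiated access of this kind might be sketched as below, assuming each data flow carries an audience and a release time. The `Dataflow` fields, role names, flow identifiers, and dates are hypothetical; SDMX itself expresses such rules through constraints and provision agreements rather than through this simplified structure.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Dataflow:
    flow_id: str
    audience: frozenset     # roles allowed to read this flow
    release_at: datetime    # the flow is hidden before this moment


def visible_flows(flows, role: str, now: datetime) -> list:
    """Return the IDs of the flows a role may read at a given time."""
    return [
        f.flow_id
        for f in flows
        if role in f.audience and now >= f.release_at
    ]


# Hypothetical flows over the same underlying data, released at different times.
flows = [
    Dataflow("BALANCE_SHEET_INTERNAL", frozenset({"analyst"}),
             datetime(2022, 6, 1)),
    Dataflow("BALANCE_SHEET_PUBLIC", frozenset({"analyst", "public"}),
             datetime(2022, 6, 15)),
]

before_release = visible_flows(flows, "public", datetime(2022, 6, 2))
after_release = visible_flows(flows, "public", datetime(2022, 6, 16))
```

The same data can therefore be exposed to internal analysts first and to external parties later, without maintaining separate copies.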

Figure 8 Data flows for controlling reporting & publishing

Data Democratisation and Challenges in South Africa

Data democratisation is a global movement that seeks to make data more accessible to people outside of specialised fields. One important tool for achieving this is SDMX (Statistical Data and Metadata Exchange), which provides a standardised format for data and metadata, lets users interact directly with the data and automate processes, and avoids tool lock-in. In South Africa, however, different providers publish public domain data in varying formats, which makes it hard to automate access to statistics. Loading data from individual spreadsheets is also slow and inconsistent, because the concepts in each spreadsheet sit within a complex hierarchy.

Figure 9 Data Democratisation

Figure 10 EconData structures messy official statistics

EconData: Centralising and Simplifying Access to Public Domain Data

EconData is a reliable source of public domain data in South Africa, providing easy access to time series of individual concepts at the industry level. Its centralised data definitions ensure definitional consistency and enable concept comparison. The data registry includes detailed metadata for each time series, making it easy to download the underlying data in the right format. Additionally, EconData applies quality checks to loaded data to ensure its validity. The SDMX structure used by EconData provides a codified process for data management. Overall, EconData simplifies accessing and downloading time series data, which would otherwise appear static and difficult to use.

Figure 11 EconData metadata

Figure 12 EconData Registry

Overview of Data Access and User Interaction

The value of data as a service lies in giving users a range of tools for accessing data seamlessly, so that it can be adapted to a variety of South African economic processes. EconData provides an API through which users can retrieve data and update existing processes programmatically, and users can work with the data in any tool compatible with SDMX-based data. Searching for time series with similar economic concepts is not currently available in EconData, but the registry lets users explore category schemes and statistical subject matter domains by clicking on them. The demonstration gives concrete examples of user interaction and data discovery using the registry. It describes the Basel Regulatory Reporting category scheme, which primarily applies to banking regulation, and shows categories within the scheme, such as the BA series. The linked structures are machine-readable and discoverable programmatically.
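To give a feel for programmatic access, the sketch below builds a data query URL in the style of the SDMX 2.1 REST convention, where a flow reference of the form agency,flow,version is followed by a dot-separated series key. The base URL, agency, flow, and codes are placeholders, and the helper names are hypothetical; the actual EconData API may use different endpoints or parameters.

```python
from urllib.parse import urlencode

# Placeholder base URL (an assumption); substitute the real service endpoint.
BASE_URL = "https://example.org/sdmx"


def _key_part(part):
    """Render one dimension position of the series key."""
    if part is None:                      # wildcard: leave the position empty
        return ""
    if isinstance(part, (list, tuple)):   # several codes: join with "+"
        return "+".join(part)
    return part


def data_url(agency, flow_id, version, key, **params):
    """Build a data query URL in the SDMX 2.1 REST style."""
    flow_ref = f"{agency},{flow_id},{version}"
    key_str = ".".join(_key_part(p) for p in key)
    url = f"{BASE_URL}/data/{flow_ref}/{key_str}"
    if params:
        url += "?" + urlencode(params)
    return url


# Quarterly data, any value in the second dimension, two indicators.
url = data_url(
    "AGENCY", "FLOW", "1.0",
    ["Q", None, ["GDP", "CPI"]],
    startPeriod="2020",
)
```

Because the query is just a URL, the same request can be issued from a script, a spreadsheet add-in, or any other SDMX-aware tool, which is what helps avoid tool lock-in.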

Figure 13 Users can use a variety of tools to access data

Figure 14 EconData Registry Category Schemes

Analysis of Different Versions of Data Sets and Data Flow

Byron Botha walks through the different versions of data sets available, focusing on BA 100 version 1.2. The data flow view shows various information, with the related concepts listed on the left-hand side. Each time series is made up of dimensions, namely bank aggregation, line, and time period, along with additional attributes, and the enumerated possible values are known as code lists. The South African Reserve Bank provides the data set, and the code list restrictions vary depending on the concept. The BA 100 only includes the total aggregation, without individual institutions, and the correct version, BA 100 version 1, has restrictions applied to it. Reporting constraints are set up to enforce these restrictions, and each restriction cross-references the data flow. Note that the analysis may be confusing if the data is unavailable.
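The way a reporting constraint narrows a code list for one data flow can be sketched as follows. The dimension names, codes, and flow identifiers are hypothetical stand-ins; they are not the actual BA 100 structures held in the EconData registry.

```python
# Hypothetical code lists: the full set of values each dimension may take.
CODELISTS = {
    "BANK_AGGREGATION": {"TOTAL", "BANK_A", "BANK_B"},
    "LINE": {"L001", "L002"},
}

# Hypothetical constraint: one flow only allows the total aggregation.
CONSTRAINTS = {
    "FLOW_PUBLIC": {"BANK_AGGREGATION": {"TOTAL"}},
}


def allowed_codes(flow_id: str, dimension: str) -> set:
    """Return the codes a flow accepts for a dimension."""
    full = CODELISTS[dimension]
    restricted = CONSTRAINTS.get(flow_id, {}).get(dimension)
    # Without a constraint the whole code list applies.
    return set(full) if restricted is None else full & restricted


def violations(flow_id: str, key: dict) -> list:
    """List the dimension values that break the flow's constraint."""
    problems = []
    for dimension, code in key.items():
        if code not in allowed_codes(flow_id, dimension):
            problems.append(f"{dimension}={code} not allowed in {flow_id}")
    return problems


ok = violations("FLOW_PUBLIC", {"BANK_AGGREGATION": "TOTAL", "LINE": "L001"})
blocked = violations("FLOW_PUBLIC", {"BANK_AGGREGATION": "BANK_A", "LINE": "L001"})
```

The constraint references the flow rather than changing the code list itself, so the same code list can serve another flow that does permit institution-level data.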

Figure 15 Analysis of Different Versions of Data Sets and Data Flow

Figure 16 Analysis of Different Versions of Data Sets and Data Flow continued

Figure 17 Analysis of Different Versions of Data Sets and Data Flow continued

Data Providers and Registry Structure in a High-Level Overview

This section gives a high-level overview of the system's data flows and data provider scheme, which help ensure the accuracy and integrity of the data. The scheme lists predefined, cross-referenced allowable data providers, and all data is validated against the schema and registry. The system can restrict views and apply various data restrictions, and a category screen offers an organised view of data flows, their data providers, and the underlying data structure definitions. The information model and standard offer flexibility and may potentially align with FIBO (Financial Industry Business Ontology), while the registry includes additional complexities and features. Overall, the data flow and provider scheme make the system a reliable source of information.

Figure 18 Data Providers and Registry Structure in a High-Level Overview

Figure 19 Data discovery in the SDMX IM

Key Value Proposition of EconData and Its Benefits

The EconData management system offers centralised economic data with consistent definitions, accessible both on-premise and online. Users can view available data through the registry and interact with it using various tools. By adopting an SDMX-based approach, users gain data management and governance benefits, along with user-level controls for specific data sets. The system also offers interoperability and makes data available in different formats. Centralising economic data provides a single source of truth, making it easier to trace processes and ensure governance. The system can be accessed from anywhere, which addresses a challenge that large institutions face, especially in today's post-COVID world.

Figure 20 EconData use case for Statistical Data Manager

The Benefits and Features of Data as a Service and the SDMX Data Model

Daan highlights the benefits of access to advanced technologies and of collaboration between institutions on difficult challenges. Responsible data use and a shared glossary of metadata and concepts enable seamless collaboration and governance. Decision-making becomes easier when users can be sure they have the latest data from specific sources: the platform supports scheduled updates based on data suppliers' publication schedules. Data is handled securely through role-based access governance and data management authority. Data creators can push data and perform quality checks, and specific provision agreements support governance. The SDMX data model includes managed governance and process management features. The platform offers usage statistics at an individual level, enabling analytics that demonstrate the value of specific series and of the overall system. The standard provides both implicit and explicitly enabled quality checks for data accuracy.

Introducing EconData as a Data Provider for Balance Sheets

During the presentation, Howard expressed gratitude for the insights provided and raised the possibility of EconData becoming a data provider for the user's environment. EconData can provide valuable insights by comparing customer data to balance sheets within the industry. Daan emphasised the importance of trustworthy public data, such as banking partner data, which can serve as a baseline for comparison. Collaborative opportunities were mentioned, and the speaker plans to reach out for further discussion. South Africa has extensive bank balance sheet information available through the BA 900 regulatory form, which EconData compiles and makes easily accessible. Daan highlighted the convenience of having this data at one's fingertips, saving significant time and effort. Lastly, Tobias mentioned the important use case of using EconData to enrich internal data with metadata, which Byron confirmed as feasible.

SDMX and Metadata in Data Sets

SDMX, an ISO standard, is a flexible information model and registry that enhances structured statistical data with metadata capabilities. This allows the model and registry to be applied to a variety of data implementations. SDMX also supports data transformation together with metadata and process information, making it a strong option among open metadata and data catalogue environments. Although there are competing standards, SDMX's features make it a valuable resource for managing and enriching data.

If you would like to join the discussion, please visit our community platform, the Data Professional Expedition.

Additionally, if you would like to be a guest speaker on a future webinar, kindly contact Debbie (social@modelwaresystems.com)

Don’t forget to join our LinkedIn and Meetup data communities so that you do not miss out!