Data Warehousing, BI, Big Data & Data Science for Data Citizens

Executive Summary

This webinar highlights the critical role of Data Literacy in organisations, emphasising the importance of effective communication and active listening in data processing. Howard Diesel explores the collaboration between business stakeholders and data professionals in Data Modelling, while addressing the complexities of data storytelling and the ethical considerations involved in Data Management.

The future of business applications and digital decision-making is examined along with the challenges of integrating AI in these contexts. Additionally, Howard takes time to examine the significant impact of Data Governance and quality on analytical outcomes. The webinar underscores the importance of developing competency frameworks in Data Management and assesses the business benefits and return on investment associated with AI initiatives.

Webinar Details:

Title: Data Warehousing, BI, Big Data and Data Science for Data Citizens

Date: 21 November 2024

Presenter: Howard Diesel

Meetup Group: African Data Management Community

Write-up Author: Howard Diesel

Contents

Data Literacy in Organisations

Data Literacy and Data Management

Data Literacy and Its Importance in Organisations

Data Modelling with Business People

Communication and Active Listening in Data Processing

Data Management and Data Analysis

Data Management and Communication Skills

The Art and Challenges of Data Storytelling

Effective Communication of Data Management and Ethical Considerations

Future of Business Applications and Digital Decision-Making

The Challenges of Applying AI in Business Applications

Impact of Data Governance and Data Quality on Analytical Outcomes

Data Governance and Data Engineering

Data Management and Preparation Strategies

Data Engineering and Decision Science

Implementing Data-Driven Decision Making in Business

Competency Frameworks in Data Management

Business Benefit and Return on Investment in AI

Data Literacy in Organisations

Howard Diesel opens the webinar by asking whether any attendees belong to an organisation that has Data Literacy programs in place for new employees. An attendee shares that her organisation has implemented an Enterprise Information Governance forum comprising senior managers, data stewards, and data owners to enhance Data Literacy across all departments. This initiative focuses on improving knowledge of Data Governance and privacy, particularly in a highly regulated environment. Monthly compulsory literacy training sessions, roadshows, and workshops on privacy and governance are offered to encourage participation and engagement. Employees are motivated to identify and report Data Quality issues, fostering a culture of awareness and continuous improvement in data handling practices.

Figure 1 Data & AI Competency Framework

Data Literacy and Data Management

A Data Literacy program encompasses various aspects of Data Management, such as Data Quality, privacy, and Governance. Howard shares that while "Data Literacy" is the umbrella term, it faces a perception problem: some business professionals feel the word "literacy" implies that they struggle to understand data visualisations. Examples like box plots do reveal gaps in understanding, but Data Literacy training should not focus only on visualisation skills; it should also cover broader Data Management skills.
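To illustrate what a box plot actually encodes, here is a minimal Python sketch that computes the numbers behind one. The function name and the sample figures are hypothetical, and the whisker rule used (1.5 times the interquartile range) is one common convention, not something specified in the webinar.

```python
import statistics

def box_plot_summary(values):
    """Return the numbers a typical box plot draws (1.5 x IQR whisker rule)."""
    data = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers extend to the most extreme values that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Hypothetical invoice-processing times in days; one value is unusually large.
summary = box_plot_summary([2, 3, 3, 4, 4, 5, 5, 6, 7, 30])
print(summary)
```

Reading the result aloud ("half of the invoices take between Q1 and Q3 days, and one took far longer") is the kind of interpretation skill the training is meant to build.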

Howard shares an instance where a business professional raised the concern that the emphasis on Data Literacy ignored the importance of business and financial literacy, and suggested that a true understanding of data is only possible when data professionals also grasp business language. This sentiment underscores the need for a holistic approach to data competency, which some refer to as "data abilities."

Data Literacy and Its Importance in Organisations

Data Literacy is increasingly recognised as essential for organisations. Howard notes that it is about understanding and working proficiently with data, which includes interpreting reports from business intelligence (BI), outputs from artificial intelligence (AI), and data visualisations. It also helps to bridge the gap between new generations of data-savvy individuals and those who may not have grown up with advanced data tools, fostering a culture of understanding and skill development.

Effective data culture requires assessing current competencies and addressing areas for improvement, such as Data Quality and Metadata. Organisations may utilise proficiency tests to identify skill levels among employees, enabling strategic allocation of resources to meet Data Strategy goals effectively. Overall, without a skilled workforce, even the best Data Strategies risk falling short.

Data Modelling with Business People

Effective communication about data and Data Management is crucial for engaging business stakeholders. It's important to summarise the significance of Data Modelling in simple terms, ideally in two to three sentences, to ensure that business professionals understand its importance. This involves distinguishing between various types of Data Modelling, such as dimensional, relational, and Data Vault modelling, and emphasising why acquiring these skills is beneficial for them. However, it seems that many have struggled to convince business colleagues of the value of learning Data Modelling, indicating a gap in communication or understanding that needs addressing.

Communication and Active Listening in Data Processing

Effective communication is crucial for understanding complex concepts, particularly in training and data discussions. If explaining a topic requires extensive detail, it often indicates a lack of understanding of the subject matter. Howard reflects that his own experiences have highlighted this. He gives the example of how difficult it can be to articulate the difference between terms like "data provenance" and "data lineage."

It's essential to refine communication skills to convey ideas clearly and succinctly, much like delivering an elevator pitch in 2-3 minutes. This includes writing about data in an understandable manner, reading comprehensively, and engaging in active listening to facilitate informed decision-making. Ultimately, fostering clarity and incorporating feedback can significantly enhance both personal and group learning outcomes.

Data Management and Data Analysis

Howard moves on to the key principles of Data Management that enhance effective data analysis. He does this by highlighting the importance of clear communication in conveying concepts such as structured versus unstructured data and their implications for data engineering. Emphasis is then placed on the ability of consultants and transformation agents to explain complex technologies, including large language models and generative AI, to business stakeholders. Attendees are encouraged to assess their proficiency in these areas and provide feedback through a structured assessment available via QR code.

Figure 2 Where to find the Data and AI Proficiency Assessment

Figure 3 'Proficiency Batch' Questions

Figure 4 'Data Management' Questions

Figure 5 'Machine Learning' Questions

Data Management and Communication Skills

The Proficiency Assessment emphasises the importance of assessing the technical capabilities needed to deliver the Data Strategy effectively. It highlights the need for Data Literacy across all levels of personnel, not just executives and functional managers, and this includes experienced data professionals.

The attendees then discuss their communication skills regarding Data Management topics and their comfort with emerging technologies like machine learning and generative AI. There is a shared understanding that continuous learning is essential, especially when navigating complex concepts such as data value realization and data ROI. Attendees are encouraged to develop the ability to interpret and communicate data insights effectively, which will enhance their overall proficiency in the field.

Figure 6 Data & AI Competency Framework

Figure 7 Key Personas

The Art and Challenges of Data Storytelling

Howard asks one of the attendees to highlight the challenges of making data visualisations understandable and impactful for a diverse audience. Dr. Daan Steenkamp emphasises the importance of industry knowledge in interpreting data and storytelling expertise in effectively communicating its significance. He notes that while dashboarding tools are improving, they do not eliminate the need for thoughtful storytelling strategies. Howard adds that effective communication in data storytelling draws parallels with movie scriptwriting, focusing on elements like general relevance, engaging openings, clear structure, relatable characters, conflict resolution, and emotional connections. Both discuss the necessity of interactivity in visualisations to encourage audience engagement and the importance of compelling narratives when addressing complex data sets, especially when proposing changes such as new Data Management initiatives.

Figure 8 Communicating Effectively

Effective Communication of Data Management and Ethical Considerations

Effective communication in analytics requires an understanding of key concepts such as Data Management, data engineering, business intelligence, data science, machine learning, and decision science. It is essential to clarify the distinctions among these areas and their relevance to the business to ensure all stakeholders are aligned. Additionally, responsible analytics emphasises the importance of ethics in data handling and the application of responsible AI.

Organisations must establish protocols to validate data usage and mitigate potential harm before implementation, rather than addressing issues retrospectively. This proactive approach is vital for fostering trust and ensuring the ethical use of data throughout the analytical process.

Figure 9 Communicating Effectively: Data Story Telling: General Relevance

Figure 10 Communicating Effectively

Future of Business Applications and Digital Decision-Making

An attendee mentions that analytical business applications are evolving to enhance the speed and efficiency of decision-making processes. A key development in this area is digital decision-making, which uses machine learning models or robotic process automation (RPA) to make decisions autonomously and so reduces the need for human intervention. This raises critical questions about which types of decisions can be automated and what quality standards are acceptable for these models, particularly concerning their rates of false positives and false negatives.
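As a rough illustration of how such quality standards might be checked, the Python sketch below derives the false positive and false negative rates from validation counts and compares them with agreed limits. The function names, the counts, and the thresholds are all hypothetical; acceptable limits would have to be set by the business for each type of decision.

```python
def error_rates(tp, fp, tn, fn):
    """Return (false_positive_rate, false_negative_rate) from confusion-matrix counts."""
    # False positive rate: share of actual negatives that the model flagged as positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # False negative rate: share of actual positives that the model missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr

def fit_for_automation(tp, fp, tn, fn, max_fpr, max_fnr):
    """True only if both error rates are within the limits agreed with the business."""
    fpr, fnr = error_rates(tp, fp, tn, fn)
    return fpr <= max_fpr and fnr <= max_fnr

# Hypothetical validation results for a model that makes a yes/no decision.
fpr, fnr = error_rates(tp=90, fp=5, tn=895, fn=10)
print(f"false positive rate: {fpr:.2%}, false negative rate: {fnr:.2%}")
print(fit_for_automation(90, 5, 895, 10, max_fpr=0.01, max_fnr=0.05))
```

In this made-up case the model misses one in ten true positives, so it would not pass a 5% limit on false negatives even though its false positive rate is low.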

The Challenges of Applying AI in Business Applications

Howard highlights the challenges and considerations surrounding the use of AI in business applications, particularly in cases where chatbots have been misused, such as spreading inappropriate content or providing non-compliant advice, leading to legal consequences. He then emphasises the potential of analytical approaches to streamline processes, such as loan applications, by leveraging available data and reducing processing time from weeks to minutes. Two key types of analytics are discussed: confirmatory analytics, which verifies expected outcomes, and exploratory analytics, which investigates customer needs for new product development.

Figure 11 Communicating Data Management

Impact of Data Governance and Data Quality on Analytical Outcomes

Data Governance and Data Quality play crucial roles in enhancing analytical outcomes, yet many people struggle to articulate their importance, particularly regarding Data Governance. While Data Quality is often more easily understood, the challenge lies in effectively communicating the benefits of Data Governance to business stakeholders. It is essential to explain how Data Governance can improve decision-making and add value to the business, addressing the common question of "What's in it for me?"

Howard shares that Data Governance is sometimes perceived as a hindrance by data scientists and engineers due to perceived delays in project timelines when implementing Data Quality rules. However, without these robust governance practices, the risk of working with unreliable data increases, ultimately compromising analytical results. To foster a stronger culture of Data Governance, it's important to embed these principles throughout the organisation and clearly articulate their benefits to all stakeholders.

Data Governance and Data Engineering

Howard highlights the importance of reliable Data Governance in enhancing analytical models created by data scientists. He emphasises the need for clear communication between Data Governance professionals and data scientists regarding the reliability of these models, and for establishing best practices rooted in authoritative frameworks such as ISO.

The discussion also covers the skills necessary for building an effective analytical team, particularly in Data Management and engineering. Data normalisation is acknowledged as a critical concept, linking Data Modelling to data cleaning and preparation, underscoring its relevance despite perceptions that it has fallen out of favour in practical applications. Overall, the dialogue stresses the continual improvement of practices through monitoring and adaptation within the evolving landscape of data analytics.
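The link between normalisation and data cleaning can be shown with a small Python sketch. It splits a wide, repeated record set into one customer table and one order table, and in doing so surfaces inconsistent customer names. The table layout, field names, and sample rows are hypothetical and chosen only for illustration.

```python
# Hypothetical denormalised extract: customer details are repeated on every order.
wide_rows = [
    {"order_id": 1, "customer_id": "C1", "customer_name": "Acme", "amount": 120.0},
    {"order_id": 2, "customer_id": "C1", "customer_name": "ACME Ltd", "amount": 80.0},
    {"order_id": 3, "customer_id": "C2", "customer_name": "Bolt", "amount": 45.5},
]

def normalise(rows):
    """Split wide rows into customers and orders, reporting conflicting names."""
    customers = {}   # customer_id -> customer_name, stored once
    orders = []      # orders keep only a reference to the customer
    conflicts = []   # (customer_id, first name seen, conflicting name)
    for row in rows:
        cid, name = row["customer_id"], row["customer_name"]
        if cid in customers and customers[cid] != name:
            conflicts.append((cid, customers[cid], name))
        customers.setdefault(cid, name)
        orders.append(
            {"order_id": row["order_id"], "customer_id": cid, "amount": row["amount"]}
        )
    return customers, orders, conflicts

customers, orders, conflicts = normalise(wide_rows)
print(customers)
print(conflicts)
```

Storing each customer once is the modelling step; the conflict list it produces is a cleaning task that was invisible while the name was repeated on every row.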

Figure 12 Remaining Questions of Communicating Data Management

Data Management and Preparation Strategies

Data preparation involves cleaning and shaping data with its purpose in mind, whether for feature analysis or another goal. It also involves selecting the appropriate data design method, such as dimensional modelling, anchor modelling, or a wide dataset in zero normal form, to facilitate effective data engineering.

Howard then touches on the challenge of communicating the significance of Data Management functions within organisations, especially since many lack a comprehensive understanding of their data landscape. He advocates a balanced approach that incorporates essential design principles without overwhelming stakeholders, ensuring that governance aligns with agile practices.

Data Engineering and Decision Science

In data engineering and decision science, effective decision-making must be based on data. Howard then covers topics such as normalisation, data warehousing, business intelligence, and data science, highlighting again the need for clear communication between data scientists and decision makers regarding model performance, including issues like overfitting.
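One way to explain overfitting to a decision maker is to compare how a model scores on the data it learned from with how it scores on data it has never seen. The Python sketch below is a toy illustration under stated assumptions: the data is synthetic, the noise level is arbitrary, and the "model" is a deliberately naive one that simply memorises its training examples.

```python
import random

def make_data(rng, n):
    """Synthetic points: the true label is x > 0.5, with about 20% of labels flipped as noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < 0.2:
            label = not label
        data.append((x, label))
    return data

def predict_memorised(train, x):
    """Memorising model: copy the label of the closest training point."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def accuracy(train, data):
    """Share of points in data whose label the memorising model predicts correctly."""
    hits = sum(predict_memorised(train, x) == label for x, label in data)
    return hits / len(data)

rng = random.Random(42)          # fixed seed so the run is repeatable
train = make_data(rng, 200)
test = make_data(rng, 200)

train_accuracy = accuracy(train, train)   # scored on data the model has seen
test_accuracy = accuracy(train, test)     # scored on new data
print(f"train: {train_accuracy:.0%}, test: {test_accuracy:.0%}")
```

The model looks perfect on its own training data because it has memorised the noise as well as the pattern. The lower score on new data is the figure a decision maker should be shown.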

In the process of data preparation, the significance of cleaning, loading, and programming data cannot be overstated. Appropriate database management systems and data storage solutions matter as well. Howard then addresses the role of business analysts in decision modelling and the governance of decision-making processes. Lastly, a critical aspect discussed is the "Human in the Loop" concept, which examines when human intervention is necessary in automated decisions and the balance between efficiency and oversight in data-driven decision making.
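A common way to put the Human in the Loop idea into practice is to automate only the decisions the model is confident about and refer the rest to a person. The Python sketch below assumes a model that outputs an approval probability, for example for a loan application. The function name and both thresholds are hypothetical and would need to be agreed as part of decision governance.

```python
def route_decision(approval_probability, auto_approve_at=0.90, auto_decline_at=0.10):
    """Route a scored case: automate the clear cases, refer the uncertain ones."""
    if not 0.0 <= approval_probability <= 1.0:
        raise ValueError("approval_probability must be between 0 and 1")
    if approval_probability >= auto_approve_at:
        return "auto-approve"
    if approval_probability <= auto_decline_at:
        return "auto-decline"
    # Anything between the two thresholds goes to a human reviewer.
    return "refer to human"

# Hypothetical model scores for three applications.
for score in (0.95, 0.50, 0.05):
    print(score, route_decision(score))
```

Moving the two thresholds closer together increases automation and efficiency, while moving them apart sends more cases to human review. That trade-off is the balance between efficiency and oversight described above.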

Figure 13 Communicating Decision Science

Figure 14 Communicating Data Science

Figure 15 Writing Data & AI Clearly

Figure 16 Data Loading

Figure 17 Decision Modelling Notation Business Analysis

Figure 18 Human In The Loop (HITL)

Implementing Data-Driven Decision Making in Business

Howard outlines the essential questions related to key performance indicators (KPIs) that data scientists and business intelligence professionals should address to drive data-driven decision-making. These questions include analysing past performance over various periods, understanding the reasons behind current results, forecasting future outcomes concerning budget and resources, and identifying actionable strategies for improvement. Lastly, Howard emphasises the importance of recognising blind spots in data analysis.
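Two of these questions, what happened and what is likely to happen against budget, can be sketched in a few lines of Python. The KPI figures, the budget, and the naive run-rate forecast are hypothetical and only meant to show the shape of the calculation. The "why" and "what should we do" questions need analysis that goes beyond this sketch.

```python
# Hypothetical monthly revenue KPI for the first three months of the year.
actuals = [100.0, 110.0, 121.0]
annual_budget = 1500.0

def period_change(series):
    """What happened: percentage change between the last two periods."""
    previous, latest = series[-2], series[-1]
    return (latest - previous) / previous * 100

def run_rate_forecast(series, periods_in_year=12):
    """What may happen: naive forecast that repeats the average period all year."""
    return sum(series) / len(series) * periods_in_year

change = period_change(actuals)
forecast = run_rate_forecast(actuals)
gap_to_budget = forecast - annual_budget

print(f"latest change: {change:.1f}%")
print(f"full-year forecast: {forecast:.0f}, gap to budget: {gap_to_budget:.0f}")
```

A negative gap to budget is the prompt for the remaining questions: why the KPI is behind and which actions could close the gap.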

Figure 19 DDDM

Figure 20 Key Personas

Competency Frameworks in Data Management

The competency framework is appropriate for assessing the capabilities of the Data Management office to support organisational decision-making and efficiency. An attendee describes the challenge of effectively communicating the framework's importance to non-data professionals, emphasising the need for continuous learning and improvement within the organisation. He then adds that the framework is practical and can serve as a roadmap for growth. Another attendee acknowledges the value of self-awareness in the context of the competency framework and expresses his intent to challenge himself through the framework.

Business Benefit and Return on Investment in AI

Howard then brings the webinar to a close by stressing the importance of clearly communicating the business benefits and return on investment (ROI) associated with AI initiatives. Despite ongoing efforts in Data Management and AI, there is a struggle to articulate the value these technologies bring to the business, particularly in demonstrating tangible outcomes. Stakeholders often perceive data and AI teams as merely experimenting with open-source models without understanding the true impact on business applications and insights.

If you would like to join the discussion, please visit our community platform, the Data Professional Expedition.

Additionally, if you would like to be a guest speaker on a future webinar, kindly contact Debbie (social@modelwaresystems.com).

Don't forget to join our LinkedIn and Meetup data communities so that you don't miss out!
