Can AI Replace Your Data Modeler with Remco Broekmans

Executive Summary

This webinar explores the intersection of generative AI and data management, focusing on AI’s impact on data modelling. Remco Broekmans covers the role of artificial intelligence in data modelling, addressing bias in AI implementation, business applications such as text analysis, and the use of knowledge graphs and ontology in data analysis and management. Lastly, the webinar highlights the challenges, ethical considerations, and future steps for leveraging AI to enable effective business data modelling.

Webinar Details

Title: Can AI Replace Your Data Modeler with Remco Broekmans
Date: 12 September 2024
Presenter: Remco Broekmans
Meetup Group: INs & OUTs of Data Modelling
Write-up Author: Howard Diesel

The Impact of Generative AI on Data Management and the Future of Data Modelling

As technologies like Power BI continue to evolve, they enable users to interact with data via verbal commands, further blurring the lines between structured data and natural language. This shift indicates that outdated, rigid data-structuring methodologies may soon become obsolete, paving the way for more dynamic and intuitive data management practices. By embracing these advancements, organisations can optimise their data processing capabilities and better navigate the complexities of their information landscapes.

Effective natural language processing relies heavily on data organisation, as demonstrated by a case study featuring the VaultSpeed automation tool. The study showed that a well-structured dataset, whether a comprehensive data lake or a unified table, is crucial for executing precise natural language queries. When data is denormalised into a single table, the accuracy of responses, such as retrieving user IDs or identifying top sales figures, improves significantly. Conversely, loosely coupled tables in a lakehouse environment lead to suboptimal outcomes, underscoring the benefits of well-organised data structures.
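The point about denormalisation can be illustrated with a minimal sketch. It assumes a hypothetical two-table source (`customer`, `sale`) with invented sample rows, and uses Python's built-in `sqlite3` module; none of these names come from the case study. Flattening the tables into one wide table means a question such as "who has the top sales?" maps to a query over a single table, with no joins left for a natural language layer to guess.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two source tables and one wide table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical normalised source tables (illustrative data only).
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
        CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO sale VALUES (10, 1, 120.0), (11, 2, 75.5), (12, 1, 30.0);

        -- Denormalised table: every row carries the names a question would mention.
        CREATE TABLE sales_wide AS
            SELECT s.sale_id, c.customer_id, c.customer_name, s.amount
            FROM sale AS s
            JOIN customer AS c ON c.customer_id = s.customer_id;
        """
    )
    return conn


def top_customer(conn: sqlite3.Connection):
    """Answer 'who has the top sales?' from the single wide table."""
    return conn.execute(
        """
        SELECT customer_name, SUM(amount) AS total
        FROM sales_wide
        GROUP BY customer_name
        ORDER BY total DESC
        LIMIT 1
        """
    ).fetchone()


if __name__ == "__main__":
    print(top_customer(build_demo_db()))
```

The trade-off is the usual one: the wide table duplicates customer attributes on every row, which is acceptable for a query layer but not a substitute for the underlying model.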

In addition to the structural organisation of data, Remco emphasised the importance of effective methodologies to connect existing data with models that facilitate natural language questioning. Despite advancements in comprehending business concepts, challenges in effective data modelling remain. The upcoming presentation aims to delve deeper into these challenges while proposing potential solutions and improvements. By further enhancing data organisation and processing, businesses can leverage natural language processing more effectively, ultimately leading to better decision-making and insights.

Intersection of AI and Data: A Data Modeller’s Perspective

Remco opens his presentation with two personal summer experiences: renovating his home and exploring artificial intelligence (AI) in data management. Through these anecdotes, he illustrates the growing prominence of AI across sectors and the need to bring this technology into systems and discussions, giving the audience a practical sense of AI’s impact and potential.

A key theme in the discussion is the importance of adapting to rapid technological advancements, particularly the integration of AI with data. Conversations with colleagues helped shift Remco’s initial scepticism into an appreciation for AI’s capabilities in his field. This change in perspective highlights the need for an open-minded and adaptable approach as technology continues to evolve, so that individuals and organisations remain competitive and relevant.

In his role as a trainer, coach, and data modeller, Remco advocates for a cautious yet critical mindset regarding AI’s future impact. Drawing on his professional expertise and personal interests, he illustrates the importance of leveraging AI to develop effective data models for diverse business scenarios. Utilising the metaphor of a train journey, Remco describes himself as a traditional steam train, emphasising the need to adapt and keep pace with the bullet train of AI advancements to thrive in an ever-changing environment.

Remco serves as a leader in integrating AI tools, specifically highlighting the complexities of software like DALL-E for image generation. He acknowledges the challenges of establishing logical frameworks, business rules, and knowledge systems that align with user needs. By emphasising ChatGPT’s user-friendly interface and licensing advantages, Remco illustrates its suitability for a diverse range of presentations. Ultimately, his insights underscore the importance of a thoughtful approach to AI integration in the workforce, ensuring that both technology and users are effectively aligned for optimal performance.

Figure 1 Can AI Replace the Data Modeller?

Figure 2 Introduction

Figure 3 “My Journey”

Role of Artificial Intelligence in Assistance and Learning

In exploring various AI technologies, Remco found ChatGPT, developed by OpenAI, to be the most effective tool for his needs. Among the tools he reviewed, including Gemini and Copilot, ChatGPT stood out for its ability to help users answer questions and explain complex concepts. This interaction highlighted ChatGPT’s supportive role, rather than presenting it as a competitor in the workforce.

Remco emphasised the importance of the term “assisting,” which contrasts with the common narratives surrounding AI taking over jobs. By framing itself not as a superior entity but as a helpful companion, ChatGPT likened its function to that of a “toddler robot” capable of supporting users with simple tasks. This perspective reinforces the notion that AI can enhance human capabilities rather than replace them, paving the way for a more collaborative interaction between technology and users.

Figure 4 Let’s introduce AI (the ChatGPT Version) first

Challenges and Art of Data Modelling

Remco’s exploration of creating visual representations of topics related to large language models (LLMs) with DALL-E uncovers both their potential and inherent challenges. By building on the work of Chetty and others, Remco envisioned a diverse collection of data points, referred to as “dots”, that would capture a wide range of information sources, including LinkedIn posts, books, and academic papers focused on data modelling. However, while the intention was to limit this output to 75 dots for clarity and coherence, the tool generated more dots than requested, raising concerns about information overload.

This situation underscores the broader challenges LLMs face in creatively interpreting and presenting data. To ensure a structured, manageable output, Remco stressed the need to keep the dots below 0.5% of the total volume and reiterated his preference for a maximum of 75. Ultimately, while LLMs demonstrate impressive capabilities, their limitations in handling diverse data types highlight the need for careful curation and oversight of information representation.
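Because a model may ignore a stated limit, one option is to check the output after generation rather than trust the instruction. The following is a minimal sketch of such a check, assuming the generated dots arrive as a plain list of strings; the function name and message wording are hypothetical, and only the limit of 75 comes from the presentation.

```python
from typing import List, Tuple


def check_item_limit(items: List[str], max_items: int = 75) -> Tuple[bool, str]:
    """Return (passed, message) for a generated list against a hard upper bound."""
    count = len(items)
    if count <= max_items:
        return True, f"ok: {count} items"
    # The limit was stated in the prompt but not honoured, so flag it for a retry.
    return False, f"too many items: {count} > {max_items}; ask the model to regenerate"


if __name__ == "__main__":
    generated = [f"dot {n}" for n in range(1, 91)]  # simulated over-long output
    print(check_item_limit(generated))
```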

Furthermore, the inherent creativity involved in data analysis and modelling is often overlooked, as LLMs adopt a straightforward input-output approach that lacks the nuanced understanding necessary for artistic processes. While there is an abundance of data available, the challenge lies in the lack of comprehensive views of the information, underscoring the need for a more creative and interpretative approach to utilising LLMs for effective data representation and analysis.

The field of data modelling encompasses a wide array of methodologies, each offering unique advantages and challenges. Practitioners must navigate various approaches, including Dimensional Models with slowly changing dimension handling such as Type 2, Type 3, Type 4, and Type 7, as well as patterns such as Data Vault and 3rd Normal Form. The selection of a modelling pattern can profoundly influence the structural design and overall utility of the data system.

Given this diversity, it is essential for data professionals to thoroughly assess their specific requirements and contexts before determining the most suitable modelling strategy. While many resources may lean towards presenting the Entity-Relationship (ER) model as the conventional standard, a careful evaluation of individual project needs could uncover more effective alternatives. Ultimately, making informed decisions in data modelling can enhance data architecture and drive better business outcomes.

Modelling patterns keep evolving; the Data Vault methodology, for example, has changed significantly since its inception around 2000. Understanding both historical and contemporary modelling patterns is vital, as insights in this field may lag by about a year. Discussions with experts like Howard, Sundar Bynes, and Shane Gibson have highlighted the importance of staying current with these shifts and adapting to emerging best practices.

Creating an effective data model hinges on clear guidelines and a comprehensive understanding of stakeholders’ objectives. Data modellers must engage with stakeholders to identify goals, desired outcomes, and relevant business constraints; without this information, developing a meaningful data model becomes increasingly difficult. Furthermore, large language models (LLMs) often struggle to interpret and apply the guidelines they are given, despite having access to extensive data resources.

Figure 5 This is the whole LLM

Figure 6 What is the Issue?

Balance and Execution of Large Language Models

This example centres on balancing creative requests with precise instructions. Remco asked for an image of a robot in a bowling alley, bowling with data models rather than traditional bowling pins. The request was misunderstood, and pins were included in the illustration anyway. Additionally, although six lines were requested for the design, only five were delivered, showing how easily stated specifications are missed.

To achieve the desired outcome, Remco reiterated his requirements in the hope of aligning the final product with his initial vision. This situation underscores the importance of clear communication and attention to detail in creative requests, as deviations from the original request lead to frustration and necessitate revisions.

Utilising large language models (LLMs) trained on extensive information presents significant challenges that warrant careful consideration. Participants in the discussion emphasised the necessity of selecting relevant content, suggesting that a more focused approach to LLM training could enhance alignment with specific knowledge and perspectives.

There is a collective understanding that clear guidelines and instructions are essential for effectively managing the information used by LLMs, as this helps filter out irrelevant data. This conversation highlights the delicate balance between the vast breadth of available information and the precision needed for generating meaningful outputs.

To prioritise objectives effectively, it is crucial to concentrate on information that directly supports the goal while eliminating irrelevant details. By homing in on targeted content, LLMs can deliver more accurate and relevant results. Remco noted that the remainder of his presentation provides further insights and context on how to apply these concepts in practice, maximising the potential of LLMs while minimising extraneous information.

Figure 7 Prompting Generative AI

Figure 8 “so how can we make sure the guidelines/ instructions are used?”

Understanding Data Modelling and AI: A Conversation on Identifiers, Context, and Relationships

In analysing a specific box pertinent to data modelling, it is crucial to extract vital identifiers and reference information that reflect its significance. Remco stressed the importance of structured information to effectively represent the box’s contents. He highlighted the need for details such as the name, additional specifications, and the creator’s reference associated with the box, pointing out that these elements are essential for clarity and accurate interpretation.

Additionally, when provided with an image description by an assistant, the response includes general insights about the items in the box, noting cultural influences that may affect the popularity of certain items, such as liquorice across regions. However, the speaker indicates a pressing need for clearer identification of the box itself, suggesting that the information currently available is insufficient and warrants further clarification for a comprehensive understanding.

The CDO IQ conference showcased the profound influence of generative AI on data management through engaging presentations by speakers Richard Wang and Professor John Talbot. Professor Talbot urged attendees to move beyond traditional attribute-value structures in data design, advocating a more intuitive approach that utilises advanced tools such as ChatGPT to enhance information organisation and visualisation. He underscored the crucial role of structured datasets, such as comprehensive data lakes, in improving the accuracy of natural language queries.

In addition to exploring innovative data processing approaches, Remco shared his personal experiences in integrating AI into data modelling. He highlighted the importance of adapting to AI advancements and how tools such as ChatGPT can facilitate this transition. Remco then addressed the challenges of effectively leveraging AI for data modelling, while offering insights on creating compelling visual representations of information in this rapidly evolving landscape. Overall, the conference emphasised the transformative potential of generative AI in shaping the future of data management.

Describing items such as boxes or products from an objective perspective is essential for effective business reporting. This approach ensures that all relevant identifiers and data are collected, enabling accurate product evaluations. By maintaining focus on objective descriptions, businesses can create a solid foundation for informed decision-making and analysis.

Additionally, clear communication of objectives and feedback is vital when interacting with AI or data systems. Providing effective feedback not only improves the quality of the responses received but also refines the AI’s understanding over time. Ultimately, this collaborative approach enhances the relevance and utility of the information provided, leading to more productive interactions and business outcomes.

The enhancement of large language models (LLMs) relies significantly on the establishment of clear guidelines and boundaries, which are essential for effective data modelling. A primary focus must be placed on identifying core business concepts within organisations and categorising them into two main types: event-based and person-based. This approach goes beyond simple categorisation; it emphasises understanding the relationships among these concepts and their historical context, which is crucial for achieving a comprehensive data model.

Recognising the dynamic nature of relationships within data is also vital, as these connections evolve over time. Whether analysing three-dimensional data or other formats, the underlying principles remain the same: a focus on business concepts, context, and relationships. Ultimately, acknowledging and prioritising these three core components is essential for simplifying data analysis and improving the overall understanding of the data landscape.
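The three components named above (business concepts, context, and relationships that change over time) can be written down as a small data structure. This is a minimal sketch under stated assumptions: the class and field names are hypothetical, the two concept categories are the ones mentioned in the discussion, and the validity dates are one simple way to represent relationships that evolve.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class ConceptType(Enum):
    EVENT = "event-based"
    PERSON = "person-based"


@dataclass(frozen=True)
class Concept:
    name: str
    concept_type: ConceptType
    definition: str  # the context that gives the concept its business meaning


@dataclass(frozen=True)
class Relationship:
    source: str
    target: str
    description: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the relationship is still current


def relationships_on(relationships: List[Relationship], day: date) -> List[Relationship]:
    """Return the relationships that were valid on the given day."""
    return [
        r for r in relationships
        if r.valid_from <= day and (r.valid_to is None or day < r.valid_to)
    ]


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the presentation.
    history = [
        Relationship("Dog", "Owner", "is owned by", date(2023, 1, 1), date(2024, 1, 1)),
        Relationship("Dog", "Owner", "is owned by", date(2024, 1, 1)),
    ]
    print(relationships_on(history, date(2024, 6, 1)))
```

Keeping the validity period on the relationship, not on the concept, is what lets the same two concepts be linked differently at different points in time.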

Figure 9 Intermezzo – Difference between Human POV and ChatGPT (LLM)

Figure 10 What is the Human point of view?

Figure 11 What does Chat GPT4o say about:

Figure 12 What does Chat GPT4o say about pt.2

Figure 13 Who is right?

Figure 14 What we model / how humans think

Figure 15 How do we separate in Data Vault

Business Models and Data Modelling in AI

Remco articulates a critical perspective on the integration of AI in the workforce. He drew a compelling analogy between traditional steam trains and the rapid advance of AI, suggesting that society must urgently adapt to these technological changes. Through his exploration of various AI tools, he highlights ChatGPT by OpenAI as particularly effective, valuing its role in assisting with tasks rather than replacing human jobs.

Despite the promise of AI, Remco raises concerns about the challenges associated with large language models (LLMs), particularly in navigating and retrieving specific information. He argues that data modelling requires a level of creativity that current AI technologies struggle to provide. Moreover, he emphasises the need to understand both historical and contemporary data modelling methodologies, acknowledging their complexity and contradictions. As such, Remco advocates for clear guidelines and strategic objectives, and for a thoughtful approach to leveraging AI effectively.

When developing a data model, it is crucial to grasp the specific goals and requirements driving the request. This understanding goes beyond simply delivering a model; it entails asking important questions about the desired outcomes and intended applications. For instance, determining whether the model is meant for rapid reporting or data science can significantly influence its design and structure.

Inadequate comprehension of these objectives often leads to issues, such as incomplete or incorrect models generated by tools like ChatGPT, which may lack essential details and context. By clarifying the objectives upfront, developers can create data models that effectively meet users’ needs and ensure that the final product aligns with their expectations. This careful approach ultimately enhances the utility and effectiveness of the data model.

A critical evaluation of data structures is essential, especially when dealing with payment information. The discussion highlights a significant concern about including metrics within dimensions, advocating instead that they be placed in a fact table to ensure clarity and accuracy. This adaptable approach to data modelling underscores the importance of setting boundaries and scrutinising the provided information to maintain data integrity. Specifically, in a Dimensional Model, it is crucial to include payments in the fact table, as these amounts serve as vital metrics.
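The correction described above, moving payment amounts out of a dimension and into a fact table, can be sketched as a tiny star schema. The table and column names below are hypothetical and the rows are invented; the sketch uses Python's built-in `sqlite3` module and shows only the placement of the metric, not the model discussed in the webinar.

```python
import sqlite3


def build_star_schema() -> sqlite3.Connection:
    """Create a minimal star schema with the payment amount stored as a fact."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension: descriptive attributes only, no metrics.
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY,
            customer_name TEXT
        );

        -- Fact: one row per payment, carrying the additive metric.
        CREATE TABLE fact_payment (
            payment_key INTEGER PRIMARY KEY,
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            payment_date TEXT,
            payment_amount REAL
        );

        INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO fact_payment VALUES
            (100, 1, '2024-09-01', 25.0),
            (101, 1, '2024-09-08', 40.0),
            (102, 2, '2024-09-08', 40.0);
        """
    )
    return conn


def total_paid_per_customer(conn: sqlite3.Connection):
    """Aggregate the metric from the fact table, grouped by a dimension attribute."""
    return conn.execute(
        """
        SELECT d.customer_name, SUM(f.payment_amount)
        FROM fact_payment AS f
        JOIN dim_customer AS d ON d.customer_key = f.customer_key
        GROUP BY d.customer_name
        ORDER BY d.customer_name
        """
    ).fetchall()


if __name__ == "__main__":
    print(total_paid_per_customer(build_star_schema()))
```

Had the amount been stored on the dimension row, a customer with several payments would need either repeated dimension rows or a single overwritten value, and neither sums correctly.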

Moreover, as experienced professionals, we have a responsibility to mentor newcomers in the field. Providing constructive feedback not only acknowledges their efforts but also fosters a supportive learning environment, encouraging growth and improvement. By guiding less experienced colleagues through their projects, we enhance their skills, ultimately benefiting future initiatives and fostering a collaborative atmosphere.

Figure 16 How do we separate in Data Vault pt.2

Figure 17 Let’s Model a Business Case

Figure 18 The Basic – Just ask ChatGPT4o – per September 2024

Figure 19 And a Dimensional Model

Data Modelling and Bias in AI Implementation

The feedback on the data model provided was mixed, revealing both strengths and weaknesses in its design. While the pleasing colour scheme and the smooth transition from satellite to link were appreciated, concerns were raised about the representation of relationships. The choice of a green box for links and a yellow circle for contacts was viewed as overly technical, suggesting a lack of consideration for broader data modelling practices, particularly in relation to linking temporal logic (LTL).

In conclusion, while the initial effort was acknowledged as partially successful, it highlighted significant areas that require improvement to align the model with data modelling best practices. Enhancing the representation of relationships and adopting a more intuitive approach are essential steps that could substantially elevate the model’s overall effectiveness. These adjustments will enable future iterations to better meet the needs of users and stakeholders.

To refine the business model further, it is crucial to clarify terminology and improve the relationships within its framework. For example, replacing the term “appointment” with “grooming” more accurately captures the context of providing assistance. Additionally, addressing previously overlooked aspects, such as incorporating payments into relationships, underscores the need for clear parameters to avoid implementation oversights. By establishing these boundaries, the transition from traditional business cases to AI-driven data models can proceed more smoothly.

In addition, Remco stresses the significance of incremental modelling, especially concerning bias in large language models (LLMs). A recent presentation by Marsha Mueller underscored the risks of bias but also proposed that, when applied thoughtfully, bias could guide models toward desired outcomes. By using specific business cases as prompts, Remco identified core concepts and categorised them, yielding actionable insights such as event scheduling and clear definitions. This method demonstrates that a deliberate approach to bias can facilitate the extraction of relevant information while maintaining model clarity.
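Supplying the desired bias before the case description, and asking only for core concepts as a first incremental step, can be sketched as a simple prompt builder. This is an illustration under stated assumptions: the function name, the prompt wording, and the sample guidelines are hypothetical and are not Remco's actual prompt. The sketch produces text only and does not call any model.

```python
from typing import List


def build_prompt(guidelines: List[str], case_description: str) -> str:
    """Place the modelling guidelines (the deliberate bias) ahead of the business case."""
    numbered = "\n".join(f"{i}. {g}" for i, g in enumerate(guidelines, start=1))
    return (
        "Follow these modelling guidelines:\n"
        f"{numbered}\n\n"
        "Business case:\n"
        f"{case_description.strip()}\n\n"
        # First incremental step: concepts only, relationships come in a later prompt.
        "Task: list the core business concepts only. Do not produce a data model yet."
    )


if __name__ == "__main__":
    print(
        build_prompt(
            [
                "Classify every concept as event-based or person-based.",
                "Give each concept a one-sentence business definition.",
            ],
            "A dog grooming store takes drop-off appointments from customers.",
        )
    )
```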

Figure 20 The Human POV

Figure 21 And a Data Vault Model

Figure 22 The Human POV pt.2

Figure 23 Do we need Improvement?

Figure 24 Starting Point: Provide BIAS Before the Case Description

Role of AI in Business Writing and Feedback

Remco has developed a comprehensive guide to dog grooming. This user-friendly PDF combines detailed descriptions of various artefacts used in dog grooming with practical examples that enhance understanding. To further enrich the content, the guide includes three business cases that illustrate effective teaching strategies for core business concepts, complete with clear definitions and relevant categories.

In his concluding section, Remco effectively demonstrates how a structured approach equips readers for further training in a professional environment. By providing a comprehensive analysis of drop-off appointments at a dog grooming store, he highlights the roles of key individuals, such as the grooming specialist and manager. This detailed examination not only clarifies their relationships but also enhances the understanding of dog grooming concepts and their practical business applications.

Furthermore, in collaboration with an assistant, Remco developed a comprehensive model that illustrates the connection between employees and ownership while simplifying the presentation by excluding unnecessary store details. This joint effort resulted in a refined structure that preserved the core elements of the original model while integrating valuable insights and improving overall accuracy. Ultimately, Remco emphasises the critical importance of teamwork and communication in enhancing operational efficiency.

The updated model comes with essential guidelines and boundaries, elevating its clarity and usability beyond what was initially proposed. Ultimately, this meticulous refinement aligns the model more closely with human perspectives, offering a deeper understanding of the dynamics at play and ensuring that critical factors are thoroughly addressed.

Figure 25 Testing Phase Using Specific GPT to find the Right Business Terms

Figure 26 Extend Documentation in the Prompt and Add Learning

Figure 27 Core Business Concepts ChatGPT with Extended Prompt

Creation and Use of Data Modelling Learning Structures

Creating a personalised large language model (LLM) can significantly enhance productivity and tailor experiences to meet specific needs. Unlike general-purpose models such as ChatGPT, developing your own LLM lets you incorporate unique knowledge and capabilities. By utilising platforms like Gemini, you can create custom Gems that directly address your requirements. Support from knowledgeable colleagues like Sandra Obanche further helps guide the development process and instruct your LLM to assist with a variety of tasks.

Ultimately, building a personalised LLM not only empowers you to achieve desired outcomes but also streamlines workflows in ways that generic models cannot. This tailored approach ensures that the model captures the nuances of your specific context, making it a valuable tool for enhancing efficiency and effectiveness in your work. Through collaboration and careful planning, you can create a powerful asset that meets your unique demands.

The future impact of AI on the workforce remains a topic of scepticism and debate, particularly regarding its role in task assistance and job displacement. Remco, a trainer based in Denmark with roots in the Netherlands, highlights the effectiveness of tools like OpenAI’s ChatGPT for enhancing task efficiency rather than replacing jobs. He points out that while large language models may excel at language generation, they often fall short in addressing the creative intricacies of data modelling. Understanding specific goals and requirements is crucial when developing these models, as clarity and context are essential to avoid incomplete structures.

Emphasising the need for an adaptable approach, particularly in payment information contexts, Remco advocates for mentoring newcomers by providing constructive feedback to foster growth. He also stresses the importance of refining business models to improve terminology and relationships, while acknowledging the risks and potential biases inherent in large language models. By modelling in incremental steps, organisations can ensure clarity and effectiveness in their data modelling efforts, ultimately paving the way for more informed decision-making.

The importance of providing transparent knowledge sources on large language models (LLMs) cannot be overstated, especially for addressing user concerns about the validity of their responses. By offering access to relevant documents and resources, users can gain confidence in the information provided, knowing the underlying data that informs these models. However, this transparency must be balanced with the need to avoid overwhelming users with excessive information, which can lead to confusion.

Creating a generic data modelling LLM presents its own challenges, particularly due to the specificity required for effective responses. As users often seek guidance on complex topics such as denormalised data models or the transition between data modelling forms, such as normalised to dimensional structures, careful consideration must be given to how inquiries are addressed. Ultimately, striking the right balance between transparency and clarity will enhance user trust and facilitate a better understanding of these sophisticated technologies.

To develop an effective data model, it is crucial to focus on identifying core business concepts and their relationships rather than simply requesting a data model. This approach not only involves creating additional training sets but also emphasises the importance of collaboration to gather valuable insights. Employing the LV method can help clarify our objectives, including the coefficients, concepts, and contextual information, laying a solid foundation for constructing a comprehensive data model.

In this process, proceeding in small, manageable steps is vital, starting with defining the core business concepts before exploring their relationships. Continuous feedback on the scoring of data elements ensures alignment with our event canvas, fostering a more organised developmental approach.

Additionally, effective communication and tailored feedback are essential when guiding junior developers, as each individual’s background may influence their understanding. By harmonising our strategies across the team, we can create a cohesive learning environment that promotes growth while maintaining control over the development process.

Figure 28 The Model

Figure 29 Have a Data Modelling GPT / LLM

Future Steps and Achievements in Data Modelling and Ontology

The development of an innovative approach is marked by several key initiatives aimed at enhancing our presentation and framework. Remco is currently finalising a presentation to gather valuable feedback while also improving our framework with additional guardrails for greater clarity and direction. His dedication extends to authoring a book that will showcase our unique methodology, further solidifying our approach in the field. Additionally, the excitement surrounding the upcoming update to Hans’s 2012 Data Vault book heightens interest in our ongoing initiatives, underscoring the potential impact of our work.

Figure 30 What I am learning and where I am in my Journey

Understanding and Addressing Bias in Data Analysis and Ethical Considerations

The discussion surrounding large language models (LLMs) emphasises the significant biases that arise from their reliance on internet data, which has predominantly been shaped by a white male demographic. This reliance on a narrow range of perspectives leads to a skewed representation of knowledge, underscoring the need for users to acknowledge these inherent biases. By recognising the limitations of LLMs, individuals can better navigate the information provided and seek alternatives that offer a broader range of viewpoints.

Ultimately, fostering transparency in LLM operations is essential to ensuring ethical engagement with these technologies. Users must be informed about the sources that shape the models’ outputs, thereby building trust and promoting a deeper understanding of the content generated. By advocating greater openness, we can develop models that reflect a broader spectrum of experiences and knowledge, enriching overall interactions with language models.

Figure 31 Next Steps

Role of AI in Text Analysis and Business Data Modelling

Understanding the limitations of large language models (LLMs) is crucial, as these systems often prioritise delivering an answer over ensuring its accuracy. For instance, when asked about the role of stones in a healthy diet, an LLM might suggest that consuming stones is healthy, despite clear evidence to the contrary.

This inclination to provide an answer can lead to significant inaccuracies, reminiscent of a popular Dutch song that humorously miscalculates mathematical facts. Therefore, it is essential for users to approach information generated by LLMs with critical thinking and to independently verify facts.

Additionally, the complexities of AI language processing deserve recognition, especially regarding how models like ChatGPT analyse text. Rather than examining individual letters, these models consider tokens—groups of letters—within a broader context. An illustrative example of this is when a request to define a term without the letter ‘E’ still results in the inclusion of ‘E’ in the response.
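The letter ‘E’ example becomes easier to see with a toy tokenizer. The sketch below is an illustration only: it uses a tiny hand-written vocabulary and greedy longest-match splitting, whereas production models use learned subword vocabularies, so the function names and the vocabulary here are assumptions. It shows that a letter can sit inside a token that the model handles as one unit, and that a character-level check has to be run on the output afterwards.

```python
from typing import List, Set


def greedy_tokenize(text: str, vocab: Set[str]) -> List[str]:
    """Split text into the longest vocabulary entries, falling back to single characters."""
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; use the single character if nothing matches.
        match = next(
            (text[i:j] for j in range(len(text), i, -1) if text[i:j] in vocab),
            text[i],
        )
        tokens.append(match)
        i += len(match)
    return tokens


def contains_letter(text: str, letter: str) -> bool:
    """Character-level check applied to the generated text, ignoring case."""
    return letter.lower() in text.lower()


if __name__ == "__main__":
    toy_vocab = {"data", " model"}  # hypothetical vocabulary for illustration
    print(greedy_tokenize("data model", toy_vocab))
    print(contains_letter("data model", "e"))
```

In this toy case the letter sits inside the single token " model", so nothing at the token level exposes it; only the character-level check on the finished text does.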

This demonstrates that AI interprets language through patterns rather than isolated characters. Furthermore, leveraging AI for industry-specific terminology could enhance communication and comprehension in business settings, highlighting the potential future applications of such technology.

Remco has taken a proactive approach to leveraging AI technologies to enhance business insights. Focusing on the automation of information retrieval, such as searching through emails and documents or converting spoken discussions into text for analysis, he highlights how these tools can facilitate better understanding and collaboration within a business environment.

Moreover, iterative feedback is underscored as a vital element in improving AI performance, reflecting Remco’s dedication to ongoing exploration in this field, particularly during his downtime from renovation work. Ultimately, these efforts illustrate a commitment to harnessing AI’s potential to drive business efficiency and effectiveness.

In addition, the conversation delves into the application of large language models (LLMs) for data modelling, with a focus on augmenting the data modelling process rather than replacing traditional data modellers. A participant expresses a keen interest in further exploring this topic despite being busy with client projects, indicating a willingness to engage in collaborative discussions and share insights. The dialogue also touches on future opportunities, including a potential speaking engagement at an upcoming conference in Calgary, signalling an ongoing commitment to exploring and advancing AI applications in business.

An attendee then emphasised the importance of integrating ontologies with large language models (LLMs) to enhance their ability to address critical biases and provide context. They referenced a presentation outlining a comprehensive process for building client-tailored LLMs that leverages a business glossary and relevant terminology through an ontological lens. This approach not only ensures a more precise alignment with client needs but also enriches the model’s understanding of key concepts.

Moreover, an attendee introduced Retrieval-Augmented Generation (RAG), which retrieves authoritative knowledge from outside the initial training set and supplies it to the model at query time. This allows an LLM’s answers to stay current as the underlying knowledge source is updated, without retraining the model itself. By integrating these strategies, organisations can create more effective and contextually aware language models that remain relevant in a rapidly changing information landscape.
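To make the retrieval step concrete, the following is a minimal sketch of the RAG pattern, assuming a tiny in-memory business glossary and naive word-overlap scoring in place of a real vector search. The glossary entries, function names, and prompt wording are illustrative assumptions rather than anything shown in the webinar, and the call to an actual LLM is deliberately left out.

```python
import re

# Hypothetical business glossary standing in for an authoritative knowledge source.
GLOSSARY = {
    "customer": "A customer is a party that has purchased at least one product.",
    "sales event": "A sales event is a single transaction containing one or more products.",
    "discount": "A discount is a reduction applied to the list price of a product.",
}

def words(text):
    """Lower-case word set used for naive overlap scoring."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, k=2):
    """Return up to k glossary definitions sharing the most words with the question."""
    q = words(question)
    scored = []
    for term, definition in GLOSSARY.items():
        score = len(q & words(term + " " + definition))
        if score > 0:
            scored.append((score, term, definition))
    # Highest score first; ties broken alphabetically by term for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [definition for _, _, definition in scored[:k]]

def build_prompt(question):
    """Prepend the retrieved definitions so the model answers from them."""
    context = "\n".join(retrieve(question)) or "(no matching glossary entries)"
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("What is a discount?"))
```

Because the knowledge lives in the glossary rather than in the model, updating an entry changes the next answer immediately; a production setup would typically swap the word-overlap scoring for embedding-based search.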

Figure 32 Some things to Think about

Figure 33 More thoughts

Use and Management of Knowledge Graphs in Business Analysis

A knowledge graph tailored for business analysis plays a crucial role in categorising data into three primary contexts: industry, enterprise, and project-specific frameworks. This structured approach not only enhances the usability of the graph but also supports the continuous integration of business cases and knowledge nodes into a large language model (LLM). By effectively managing inputs, validating outputs, and providing contextual information, knowledge graphs ensure results are both relevant and accurate.

Research highlights three significant ways to leverage knowledge graphs. Firstly, they filter out irrelevant questions based on a defined ontology, streamlining the input process. Secondly, they validate the LLM’s outputs by checking them against the established ontology, thereby enabling the identification of knowledge gaps.

Finally, the contextual information provided by the knowledge graph helps maintain the relevance of results, enabling the LLM to understand and correct any inaccuracies in its outputs. Ultimately, this methodology enhances the overall effectiveness of business analysis.
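The three uses described above (filtering inputs, validating outputs, and supplying context) can be sketched with a very small ontology held as subject-predicate-object triples. The triples, function names, and substring-based matching are assumptions made for illustration; a real implementation would sit on a graph store and a proper ontology language.

```python
# Hypothetical ontology expressed as (subject, predicate, object) triples.
ONTOLOGY = {
    ("customer", "places", "order"),
    ("order", "contains", "product"),
    ("product", "belongs_to", "category"),
}
CONCEPTS = {s for s, _, _ in ONTOLOGY} | {o for _, _, o in ONTOLOGY}

def is_relevant(question):
    """1. Input filter: accept only questions mentioning at least one known concept."""
    return any(concept in question.lower() for concept in CONCEPTS)

def validate(claimed_triples):
    """2. Output check: split triples claimed by the LLM into confirmed facts and gaps."""
    confirmed = [t for t in claimed_triples if t in ONTOLOGY]
    gaps = [t for t in claimed_triples if t not in ONTOLOGY]
    return confirmed, gaps

def context_for(question):
    """3. Context: triples touching any concept mentioned in the question."""
    q = question.lower()
    return sorted(t for t in ONTOLOGY if t[0] in q or t[2] in q)

print(is_relevant("What does an order contain?"))
print(context_for("What does an order contain?"))
print(validate([("customer", "places", "order"), ("customer", "owns", "category")]))
```

Triples that land in the gaps list are either errors by the model or knowledge missing from the ontology, which is where a human reviewer would decide what to do next.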

Role and Challenges of AI in Business Data Modelling

Remco shared that he is currently undertaking a project focused on the configuration of rack systems and device sensing, emphasising a personal approach that promotes understanding. While he recognises that others may possess more advanced methodologies, he is committed to documenting each step of his business concept, which involves categorising data and establishing satellite links. This process is crucial as it lays the groundwork for a robust relational data model.

Central to his concept is the integration of components, particularly during sales events when customers engage with multiple products. By creating a structured relationship that includes a sales header and corresponding detail lines, Remco aims to clarify transaction processes and enhance data organisation.

Analysing example records has revealed that customers often purchase multiple products in a single event, highlighting the necessity for this clear relational structure. Ultimately, this approach not only streamlines data management but also improves the efficiency of generating relevant test sets for sales events.
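The header-and-detail structure described above can be sketched as two linked record types. The class names, field names, and sample values below are hypothetical and chosen only to show one sales event carrying several product lines; they are not taken from Remco’s model.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class SalesLine:
    """One detail line: a single product within a sales event."""
    product_id: str
    quantity: int
    unit_price: float

    @property
    def line_total(self) -> float:
        return self.quantity * self.unit_price

@dataclass
class SalesHeader:
    """One sales event for one customer, owning any number of detail lines."""
    sale_id: str
    customer_id: str
    sale_date: date
    lines: list[SalesLine] = field(default_factory=list)

    @property
    def total(self) -> float:
        return sum(line.line_total for line in self.lines)

# Made-up test record: one event, two products.
sale = SalesHeader("S-001", "C-042", date(2024, 1, 15))
sale.lines.append(SalesLine("P-100", 2, 10.0))
sale.lines.append(SalesLine("P-200", 1, 5.5))
print(len(sale.lines), sale.total)
```

Keeping the customer and date on the header and the products on the lines is what lets a generated test set contain events with multiple products without repeating the event-level attributes.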

The recent sale highlighted significant challenges in identifying discounted products, revealing flaws in the system’s information clarity. Despite every product being on discount, the lack of clear communication likely hindered user experience and decision-making. This situation underscores the importance of implementing cautious automation processes, as the distinction between a model’s theoretical outputs and practical implementations is crucial. It emphasises that a critical human decision point is necessary to determine the acceptability of the information provided, a vital consideration when developing a business case for a data model.

The discussion also draws parallels between software prototyping and data model refinement, illustrating the complexities of transitioning from prototypes to production-ready products. Effective teaching and mentoring are essential for helping junior team members understand the interconnections among concepts. Additionally, it is important to recognise that large language models (LLMs) learn in ways similar to humans, underscoring the need for guidance and constraints through description logic. Ultimately, it is hoped that these models will advance and evolve faster than human development, leading to improved outcomes across various applications.

The rise of large language models (LLMs) marks a pivotal development in computing technology, yet it falls short of representing true intelligence. Unlike traditional expert systems that aimed to encapsulate human knowledge within machines, LLMs are driven primarily by enhanced computational power, increased storage capacity, and superior connectivity.

This reliance on technological advancements rather than genuine understanding means that, despite their impressive capabilities, LLMs operate through techniques such as vectorisation and pattern recognition. As a result, their outputs—such as summaries—often miss nuances of meaning, simply eliminating irrelevant information without comprehending the underlying content.

Understanding this distinction is vital for organisations that expect authentic intelligence from AI. LLMs, while advanced, do not possess the deep comprehension needed for true cognitive functioning. As such, it is imperative for businesses to temper their expectations regarding AI technologies and recognise the limitations of LLMs. By acknowledging that these systems excel at processing information but lack the ability to fully understand it, organisations can make more informed decisions about how to leverage AI effectively in their operations.

Figure 34 Links and Information

Use of Knowledge Graph and Ontology in Data Analysis and Management

In data analysis, integrating advanced tools such as knowledge graphs, ontologies, and large language models (LLMs) is proving transformative. For instance, an attendee highlighted the potential of knowledge graphs to reveal hidden patterns and connections within large datasets, demonstrating how these tools not only organise existing information but also infer new relationships based on established rules.
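As a small illustration of inferring new relationships from established rules, the sketch below applies a single transitivity rule to a set of triples until nothing new can be derived. The `part_of` predicate and the example facts are assumptions chosen for illustration; ontology reasoners support far richer rule sets than this.

```python
def infer_transitive(triples, predicate):
    """Repeatedly apply: (a, p, b) and (b, p, c) implies (a, p, c).

    Returns only the newly inferred triples.
    """
    known = set(triples)
    changed = True
    while changed:
        changed = False
        edges = [(s, o) for s, p, o in known if p == predicate]
        for a, b in edges:
            for b2, c in edges:
                if b == b2 and (a, predicate, c) not in known:
                    known.add((a, predicate, c))
                    changed = True
    return known - set(triples)

# Hypothetical facts stated explicitly in the graph.
facts = {
    ("sensor", "part_of", "rack"),
    ("rack", "part_of", "data_centre"),
    ("data_centre", "part_of", "site"),
}
for triple in sorted(infer_transitive(facts, "part_of")):
    print(triple)
```

None of the three inferred relationships was entered by hand; each follows from the rule and the stated facts.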

Complementing this perspective, Remco described a discussion he had with Shane Gibson in New Zealand about the innovative use of LLMs for data profiling, particularly for identifying missing business rules by exploring legacy systems. This approach underscores the increasingly pivotal role of AI in enhancing data comprehension and unearthing valuable insights.

While LLMs’ capabilities in data analysis are impressive, they still have limitations in creative processes. For example, LLMs can efficiently assist in error handling and uncovering metadata for ambiguous column names, but they struggle to engage in more creative data interpretation. Remco shared his belief that while LLMs excel at discovering rules and offering solutions based on existing knowledge, the ability to perform creative analysis still requires further development. Therefore, as we leverage these technologies, it’s essential to recognise their strengths and weaknesses to optimise their utility in data-driven decision-making.

Table Of Contents

Executive Summary
The Impact of Generative AI on Data Management and the Future of Data Modelling
Intersection of AI and Data: A Data Modeller's Perspective
Role of Artificial Intelligence in Assistance and Learning
Challenges and Art of Data Modelling
Balance and Execution of Lunar Landing Systems
Understanding Data Modelling and AI: A Conversation on Identifiers, Context, and Relationships
Business Models and Data Modelling in AI
Data Modelling and Bias in AI Implementation
Role of AI in Business Writing and Feedback
Creation and Use of Data Modelling Learning Structures
Future Steps and Achievements in Data Modelling and Ontology
Understanding and Addressing Bias in Data Analysis and Ethical Considerations
Role of AI in Text Analysis and Business Data Modelling
Use and Management of Knowledge Graphs in Business Analysis
Role and Challenges of AI in Business Data Modelling
Use of Knowledge Graph and Ontology in Data Analysis and Management
