Executive Summary
This webinar will examine the evolution of knowledge representation and data modelling, covering mind maps, domain-specific models, and advanced methodologies like Entity-Relationship and UML frameworks. It will highlight the role of ontologies in data integration and present a comparative analysis of modelling techniques, focusing on their features, tasks, and trade-offs. The session will also address model quality and bias considerations, concluding with insights into the significance of professional modelling practices and the resurgence of declarative approaches.
Webinar Details
Title: Strengths and Limitations of Different Types of Declarative Modelling Languages with Maria Keet
Date: 2024-08-05
Presenter: Maria Keet
Meetup Group: INs and OUTs of Data Modelling
Write-up Author: Howard Diesel
Introduction and Welcome to the Webinar
The webinar begins with Howard Diesel introducing Professor Maria Keet of the University of Cape Town, highlighting her interdisciplinary background in microbiology and computer science. This dual expertise is invaluable for understanding data modelling: rather than merely transitioning between fields, she completed a full computer science degree and PhD after her microbiology specialisation.
Howard remarks on the natural synergy between these fields, particularly given the prevalence of RDF and taxonomies in biological research. Maria’s book, “The What and How of Modelling Information and Knowledge,” serves as the foundation for today’s presentation, offering both theoretical insights and practical guidance for data management professionals seeking to improve their modelling competencies.
Overview: From Mind Maps to Ontologies
Maria presents a structured journey through the landscape of declarative modelling languages, organising her presentation into five key sections: introducing different types of models, examining conceptual data models in detail, exploring ontologies as sophisticated modelling tools, assessing model quality and effectiveness, and drawing conclusions about professional practice. The presentation emphasises a gentle, mostly non-technical approach accessible to those unfamiliar with advanced modelling techniques, though occasional formulas appear with accompanying diagrams for clarity.
The framework spans a spectrum from simple brainstorming tools to highly expressive logic-based representations. Maria introduces a running example about rainfall—particularly apt given the winter season in South Africa—to illustrate concepts across different modelling approaches. She acknowledges that while her book covers both the “what” and “how” of modelling, today’s session focuses primarily on the “what,” helping participants understand when different modelling approaches are most appropriate.
Figure 1 Title of Presentation
Figure 2 Outline of Concepts covered in Presentation
Figure 3 “Models Galore”
Mind Maps and Domain-Specific Models: Initial Approaches
Mind maps represent the most accessible entry point for conceptual modelling, familiar to many from primary and secondary education. Maria demonstrates this with a colourful mind map about rainfall, showing how concepts radiate from a central idea with branches for incidents, water collection in ponds, runoff, roadside drainage, and stream flow. These diagrams are easy to create, visually appealing, and require no specialised knowledge—making them well-suited for initial brainstorming when capturing ideas quickly matters more than precision.
However, mind maps raise more questions than they answer when scrutinised critically. What constitutes a “good” mind map? What’s the appropriate size and scope? What do connecting lines actually mean? Research shows mixed results regarding their effectiveness, with usefulness primarily limited to primary school and early secondary education. As content complexity increases in senior secondary and university contexts, mind maps are insufficiently expressive to adequately capture textbook content.
Maria then transitions to domain-specific models in biology and hydrology that use more systematic notation. Biological cell diagrams use standardised icons for structures like mitochondria and the Golgi apparatus, while arrows indicate specific processes or movements. These models represent a significant step forward in precision, though they still lack computational support, and incompatible notations proliferate across disciplines, creating steep learning curves and limited opportunities for integration.
Figure 4 Beginnings: Brainstorming about a Topic with Mind Maps
Figure 5 Mind Maps Raise More Questions than they Answer
Figure 6 Partial Solution: Biological Models
Figure 7 COVID-19
Figure 8 Physical Processes Involved in Runoff Generation
Figure 9 “Towards a Knowledge Graph”
Figure 10 Annotating Hydrological Models Example
Figure 11 Limitations of the Domain Models
Conceptual Data Models: Entity-Relationship and UML
Conceptual data models are a formalised approach for capturing information structures in software systems. These models employ entities (things of interest), attributes (properties), and relationships (connections between entities) using established notations, including Entity-Relationship diagrams (Crow’s Foot or Chen) and UML class diagrams. Maria illustrates this with practical examples: a book database that shows relationships among authors, books, and publishers, and an espresso machine model that captures components and their interactions.
These models enable precise specification of key constraints, including cardinality (one-to-many, many-to-many) and participation requirements (mandatory versus optional). For instance, the book-author relationship demonstrates many-to-many cardinality—Stephen King wrote multiple books, and “The Talisman” was co-authored by Stephen King and Peter Straub. The precision of conceptual data models makes them invaluable for database design and software development, providing clear specifications that can be semi-automatically converted into program code.
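The many-to-many book–author relationship described above can be sketched in relational terms, where the cardinality constraint is realised as a junction table. This is a minimal illustration, not the schema from the webinar; all table and column names are assumed for the example.

```python
import sqlite3

# In-memory database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE Author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Book   (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- The many-to-many cardinality lives in a junction table:
-- one author writes many books, one book can have many authors.
CREATE TABLE Wrote (
    author_id INTEGER NOT NULL REFERENCES Author(id),
    book_id   INTEGER NOT NULL REFERENCES Book(id),
    PRIMARY KEY (author_id, book_id)
);
""")
cur.executemany("INSERT INTO Author VALUES (?, ?)",
                [(1, "Stephen King"), (2, "Peter Straub")])
cur.execute("INSERT INTO Book VALUES (1, 'The Talisman')")
cur.executemany("INSERT INTO Wrote VALUES (?, ?)", [(1, 1), (2, 1)])

# Query the co-authors of 'The Talisman' via the junction table.
authors = [row[0] for row in cur.execute(
    "SELECT a.name FROM Author a JOIN Wrote w ON a.id = w.author_id "
    "WHERE w.book_id = 1 ORDER BY a.name")]
print(authors)  # → ['Peter Straub', 'Stephen King']
```

The junction table is exactly what a semi-automatic conversion from an ER or UML diagram would generate for a many-to-many association.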
When Maria showed hydrologists a UML-style diagram of rainfall concepts, having explicit names and multiplicity constraints prompted them to recognise missing elements—they needed “runoff measurements” rather than just “runoff,” and different measurement systems with varying frequencies required representation. This demonstrates how formal modelling features encourage more complete and accurate specifications. However, these models typically serve specific applications rather than broader knowledge sharing, creating integration challenges when multiple systems need to work together.
Figure 12 Solutions to Limitations of the Domain Models
Figure 13 Conceptual Data Models
Figure 14 Conceptual Data Models: Examples of Language Elements
Figure 15 Conceptual Data Models: An Example
Figure 16 The Espresso Example of the Talk’s Announcement
Figure 17 “Rainfall, again” – UML Style
Ontologies: Logic-Based Knowledge Representation
Ontologies represent the most sophisticated approach to declarative modelling, addressing fundamental limitations of earlier approaches. Unlike application-specific conceptual data models, ontologies provide subject-domain models designed for use across multiple applications, facilitating reuse and integration. They employ formal logic—particularly description logic, a fragment of first-order logic—to provide precise, machine-processable representations of domain knowledge that eliminate ambiguity inherent in informal diagrams.
Maria notes that the Gene Ontology, the most widely used ontology, was established in 1998 and has grown to include thousands of entity types used across numerous applications and scientific papers. Ontologies appear visually as directed acyclic graphs; however, they are fundamentally text files that can be opened in standard editors. Tools such as Protégé provide graphical interfaces that, for example, show that a lion is an animal that eats only herbivores and at least one impala. Behind this readable interface lies formal logical notation.
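The lion example corresponds to a description-logic axiom along these lines (one plausible formalisation for illustration, not copied from the slides):

```latex
\[
\text{Lion} \;\sqsubseteq\; \text{Animal} \,\sqcap\, \forall \text{eats}.\text{Herbivore} \,\sqcap\, \exists \text{eats}.\text{Impala}
\]
```

Read aloud: every lion is an animal, everything a lion eats is a herbivore (the universal quantifier), and each lion eats at least one impala (the existential quantifier).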
The power of ontologies emerges through automated reasoning capabilities—reasoners can discover implicit information and detect contradictions automatically. In a simple example with classes A, B, C, D, E, and F, an automated reasoner deduces that class D must be a subclass of B based on their relationships, providing this inference “for free.” For models with thousands of classes, automated reasoning becomes invaluable for quality control and knowledge discovery, identifying relationships that human modellers might miss.
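The kind of inference described above can be sketched as computing the transitive closure of declared subclass axioms. The slide's exact A–F hierarchy is not reproduced in the write-up, so the axioms below are assumed for illustration (D ⊑ C and C ⊑ B, from which D ⊑ B follows); real reasoners handle far richer logics than plain transitivity.

```python
# Toy subsumption "reasoner": derive all subclass pairs entailed by
# transitivity from a set of declared axioms (pairs mean sub ⊑ super).
declared = {("D", "C"), ("C", "B"), ("B", "A"), ("E", "A"), ("F", "E")}

def entailed_subclasses(axioms):
    """Return the transitive closure of the declared subclass pairs."""
    closure = set(axioms)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(closure):
            for (y2, z) in list(closure):
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure

inferred = entailed_subclasses(declared)
print(("D", "B") in inferred)  # → True: D ⊑ B comes "for free"
```

With thousands of classes, exactly these implicit subsumptions (and any contradictions) are what an automated reasoner surfaces for quality control.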
Figure 18 Limitations of Conceptual Data Models in Theory or Practice
Figure 19 Solving Limitations of Conceptual Data Models with Ontologies
Figure 20 Another Rendering of an Ontology & Behind the GUI
Figure 21 “… and Underlying that Serialisation”
Figure 22 “… and Underlying that Serialisation” pt.2
Figure 23 A note on Automated Reasoning – Illustration
Data Integration and Ontology Applications
Ontologies excel at data integration by providing common vocabularies and constraints that bridge multiple systems. Originally envisioned in the 1990s, ontologies sit atop various conceptual models—databases, object-oriented applications in Java or C++, and others—with each model having its own semantics. Rather than creating point-to-point connections between every pair of systems, the ontology provides a shared semantic layer. For instance, when one database uses “bloem” (Dutch for flower) and another uses “flower,” linking both to the ontology’s flower concept establishes that they reference the same thing, enabling seamless integration.
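The shared-semantic-layer idea can be sketched as a mapping from each source's local terms to ontology concepts: two local terms co-refer exactly when they map to the same concept, so no pairwise source-to-source mappings are needed. Source and concept names here are made up for illustration.

```python
# Each (source, local_term) pair is mapped once to a shared ontology
# concept, instead of mapping every source to every other source.
ontology_mapping = {
    ("inventory_nl", "bloem"):  "Flower",  # Dutch-language database
    ("inventory_en", "flower"): "Flower",  # English-language database
    ("inventory_en", "tree"):   "Tree",
}

def same_concept(src1, term1, src2, term2):
    """Two local terms co-refer iff both map to the same ontology concept."""
    c1 = ontology_mapping.get((src1, term1))
    c2 = ontology_mapping.get((src2, term2))
    return c1 is not None and c1 == c2

print(same_concept("inventory_nl", "bloem", "inventory_en", "flower"))  # → True
print(same_concept("inventory_nl", "bloem", "inventory_en", "tree"))    # → False
```

With n sources, this needs n mappings to the ontology rather than n·(n−1)/2 point-to-point translations, which is the practical payoff of the shared semantic layer.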
Contemporary applications extend far beyond the original data-integration purposes. Ontologies now support recommender systems, natural language processing, textbook enhancement, time savings in research, energy-optimised building control systems, and digital humanities projects. IBM’s Watson uses several ontologies to integrate different modules within its complex system architecture.
The current landscape features sophisticated orchestration in which domain ontologies connect to top-level ontologies, with components extracted or linked to support the development of conceptual data models for specific applications. Ontology-based data access systems use ontology components directly for conceptual queries. Intelligent textbook systems leverage ontologies to annotate text, automatically generate questions, and mark student responses.
Maria’s rainfall example demonstrates increased precision: when hydrologists reviewed the axiom “rainfall causes runoff,” they clarified that this isn’t necessarily true—rain may be absorbed entirely by soil. The ontology also helped distinguish whether “rainfall” refers to an event (a shower occurring) or to an amount of matter (the quantity of water that fell), a crucial semantic distinction.
Figure 24 Original Idea of an Ontology’s Use
Figure 25 Why Ontologies?
Comparative Analysis: Features, Tasks, and Trade-offs
Maria presents comprehensive comparisons to guide model selection, emphasising that the “best” modelling approach depends entirely on context. Feature-based comparison examines models across multiple dimensions: purpose, precision level, expressiveness, computational support, tool availability, learning curve, and standardisation. Mind maps excel at basic topic structuring with low precision but easy creation.
Domain-specific biological models offer moderate precision with systematic notation, but their incompatible notations and tools proliferate. Conceptual data models offer higher precision for software development, with good tool support; however, they remain application specific. Ontologies deliver the highest precision and expressiveness, with robust computational support and quality-control mechanisms, though they require greater expertise.
Task-based comparison proves equally crucial for professional practice. For initial brainstorming, mind maps suffice perfectly. Documenting domain-specific scientific knowledge benefits from specialised domain models. Designing specific database applications calls for conceptual data models. Cross-application knowledge sharing and data integration demands ontologies.
Professional competence means recognising these distinctions rather than defaulting to familiar approaches regardless of context. When Maria analysed whether mind maps effectively support learning textbook content, she found conceptual data models offered the best trade-off between modelling ease and required expressiveness for undergraduate and postgraduate material. However, if critical inquiry and analysis—higher levels of Bloom’s taxonomy—are the goal, then ontologies are necessary despite their complexity. The key insight: match tool sophistication to task requirements rather than over-engineering or under-delivering.
Figure 26 Why Ontologies? pt.2
Figure 27 Orchestration of Ontologies and Applications
Figure 28 Rainfall: Sample Sketch and some Axioms for an Ontology
Figure 29 Limitations of Ontologies?
Figure 30 Feature-based Comparison
Figure 31 Task-based Comparison
Figure 32 Model Quality, for All Types
Model Quality and Bias Considerations
Beyond selecting appropriate modelling languages, professionals must grapple with deeper questions about model quality and potential bias. Syntax checking proves technically straightforward to implement, though not all tools provide this basic capability. However, answering the question “what makes a good model?” is far more challenging. Scientific experiments with human participants face inherent limitations—small sample sizes, numerous confounding factors, and difficulty controlling variables—thereby limiting the extent to which model quality can be assessed.
The most comprehensive theories, methods, and techniques for quality assurance exist for ontologies, with only some ported back to conceptual data models and virtually none available for biological models or mind maps. Maria raises the critical but often overlooked issue of bias in declarative models. While bias in data-driven AI systems receives considerable attention, declarative models aren’t immune to this problem.
Bias enters through conscious or unconscious modelling decisions during manual development. Property manipulation involves selecting which attributes to include or exclude—European air-conditioning calculations typically include equipment, insulation, and ceiling height, whereas US estimates may not, thereby affecting system recommendations. Setting permissible value ranges has consequential effects: redefining hypertension thresholds immediately changes the number of people who qualify for medication, benefiting pharmaceutical companies but affecting insurers.
Similarly, redefining alcohol use disorder criteria suddenly expanded the diagnosed population. Aggregation and granularity choices also embed perspectives—terrorism databases differ depending on whether modellers distinguish “government buildings” generally versus specific subtypes, thereby enabling or precluding distinctions some consider politically significant. Recognising and addressing bias requires critical reflection on modelling choices, diverse stakeholder involvement, and ongoing validation rather than treating models as objective truth.
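The consequences of a permissible-value-range choice can be made concrete with a toy threshold example: the same data yields a different diagnosed population purely because the model's cutoff moved. The readings and thresholds below are invented for illustration and carry no clinical meaning.

```python
# How a modelling decision (a permissible value range) changes outcomes:
# the data is fixed, only the modelled threshold moves.
systolic_readings = [118, 126, 131, 136, 142, 151, 128, 139]

def flagged(readings, threshold):
    """Count readings at or above the modelled cutoff."""
    return sum(1 for r in readings if r >= threshold)

print(flagged(systolic_readings, 140))  # → 2 under the stricter definition
print(flagged(systolic_readings, 130))  # → 5 under the broader definition
```

Nothing about the patients changed; the model's boundary did, which is precisely why such choices deserve critical scrutiny and diverse stakeholder review.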
Figure 33 Bias in Modelling?
Figure 34 Good Candidates for Incorporating Bias
Conclusions: Professional Modelling and the Resurgence of Declarative Approaches
Maria concludes with a powerful synthesis: there is no one-size-fits-all modelling approach, and professional competence requires understanding the entire spectrum of options and knowing which to apply for each situation. This nuanced perspective contrasts with dogmatic adherence to particular methodologies, recognising that different contexts demand different tools. Attempting to develop an ontology during active flooding would be irresponsible—brainstorming with mind maps enables the immediate solutions emergencies require, while ontology development demands careful thought and time.
An exciting trend emerges from the limitations of purely data-driven AI approaches: renewed interest in declarative modelling and neurosymbolic approaches that merge knowledge-based and statistical methods. Large Language Models and machine learning excel at pattern recognition but struggle with hallucinations, explainability, and the incorporation of domain expertise.
Declarative models, particularly ontologies, can ground AI systems by providing structured knowledge, enabling output verification, and ensuring alignment with domain understanding. This synergy works bidirectionally—ontologies guide and validate LLM outputs while LLMs assist ontology development by suggesting candidate concepts and relationships for expert validation.
During the final part of the session, attendees engaged in a Q&A and discussed using ontologies to constrain and validate ChatGPT and other generative AI tools, both guiding prompts and checking outputs against knowledge graphs to reduce hallucinations and improve accuracy. The combination of symbolic and statistical approaches is a promising path to building more reliable, transparent, and trustworthy intelligent systems. Abundant opportunities remain to develop better tools, methods, and quality-control mechanisms, ensuring the field continues to evolve to meet emerging challenges in data management and artificial intelligence.
Figure 35 Recap and Final Remarks
Figure 36 Acknowledgements
Figure 37 Closing Slide
If you would like to join the discussion, please visit our community platform, the Data Professional Expedition.
Additionally, if you would like to watch the edited video on our YouTube channel, please click here.
If you would like to be a guest speaker on a future webinar, kindly contact Debbie (social@modelwaresystems.com)
Don’t forget to join our exciting LinkedIn and Meetup data communities so you don’t miss out!