From Monolith to Contract-Driven Mesh with Corné Potgieter

Executive Summary

This webinar presents a roadmap for the future of data interaction, covering the distinction between monolithic and mesh architectures, data contracts, and the shift from operational to analytical insights. Corné Potgieter outlines implementation approaches for Data Mesh and data product architecture, emphasises making data AI-ready, and includes a live demonstration of AI-driven data access with Entropy Data. The webinar concludes with a Q&A on implementation challenges and strategic reflections for future data management.

Webinar Details

Title: From Monolith to Contract-Driven Mesh with Corné Potgieter
Date: 2026-02-02
Presenter: Corné Potgieter
Meetup Group: INs and OUTs of Data Modelling
Write-up Author: Howard Diesel

The Future of Data Interaction

Howard Diesel opens the webinar, welcomes participants, handles housekeeping, and introduces Corné Potgieter. Corné takes over the presentation to share a fictitious example of a CEO of an online toy store simply asking a large language model in plain English: “What are the most visited products on our website in the last two months?”

The LLM navigates the data catalogue, identifies the appropriate fact table (marketing leads metrics), and executes the query autonomously. It even interprets the results. Corné notes that this works reliably roughly nine times out of ten.

This demonstrates the ultimate objective: to make data products reliable, discoverable, and easily accessible to modern AI tools. The magic isn’t just the AI—it’s the robust data architecture underneath, built on data contracts and proper metadata management. This sets the stage for understanding why structured data architecture matters more than ever in the AI era, bridging the gap between business questions and technical execution through intelligent systems.

Figure 1 From Monolith to Contract-Driven Mesh

Figure 2 Prompt: Can you show me all the products that we sell online?

Figure 3 Claude presents the data in a table format

Meet Corné and Today’s Roadmap

Corné introduces himself as a freelance data architect with Sparkle, a Belgian consulting company, though based in Cape Town. He emphasises his pragmatic philosophy: striking a balance between purist ideals and practical implementation, while resisting absolute statements about architecture. Every scenario is case-by-case, depending on the organisational context. His expertise spans Data Vault, template-driven warehouse designs, and crucially, two long-running Data Mesh projects.

The agenda is comprehensive: foundational definitions (monolith, mesh, data contracts), a practical example of website analytics, implementation approaches with and without a Data Mesh, a deep dive into data contracts, and how everything fits into the modern AI landscape. Corné notes that this talk aligns well with the theme of making frameworks AI-ready, as much of the content addresses AI integration. The session will move from theory through practical examples to real-world challenges, providing both conceptual understanding and actionable insights from active implementations.

Figure 4 “A bit about me”

Figure 5 Agenda

Monolith vs. Mesh & Data Contracts Defined

The term “monolith” refers to a centralised data warehouse architecture in which a single team controls ingestion, modelling, transformations, and quality. Starting in IT, these systems grow into tightly coupled structures with lengthy change cycles, creating backlogs when the business demands new features. Domain context is lost as the gap widens between business requirements and IT implementation. Scaling becomes problematic as more teams bring AI workloads, from machine learning to AI agents, that require rapid data availability.

Data Mesh offers a contrast. It is a connected network of domain-owned data products built on four pillars: domain ownership (decentralisation), data as a product (product thinking), self-serve platforms (team autonomy), and federated governance (global policies, local implementation).

Data contracts evolved from API specifications used in software engineering for decades. The term gained popularity after Zhamak Dehghani’s 2019 introduction to Data Mesh, with pioneers such as Chad Sanderson and Andrew Jones promoting the concept around 2021-2023. The Open Data Contract Standard (maintained by the Bitol project) has emerged as the potential industry standard, recently adopted by Collibra. Contracts under the standard are YAML files with roughly ten components covering fundamentals, schemas, and, crucially, data quality rules that formalise agreements between data consumers and producers.

Figure 6 Monolith to Mesh: Why Traditional Data Warehousing = “A Monolith”

Figure 7 Data Mesh: Not a Monolith

Figure 8 What are Data Contracts?

Figure 9 Data Contracts in Data Product Architecture

From Operational Data to Analytical Insights

When customers accept cookies, tracking begins. The system captures devices (mobile, laptop), browsers (Chrome, Edge), visitors (unique individuals), sessions (each visit), events (every action, including mouse hovers), and products (when viewing specific pages). A visitor can have multiple sessions, each generating numerous events. The marketing domain needs to track “marketing leads”—visitors from Google Ads who complete actions such as form submissions. The challenge is transforming this high-frequency operational data into analytical insights. In traditional Medallion Architecture, the Bronze layer stores raw data: nested JSON from Google Analytics and flat CSV files from Adobe Analytics.
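The tracking hierarchy described above (a visitor has many sessions, each session has many events) and the marketing-leads rule can be sketched in a few lines of Python. This is only an illustration: the class names, the "google_ads" and "form_submit" labels, and the choice to require the form submission inside the same Google Ads session are all assumptions, not details given in the talk.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    name: str                          # hypothetical event names, e.g. "page_view"
    product_id: Optional[str] = None   # set only when a product page is viewed


@dataclass
class Session:
    source: str                        # hypothetical traffic-source label
    events: list = field(default_factory=list)


@dataclass
class Visitor:
    visitor_id: str
    sessions: list = field(default_factory=list)


def is_marketing_lead(visitor: Visitor) -> bool:
    """Assumed rule: a Google Ads session that contains a form submission."""
    return any(
        session.source == "google_ads"
        and any(event.name == "form_submit" for event in session.events)
        for session in visitor.sessions
    )


visitors = [
    Visitor("v1", [Session("google_ads", [Event("page_view", "p1"), Event("form_submit")])]),
    Visitor("v2", [Session("organic", [Event("form_submit")])]),
    Visitor("v3", [Session("google_ads", [Event("page_view", "p2")])]),
]
leads = [v.visitor_id for v in visitors if is_marketing_lead(v)]
print(leads)
```

Only the first visitor satisfies both conditions; the other two fail on traffic source and on the missing form submission respectively.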

The Silver layer combines these sources into cleaned, modelled, source-agnostic views using Data Vault, Third Normal Form, or custom approaches. The Gold layer publishes curated dimensional models (fact and dimension tables) for Power BI reports. While effective, the central team becomes a bottleneck because all domain requirements flow through a single group, creating the scaling problems characteristic of monolithic approaches.
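As a rough sketch of the Bronze, Silver, and Gold flow just described, the Python below flattens a nested JSON payload and a flat CSV payload into one source-agnostic shape and then aggregates page views per product. Both payload layouts and all column names are invented for illustration; real Google Analytics and Adobe Analytics exports look different.

```python
import csv
import io
import json
from collections import Counter

# Bronze: raw payloads kept as received. Both layouts are invented for this sketch.
bronze_google_json = json.dumps([
    {"user": {"id": "v1"}, "event": {"type": "page_view", "product": {"id": "p1"}}},
    {"user": {"id": "v2"}, "event": {"type": "page_view", "product": {"id": "p1"}}},
])
bronze_adobe_csv = "visitor_id,event_type,product_id\nv3,page_view,p2\nv4,page_view,p1\n"


def silver_from_google(raw):
    """Flatten the nested JSON into the shared, source-agnostic shape."""
    return [
        {
            "visitor_id": rec["user"]["id"],
            "event_type": rec["event"]["type"],
            "product_id": rec["event"]["product"]["id"],
            "source_system": "google",
        }
        for rec in json.loads(raw)
    ]


def silver_from_adobe(raw):
    """Read the flat CSV into the same shape."""
    return [
        {**row, "source_system": "adobe"}
        for row in csv.DictReader(io.StringIO(raw))
    ]


def gold_product_views(silver_rows):
    """Aggregate page views per product, most visited first."""
    counts = Counter(
        row["product_id"] for row in silver_rows if row["event_type"] == "page_view"
    )
    return counts.most_common()


silver = silver_from_google(bronze_google_json) + silver_from_adobe(bronze_adobe_csv)
gold = gold_product_views(silver)
print(gold)
```

In the monolithic setup all three functions would be owned and changed by the same central team, which is where the bottleneck comes from.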

Figure 10 The Example: Analysing website Traffic

Figure 11 Example: Website Analytics

Figure 12 Marketing Domain needs Marketing Leads

Figure 13 Marketing Domain needs Marketing Leads pt.2

Figure 14 Website Analytics: using a Medallion architecture

Figure 15 Website Analytics: using a Medallion architecture pt.2

Figure 16 Website Analytics: using a Medallion architecture pt.3

Data Mesh Implementation Approaches

Corné humorously notes how Data Mesh decisions often happen: management changes, perhaps the CDO was convinced at a conference, and suddenly, “we’re moving to Data Mesh.” While he believes there are legitimate benefits, it’s not for everyone and depends heavily on organisational culture. Data professionals often inherit these decisions and must work within existing constraints.

Two implementation patterns emerge. Option one builds foundational data products—reusable building blocks like “Website User Behaviour” (combining Google and Adobe data) and “Product Catalogue.” Consumer data products (Marketing Leads Metrics) then use these rather than rebuilding logic. This offers reusability and consistency but faces a critical challenge: ownership. Business domains resist owning foundational products because they don’t want to manage CI/CD pipelines and version control—precisely the work IT was originally created to do.

Option two allows consumers to rebuild from raw materials, accepting duplication to ensure autonomy. Most debates aren’t about architectural superiority but budget and resources. Teams often choose siloed products because they are faster and maintain autonomy, even though foundational products would be architecturally superior. The path of least resistance leads to decentralised chaos without proper governance and incentives.

Figure 17 But then … it all changes

Figure 18 Website Analytics – in a Data Mesh

Data Product Architecture & Contracts

Using a detailed diagram, Corné explains that domains own multiple data products, each with a type: source-aligned (raw, one-to-one with sources), foundational (reusable building blocks), or consumer-aligned (built for specific end users). Data products have input ports (how data enters) and output ports (how data is exposed). Crucially, data contracts govern output ports, not entire products.

For example, “Website User Behaviour” (owned by Customer Experience domain) has two input ports (Google Analytics API, Adobe raw feed) and three output ports (Snowflake tables, API endpoint, Iceberg tables)—each governed by a data contract. Under the Open Data Contract Standard, contracts include fundamentals (name, owner, description), complete schema specifications, and a data quality section where producers explicitly state the standards they’ll maintain.
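The structure in this example can be sketched as plain Python data classes: one product, two input ports, three output ports, and one contract per output port. The field names are illustrative and are not taken from the Open Data Contract Standard itself; the schema columns and quality rules are made up.

```python
from dataclasses import dataclass, field


@dataclass
class DataContract:
    fundamentals: dict   # name, owner, description
    schema: dict         # column name -> logical type
    quality: list        # rules the producer commits to maintaining


@dataclass
class OutputPort:
    name: str
    contract: DataContract   # the contract governs this port, not the whole product


@dataclass
class DataProduct:
    name: str
    domain: str
    kind: str                # "source-aligned", "foundational" or "consumer-aligned"
    input_ports: list = field(default_factory=list)
    output_ports: list = field(default_factory=list)


def contract_for(port_name):
    """Build an illustrative contract; every value below is made up."""
    return DataContract(
        fundamentals={
            "name": f"website_user_behaviour.{port_name}",
            "owner": "Customer Experience",
            "description": "Web events combined from Google and Adobe sources",
        },
        schema={"visitor_id": "string", "session_id": "string", "event_type": "string"},
        quality=["visitor_id is never null", "data is no more than 24 hours old"],
    )


website_user_behaviour = DataProduct(
    name="Website User Behaviour",
    domain="Customer Experience",
    kind="foundational",
    input_ports=["Google Analytics API", "Adobe raw feed"],
    output_ports=[
        OutputPort(name, contract_for(name))
        for name in ("snowflake_tables", "api_endpoint", "iceberg_tables")
    ],
)

for port in website_user_behaviour.output_ports:
    print(port.name, "->", port.contract.fundamentals["name"])
```

Attaching the contract to the port rather than to the product means each way of exposing the data can carry its own schema and guarantees.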

Because contracts follow open standards, CLI tools already exist for working with them, so no custom development is needed. External tools such as Soda, Monte Carlo, and Great Expectations can generate tests from contracts. Orchestration tools (dbt, SQLMesh) can verify incoming data against contracts before processing, failing pipelines if standards aren’t met. This transforms governance from documentation into automated, enforceable policies with clear responsibility chains for handling violations.
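The idea of failing a pipeline when incoming data breaks the contract can be shown with a small hand-rolled check. This is a sketch in plain Python, not the API of any of the tools named above; the contract layout, the rule names, and the function names are all hypothetical.

```python
class ContractViolation(Exception):
    """Raised when incoming data does not meet the contract."""


# Hypothetical contract fragment: expected column types plus two simple quality rules.
contract = {
    "schema": {"visitor_id": str, "event_type": str, "product_id": str},
    "quality": {"not_null": ["visitor_id", "event_type"], "min_row_count": 1},
}


def check_against_contract(rows, contract):
    """Raise ContractViolation on the first row that breaks the contract."""
    quality = contract["quality"]
    if len(rows) < quality["min_row_count"]:
        raise ContractViolation(
            f"expected at least {quality['min_row_count']} rows, got {len(rows)}"
        )
    for index, row in enumerate(rows):
        for column, expected_type in contract["schema"].items():
            if column not in row:
                raise ContractViolation(f"row {index}: missing column {column!r}")
            value = row[column]
            if value is None:
                if column in quality["not_null"]:
                    raise ContractViolation(f"row {index}: {column!r} must not be null")
                continue  # nulls are allowed in columns without a not-null rule
            if not isinstance(value, expected_type):
                raise ContractViolation(
                    f"row {index}: {column!r} should be {expected_type.__name__}"
                )


def run_pipeline(rows, contract):
    """Check the input first, so bad data stops the run before any processing."""
    check_against_contract(rows, contract)
    return len(rows)  # stand-in for the real transformation step


good_rows = [{"visitor_id": "v1", "event_type": "page_view", "product_id": None}]
bad_rows = [{"visitor_id": None, "event_type": "page_view", "product_id": "p1"}]

print(run_pipeline(good_rows, contract))
try:
    run_pipeline(bad_rows, contract)
except ContractViolation as error:
    print("pipeline failed:", error)
```

Raising an exception rather than logging a warning is the design choice that matters here: the consumer's pipeline stops, and the violation goes back to the accountable producer.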

Figure 19 Website Analytics – in a Data Mesh pt.2

Figure 20 Foundational Data Product: Website User Behaviour

Figure 21 Foundational Data Product: Website User Behaviour pt.2

Figure 22 Consumer Data Product: Marketing Leads Conversion Metrics

Figure 23 Foundational Data Product: Website User Behaviour – Implementation

Figure 24 Foundational Data Product: Website User Behaviour – Implementation pt.2

Figure 25 Consumer Data Product: Marketing Leads – implementation

Figure 26 Focus on contracts: Data Contracts

Figure 27 Data Contracts

Figure 28 Data Contracts pt.2

Figure 29 Data Contracts: 1. Fundamentals

Figure 30 Data Contracts: 2. Schema

Figure 31 Data Contracts: 3. Data Quality

Making Data AI-Ready

Corné acknowledges uncertainty about AI’s evolution over the next 5-10 years but emphasises that substantial impact is coming. The question isn’t whether to prepare but how to structure data reliably for LLMs and AI agents. Data contracts provide essential infrastructure. When LLMs interact with organisational data, they need rich context to avoid “hallucinations”: which data products exist, what they contain, who has access, quality guarantees, freshness, and governance policies (GDPR and POPIA compliance).

With structured contracts, LLMs can query catalogues, understand schemas, check access rights, and generate accurate, compliant queries. Without contracts, LLMs create plausible but potentially incorrect queries or access data inappropriately. Corné emphasises that data architecture roles won’t disappear—they’ll evolve to make systems AI-ready.

Robust frameworks are more important than ever because autonomous tool interactions will increase dramatically. The architecture of AI interaction will change significantly, but the structure and frameworks remain crucial for maintaining control. As autonomous agents become more common, rich asset metadata ensures sufficient context for reliable, governed data interactions at scale.

Figure 32 CLI Tool

Live Demo with Entropy Data – Seeing AI-Driven Data Access in Action

Corné demonstrates Entropy Data’s tool, showing how LLMs interact with data catalogues governed by contracts. He deliberately removes his access permissions first to demonstrate governance. When asked about the most-visited products, the LLM searches the catalogue, identifies relevant data products (website user behaviour, marketing lead metrics), and recognises that while the marketing lead fact table would answer the question, the user lacks access to it.

The system automatically generates an access request that explains the query purpose. The data product owner (Corné in this demo) receives the request and can approve it, maintaining human oversight. After approval, the LLM executes the query and determines appropriate joins between fact and dimension tables based on standardised naming conventions and metadata. It provides interpretation and identifies top performers.

Corné notes that during preparation, when facing Snowflake integration issues, the LLM identified the problem and directed him to relevant documentation. The demo uses Entropy Data’s remote MCP server with prebuilt tools that interact with metadata and governance policies, creating audit logs for GDPR/POPIA compliance and tracking who requests data, why, and for how long—crucial for scaling governance in AI-driven environments.
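The governance flow shown in the demo (no access, automatic request with a stated purpose, owner approval, query, audit trail) can be mimicked with a toy in-memory catalogue. This is an assumption-laden sketch and not Entropy Data’s API or the MCP protocol: the class, method names, user names, and log fields are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Catalogue:
    owners: dict                                   # data product -> owner
    grants: set = field(default_factory=set)       # (user, product) pairs with access
    pending: list = field(default_factory=list)    # open access requests
    audit_log: list = field(default_factory=list)  # who did what, when, and why

    def _log(self, action, user, product, detail=""):
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "user": user,
            "product": product,
            "detail": detail,
        })

    def query(self, user, product, purpose):
        """Run the query if access exists; otherwise file an access request."""
        if (user, product) not in self.grants:
            self.pending.append({"user": user, "product": product, "purpose": purpose})
            self._log("access_requested", user, product, purpose)
            return "access requested"
        self._log("query_executed", user, product, purpose)
        return "query executed"

    def approve(self, approver, user, product):
        """Only the data product owner may grant access (human oversight)."""
        if self.owners.get(product) != approver:
            raise PermissionError("only the data product owner can approve access")
        self.pending = [
            r for r in self.pending if (r["user"], r["product"]) != (user, product)
        ]
        self.grants.add((user, product))
        self._log("access_approved", approver, product, f"granted to {user}")


catalogue = Catalogue(owners={"marketing_leads_metrics": "owner"})
purpose = "find the most visited products"
first = catalogue.query("analyst", "marketing_leads_metrics", purpose)
catalogue.approve("owner", "analyst", "marketing_leads_metrics")
second = catalogue.query("analyst", "marketing_leads_metrics", purpose)
print(first, "->", second)
print([entry["action"] for entry in catalogue.audit_log])
```

Every step, including the denied first attempt, ends up in the audit log, which is the property that supports the compliance tracking described above.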

Figure 33 LLM: How LLMs interact with data catalogues governed by contracts

Figure 34 How LLMs interact with data catalogues governed by contracts pt.2

Figure 35 Prompt: Which data products are available for me to browse?

Figure 36 Prompt: Can you do a query on top visited products?

Figure 37 Requesting and Denying LLMs Access to data

Figure 38 Viewing the request and access of data by LLM

Q&A: Key Implementation Challenges

An attendee asks who truly owns contracts: governance teams or product owners? Corné clarifies that the producer’s data product owner is accountable. They may have architects and engineers implementing the details, but accountability rests with the owner. If owners don’t understand this responsibility, organisations will keep producing poor-quality products. Data products that function well have proactive owners who take part in daily standups and planning, not just monthly governance meetings.

Another question concerns how foundational products differ from silver-layer tables. Corné explains that they can be technically identical—the difference lies in governance: decentralised ownership (not centralised IT), explicit contracts that enable autonomous use, and designated, accountable individuals. Howard questions table ownership when multiple products need the same data. Corné emphasises: one table, one owner. Consumers access through output ports with contracts—never directly connecting to tables.

This represents a behavioural change from traditional warehousing. An attendee raises concerns about decommissioning while consumers still rely on the products. Corné admits this is challenging—like any contract, discussions must occur, possibly transferring ownership to consumers. The root issue: producers often don’t understand obligations when approving access for 20+ teams. Building incentives for foundational products remains a key Data Mesh challenge—there are no natural incentives.

Final Reflections, Lessons, Warnings, and the Path Forward

Corné reflects on recurring challenges: organisations wanting unified enterprise views get stuck on who builds and owns foundational products. The path of least resistance leads to siloed consumer products, with teams rebuilding logic from raw sources. Without a deliberate effort to maintain connected views (through semantic links, central architecture teams, or reusable foundational products), organisations end up with a plethora of siloed products within years.

Corné warns against false comparisons: pitting Data Mesh against Medallion Architecture or Data Vault misses the point. Medallion Architecture is a logical split into layers, Data Vault is a modelling technique, and Data Mesh is an organisational operating model about ownership and accountability. They’re complementary, not competing.

The critical takeaway: always following the path of least resistance creates a different mess, decentralised instead of centralised. Success requires active architectural governance, incentive structures, and cultural commitment to shared standards. Howard then closes the webinar and thanks Corné for covering substantial ground, from theory to practical implementation challenges.

If you would like to watch the edited video on our YouTube channel, please click here.
