Context is Everything with Remco Broekmans

Key Takeaways

  • Context is fundamental but variable: Data needs context; different stakeholders prioritise different attributes, like logistics focusing on dimensions and sales on pricing.
  • Structural separation ensures architectural agility: Separate core business concepts from their contexts for a flexible and resilient database structure.
  • Leverage ensemble modelling patterns: Core business identifiers are centralised in hubs, while links establish relationships and satellites isolate contextual data.
  • Avoid overly unified, complex directories: Organisations should maintain distinct contact lists for suppliers, employees, and customers to enhance efficiency.
  • Contextual categories are naturally bounded: For structured data like addresses, create distinct categories to simplify analysis and contextual descriptions.
  • Adhere to the cognitive Rule of Seven: Psychological research shows humans effectively process three to seven attributes; data models should reflect this.
  • Engineer purpose-built data marts: Architects should create specialised data marts for distinct user groups to enhance data accessibility and efficiency.
  • Exercise restraint with historical tracking mechanisms: Complex historical tracking is often excessive; users usually need a simple retrospective view of transactional data rather than an exhaustive history of every change.

Webinar Details

Title: Context is Everything with Remco Broekmans
Date: 09 May 2024
Presenter: Remco Broekmans
Meetup Group: INs & OUTs of Data Modelling
Write-up Author: Howard Diesel

How does Context Influence Data Quality Assessments?

Initiating international data maturity assessments provides a foundational benchmark for enterprise data management. However, conforming to a standard does not by itself guarantee data quality. This distinction leads into the broader architectural theme of the session: the paramount importance of context.

To illustrate this, the presentation utilises an analogy involving a Smurf cartoon, demonstrating that terminology devoid of proper context is rendered meaningless. In data architecture, analysing raw information without a comprehensive understanding of its surrounding environment similarly yields ambiguous results.

Figure 1 Welcome to the International Data Maturity Assessment

Figure 2 Context is Everything

How does Context Affect the Value of Items?

The concept of context can be elucidated using a box of Finnish “Salmiakki” candy. Observers can readily identify its descriptive contextual attributes: a red container, vegan composition, a weight of forty grams, and specific manufacturer identification codes. Crucially, while the core physical object remains immutable, its contextual attributes are subject to temporal fluctuations.

For instance, retail prices may rise, or the inventory status may change to out of stock. Furthermore, a fundamental principle in data modelling is that not every piece of context is relevant to every stakeholder at the same time. Logistics personnel prioritise physical dimensions and weight for shipping calculations, whereas sales departments emphasise retail pricing.
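The idea that each stakeholder sees only part of the context can be sketched in a few lines of Python. This is a minimal illustration, not something shown in the webinar: the `product_id`, the price value, and the `CONTEXT_BY_ROLE` mapping are all assumptions made for the example.

```python
# Hypothetical product record: the core identifier plus all known context.
candy = {
    "product_id": "SALMIAKKI-40G",  # assumed identifier, not from the talk
    "colour": "red",
    "vegan": True,
    "weight_g": 40,
    "retail_price_eur": 1.95,       # illustrative value only
    "in_stock": True,
}

# Assumed mapping: each stakeholder group cares about a different subset.
CONTEXT_BY_ROLE = {
    "logistics": ("weight_g",),
    "sales": ("retail_price_eur", "in_stock"),
}


def context_for(role, record):
    """Return the core identifier plus only the attributes relevant to `role`."""
    wanted = ("product_id",) + CONTEXT_BY_ROLE[role]
    return {name: record[name] for name in wanted}


logistics_view = context_for("logistics", candy)
sales_view = context_for("sales", candy)
```

The core identifier travels with every view, while the descriptive attributes vary by audience.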

Figure 3 About the Speaker

Figure 4 What about Context

Figure 5 What about Context pt.2

How does Agile Architecture Separate Core Business Concepts?

In agile data architecture, core business concepts—categorised as events, persons, things, or places—must be distinctly separated from their descriptive context and relational data. This architectural necessity is likened to the deconstruction of a culinary item, such as a burrito. In traditional third normal form relational modelling, an entity’s core identifiers and contextual descriptors are aggregated into a single table.

Consequently, executing structural alterations becomes highly inefficient, much like extracting a single ingredient from a tightly rolled tortilla. By deconstructing the data model and structurally isolating the core identifiers from the changing contextual attributes, database architects achieve a highly agile system that accommodates modifications without disrupting the foundational architecture.

Figure 6 What we Model

Figure 7 Deconstructed Burrito … Why we Separate

Figure 8 Deconstructed Burrito … Why we Separate pt.2

What is Structural Separation in Ensemble Modelling?

To implement this structural separation, ensemble modelling methodologies utilise distinct architectural components. The unchanging core business identifier is centralised within a “hub,” while relationships connecting these concepts are established through “links”. Both hubs and links represent immutable structural information. Conversely, all contextual and historical data—information highly susceptible to alteration—is exclusively isolated within a “satellite”.

Should new descriptive attributes emerge, architects can simply append new satellites without restructuring the core hub. The fundamental differentiator among various ensemble modelling patterns, such as Data Vault, Anchor modelling, and Focal Point modelling, resides primarily in their respective protocols for structurally isolating and managing these contextual attributes.
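As a rough sketch of the hub, link, and satellite split, the following uses Python's built-in `sqlite3` module with an in-memory database. The table and column names are hypothetical, and the structure is deliberately simplified: a production Data Vault would typically add hash keys, load metadata, and record sources, none of which are shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hubs hold only the unchanging business identifier.
    CREATE TABLE hub_customer (customer_key TEXT PRIMARY KEY);
    CREATE TABLE hub_product  (product_key  TEXT PRIMARY KEY);

    -- A link records the relationship between two hubs.
    CREATE TABLE link_customer_product (
        customer_key TEXT NOT NULL REFERENCES hub_customer(customer_key),
        product_key  TEXT NOT NULL REFERENCES hub_product(product_key),
        PRIMARY KEY (customer_key, product_key)
    );

    -- A satellite holds changeable context, versioned by load timestamp.
    CREATE TABLE sat_product_pricing (
        product_key  TEXT NOT NULL REFERENCES hub_product(product_key),
        load_ts      TEXT NOT NULL,
        retail_price REAL,
        PRIMARY KEY (product_key, load_ts)
    );
""")


def columns(table):
    """List the column names of a table (table names here are fixed literals)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


hub_columns_before = columns("hub_product")

# New context arrives: add a satellite instead of altering the hub.
conn.execute("""
    CREATE TABLE sat_product_logistics (
        product_key TEXT NOT NULL REFERENCES hub_product(product_key),
        load_ts     TEXT NOT NULL,
        weight_g    INTEGER,
        PRIMARY KEY (product_key, load_ts)
    )
""")

hub_columns_after = columns("hub_product")
```

Comparing `hub_columns_before` with `hub_columns_after` shows that the hub definition is untouched when the logistics context is added.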

Figure 9 How we Separate in Data Vault

Figure 10 Forms of Ensemble Modelling

What are the Challenges in Modelling Addresses?

Managing complex data context is not an exclusively modern architectural challenge. Historical commerce, such as banking and logistics operations in the late nineteenth century, functioned efficiently because organisations recorded only the precise context required for specific operations. Prior to digital databases, professionals utilised physical address books, often segregating contact information into distinct, purpose-built lists for suppliers, customers, and employees.

When an individual needed to contact a supplier, they consulted that specific, specialised registry. While contemporary systems can consolidate all entities into a single, unified directory tagged with multiple roles, this often generates operational inefficiencies. The processing effort required to load, filter, and interpret an amalgamation of heavily contextualised data often negates the perceived benefits of a unified view.

Figure 11 Ok, We Need to Store Context Separate but …

Figure 12 Let’s Look Back

Figure 13 Redundant Data

Is a Single Address Model Too Complex?

A frequent dilemma in data modelling involves the structuring of physical addresses. An address functions purely as context, detailing the location of a core business concept, such as an employee or a customer. Constructing a generic, abstract entity for all addresses and subsequently relating it to individuals introduces unnecessary complexity and ambiguity.

A more rigorous approach establishes direct relationships between specific contextual classifications and the core concept. This results in the creation of distinct data satellites for categories such as a “home address,” “work address,” “billing address,” or “delivery address”. Although generating multiple specialised satellites for a single customer may initially seem excessive, the spectrum of address typologies is inherently bounded. Analysts will rapidly exhaust the potential variations of contextual locational data.
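One possible shape for such address satellites is sketched below with Python's `sqlite3` module. The table names, columns, and sample rows are invented for illustration; only two of the address categories are shown to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hub_customer (customer_key TEXT PRIMARY KEY);

    -- One satellite per address category, each attached directly to the hub.
    CREATE TABLE sat_customer_home_address (
        customer_key TEXT NOT NULL REFERENCES hub_customer(customer_key),
        load_ts      TEXT NOT NULL,
        street       TEXT,
        city         TEXT,
        PRIMARY KEY (customer_key, load_ts)
    );
    CREATE TABLE sat_customer_billing_address (
        customer_key TEXT NOT NULL REFERENCES hub_customer(customer_key),
        load_ts      TEXT NOT NULL,
        street       TEXT,
        city         TEXT,
        PRIMARY KEY (customer_key, load_ts)
    );
""")

# Hypothetical sample data.
conn.execute("INSERT INTO hub_customer VALUES ('C-001')")
conn.execute(
    "INSERT INTO sat_customer_home_address VALUES (?, ?, ?, ?)",
    ("C-001", "2024-05-09", "1 Example Street", "Exampletown"),
)
conn.execute(
    "INSERT INTO sat_customer_billing_address VALUES (?, ?, ?, ?)",
    ("C-001", "2024-05-09", "PO Box 42", "Exampletown"),
)

# The billing address is read directly: no generic address entity to join
# through and no address-type code to filter on.
billing_street = conn.execute(
    "SELECT street FROM sat_customer_billing_address WHERE customer_key = ?",
    ("C-001",),
).fetchone()[0]
```

The trade-off is more tables in exchange for queries whose meaning is visible in the table name itself.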

Figure 14 Let’s Work on This

Figure 15 A & B

Figure 16 A, B & C

Figure 17 Splitting Context

Figure 18 Customer and Employee

How can Design Address Human Cognitive Limitations?

Architectural design must acknowledge the limitations of human cognition, a constraint codified in psychological research as the “Rule of Seven”. Decades of academic investigation indicate that individuals optimally process between three and seven attributes concurrently. Historically, stakeholders have demanded exhaustive reports featuring over two hundred columns to capture every granular detail of a transaction.

Such expansive formats exceed visual processing capacities and inevitably compel users to export the data into spreadsheet applications for practical analysis. To mitigate cognitive overload, data modelling should prioritise precision. Instead of engineering monolithic reporting structures, dimensional models should present a curated selection of three to seven highly relevant attributes specifically tailored to the analytical requirements of the end user.
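A delivery layer could enforce this limit mechanically. The helper below is a hypothetical sketch: `curated_view`, `MAX_ATTRIBUTES`, and the sample column names are assumptions for the example, and a real reporting tool would apply the same idea in its own way.

```python
MAX_ATTRIBUTES = 7  # upper end of the three-to-seven range cited in the talk


def curated_view(rows, attributes):
    """Project rows onto a short attribute list, refusing more than MAX_ATTRIBUTES."""
    if len(attributes) > MAX_ATTRIBUTES:
        raise ValueError(
            f"{len(attributes)} attributes requested; limit is {MAX_ATTRIBUTES}"
        )
    return [{name: row[name] for name in attributes} for row in rows]


# Hypothetical wide transaction record with over two hundred columns.
wide_row = {f"col_{i}": i for i in range(200)}
wide_row.update(
    {"order_date": "2024-05-09", "customer_segment": "retail", "order_total": 19.5}
)

marketing_report = curated_view(
    [wide_row], ["order_date", "customer_segment", "order_total"]
)
```

Requesting eight or more attributes raises a `ValueError`, which forces a conversation about what the report is actually for.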

Figure 19 Why Separation is not an Issue

Figure 20 Magic Number 7

Figure 21 Think about the Rule of 7 when Getting the Data out

How should Architects Design Specialised Data Marts?

Aligning with foundational dimensional modelling philosophies, architects are advised to construct specialised data marts catering to distinct user cohorts. For example, a marketing division analysing campaign efficacy requires fundamental metrics such as order dates and basic customer profiling, rendering comprehensive billing addresses superfluous.

Engineering concise, highly targeted dimensions yields data marts that are faster, more agile, and optimised for consumption. Furthermore, architects should exercise restraint when implementing complex historical tracking mechanisms, such as widespread use of Type 2 slowly changing dimensions. User demand for historical analysis typically manifests as a straightforward fifteen-month retrospective on transactional data, rather than an operational need to review every historical modification to an entity’s descriptive context.
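The following sketch combines both ideas: a marketing mart that carries only the attributes marketing needs, restricted to a fifteen-month window. The function names, the sample orders, and the choice to start the window on the first day of the month are assumptions made for illustration.

```python
from datetime import date


def months_back(day, months):
    """Return the first day of the month that lies `months` months before `day`."""
    index = day.year * 12 + (day.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


# Hypothetical order rows as they might arrive from the warehouse layer.
orders = [
    {"order_date": date(2022, 11, 3), "customer_segment": "retail", "billing_address": "PO Box 42"},
    {"order_date": date(2023, 6, 15), "customer_segment": "retail", "billing_address": "PO Box 42"},
    {"order_date": date(2024, 4, 30), "customer_segment": "wholesale", "billing_address": "PO Box 7"},
]


def build_marketing_mart(rows, as_of, months=15):
    """Keep recent orders only, and drop context marketing does not need."""
    cutoff = months_back(as_of, months)
    return [
        {"order_date": row["order_date"], "customer_segment": row["customer_segment"]}
        for row in rows
        if row["order_date"] >= cutoff
    ]


marketing_mart = build_marketing_mart(orders, as_of=date(2024, 5, 9))
```

The billing address never reaches the mart, and no change history is tracked: the window itself supplies the retrospective view.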

Figure 22 Some Improvements: Split for Marketing and Logistics

Figure 23 Or: Only Deliver what is Needed

Figure 24 Might lead to …

Figure 25 To Conclude

What Challenges Arise from Modelling Temporal Context?

During the concluding discussion, attendees scrutinised the enterprise application of the Rule of Seven. While certain attendees suggested that diverse stakeholder requirements often necessitate exceeding this cognitive threshold, the presenter emphasised that the paradigm is anchored in robust psychological research regarding human processing limitations.

Another inquiry addressed the complexities of modelling highly temporal context, specifically the shifting locational data of a hybrid employee who rotates between a corporate office and a remote environment. This challenge is resolved by recording each piece of context against a specific timestamp. Because an individual cannot occupy more than one geographical location at the same moment, logging contextual data sequentially maintains both historical accuracy and relational integrity.
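A minimal sketch of this timestamped approach, assuming the location rows for one employee are kept sorted by their effective timestamp, might look like the following. The sample history and the `location_at` helper are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical satellite rows for one employee, sorted by effective timestamp.
location_history = [
    (datetime(2024, 5, 6, 8, 0), "corporate office"),
    (datetime(2024, 5, 7, 8, 0), "remote"),
    (datetime(2024, 5, 9, 8, 0), "corporate office"),
]


def location_at(history, moment):
    """Return the location in effect at `moment`, or None if nothing is recorded yet."""
    timestamps = [ts for ts, _ in history]
    position = bisect_right(timestamps, moment)
    if position == 0:
        return None
    return history[position - 1][1]


midweek_location = location_at(location_history, datetime(2024, 5, 8, 12, 0))
```

Because each row takes effect at a single timestamp and stays in effect until the next row, there is exactly one answer for any moment in time.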

Figure 26 Upcoming Training

Figure 27 Links and Information

