Modeling Many to Many Relationships in a Star Schema

By Adam Gilmore. Updated on July 19, 2024

Many to Many relationships are defined in a Star Schema Dimensional model using a “Group” Dimension and a “Bridge” Fact. Take the example below:

In the above example, an insurance company receives medical claims. The medical claims are the business event represented by the Fact table. Each medical claim can have more than one associated diagnosis. These groups of associated diagnoses are modelled in a “Group” Dimension. The “Bridge” Fact then associates each “Group” with the individual diagnoses it contains.

In this scenario, the most challenging thing to build is the “Group”. It doesn’t exist in your source and must be derived. To do this, you need to select all the distinct combinations of diagnoses that exist on medical claims. The source of the target Fact will usually be the source of your “Group” as well. For example, let’s say you had these groups of diagnoses that are used on claims.

Then, there would be one row in the Group Dimension for each Group. In fact, the concatenation of diagnosis keys, in a consistent order, is an excellent candidate to use as the business key of the Dimension, i.e. DiagnosisGroup_Id. You may also want to include a concatenation of the diagnosis names as an attribute if you want to analyze the impact of certain diagnosis combinations. For example, by choosing the “C,D” “Group” Dimension member, an analyst can quickly get the claims measures for claims whose diagnoses are exactly C and D, and no others.
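The derivation described above can be sketched in a few lines of Python. This is only an illustration, not the output of any particular tool: the source rows, the variable names and the comma separator are assumptions made for the example, and the keys are sorted before concatenation so that the same combination always produces the same DiagnosisGroup_Id.

```python
from collections import defaultdict

# Hypothetical source rows: (claim_id, diagnosis_key). Illustrative data only.
claim_diagnoses = [
    (1, "A"), (1, "B"),
    (2, "D"), (2, "C"),   # same combination as claim 3, in a different order
    (3, "C"), (3, "D"),
    (4, "A"),
]

# Collect the set of diagnoses recorded on each claim.
diagnoses_by_claim = defaultdict(set)
for claim_id, diagnosis_key in claim_diagnoses:
    diagnoses_by_claim[claim_id].add(diagnosis_key)

def group_business_key(diagnosis_keys):
    """Concatenate the keys in sorted order to form the Group business key."""
    return ",".join(sorted(diagnosis_keys))

# The Fact needs to know which Group each claim belongs to.
claim_to_group = {
    claim_id: group_business_key(keys)
    for claim_id, keys in diagnoses_by_claim.items()
}

# Group Dimension: one row per distinct combination of diagnoses.
group_members = {
    group_business_key(keys): sorted(keys)
    for keys in diagnoses_by_claim.values()
}
group_dimension = sorted(group_members)

# Bridge: one row per (DiagnosisGroup_Id, individual diagnosis key).
bridge = sorted(
    (group_id, diagnosis_key)
    for group_id, members in group_members.items()
    for diagnosis_key in members
)
```

With this sample data, claims 2 and 3 share the single “C,D” Group member even though their diagnoses arrive in a different order in the source.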

The Bridge is simply another Fact, usually a Fact-less Fact, that just records the existence of individual diagnoses within the group. However, it is possible to put measures on this Bridge. One example is a ratio of the contribution of each individual Diagnosis in the Group to the overall value of the claim. A measure on a Bridge is usually some ratio between the individuals. Another example is the ratio of ownership of joint buyers on a Sale.
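As a rough sketch of a Bridge that carries a measure, the joint buyers case might look like the following. The buyer keys, ownership ratios, sale amounts and function names are all hypothetical, and the sketch assumes that the ratios within each Group are meant to add up to 1.

```python
from fractions import Fraction

# Hypothetical Bridge rows: (buyer_group_id, buyer_key, ownership_ratio).
buyer_bridge = [
    ("B1,B2", "B1", Fraction(3, 5)),
    ("B1,B2", "B2", Fraction(2, 5)),
    ("B3", "B3", Fraction(1)),
]

# Hypothetical Sale Fact rows: (sale_id, buyer_group_id, sale_amount).
sale_fact = [(100, "B1,B2", 500), (101, "B3", 200)]

def ratios_sum_to_one(bridge_rows):
    """Check the assumption that the ratios within every Group total 1."""
    totals = {}
    for group_id, _buyer_key, ratio in bridge_rows:
        totals[group_id] = totals.get(group_id, 0) + ratio
    return all(total == 1 for total in totals.values())

def allocate_by_buyer(fact_rows, bridge_rows):
    """Split each sale amount across its buyers using the Bridge ratio."""
    allocated = {}
    for _sale_id, group_id, amount in fact_rows:
        for bridge_group_id, buyer_key, ratio in bridge_rows:
            if bridge_group_id == group_id:
                allocated[buyer_key] = allocated.get(buyer_key, 0) + amount * ratio
    return allocated

allocated = allocate_by_buyer(sale_fact, buyer_bridge)
```

Because the ratios in each Group total 1, the allocated amounts add back up to the total of the Sale Fact. Fractions are used here only to keep the arithmetic exact in the example.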

A Group isn’t always derived, as shown in the example below:

In the example above, a Bank Account has joint Account Holders (Customers). There is no need to derive the Group as it exists as an entity in your source.

Another example might be if you had a Fact that represented the process of a Customer viewing a show on a streaming service. A show is related to many Actors, but the Group is readily available in the source system as a Cast entity that already groups the Actors. In this case, the “Cast” entity can be used as the source of the “Group” Dimension.

Why are Many to Many relationships modelled this way? One reason is that OLAP Cubes and other semantic layer software recognize this format and faithfully produce the correct aggregations when analyzing the Fact by the Individual Dimension. For example, if you want to sum the Medical Claim amounts for a given Diagnosis, you would select the Diagnosis from the Diagnosis Dimension, drag in your Claim Amount measure, and the semantic software would navigate the Bridge and Group relationship to produce the Total Claim Amount for that one Diagnosis.
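To make that navigation concrete, here is a minimal Python sketch of the lookup a semantic layer would effectively perform. It does not reproduce how any particular OLAP product works internally, and the fact rows, bridge rows and function name are made up for the example.

```python
# Hypothetical Claim Fact rows: (claim_id, diagnosis_group_id, claim_amount).
claim_fact = [
    (1, "A,B", 100),
    (2, "C,D", 250),
    (3, "C,D", 50),
    (4, "A", 75),
]

# Hypothetical Bridge rows: (diagnosis_group_id, diagnosis_key).
bridge = [
    ("A", "A"),
    ("A,B", "A"),
    ("A,B", "B"),
    ("C,D", "C"),
    ("C,D", "D"),
]

def total_claim_amount(diagnosis_key):
    """Sum Claim Amount for every claim whose Group contains the diagnosis."""
    # Step 1: navigate Diagnosis -> Bridge to find the Groups that contain it.
    group_ids = {g for g, d in bridge if d == diagnosis_key}
    # Step 2: navigate Group -> Fact and aggregate the measure.
    return sum(amount for _, g, amount in claim_fact if g in group_ids)

totals_by_diagnosis = {d: total_claim_amount(d) for d in ["A", "B", "C", "D"]}
grand_total = sum(amount for _, _, amount in claim_fact)
```

Note that a claim with several diagnoses contributes its full amount to each of them, so in this sample the per-diagnosis totals add up to more than the grand total of the Fact. The grand total should be taken from the Fact itself, not by adding up the per-diagnosis figures.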

Adam Gilmore

Adam Gilmore is a Data Architect with 20 years' experience in the Data field. He writes for dimodelo.com and develops courses covering Data Analysis to Data Architecture.
