Data Warehousing and Business Intelligence

28 July 2011

Bridge Table with Multiple Instances

Filed under: Data Warehousing — Vincent Rainardi @ 5:20 am
Tags:

A bridge table is a fact table connecting 2 dimensions. A bridge table is usually used to solve “dimensions with multi valued attributes” as I showed in my previous article, “Dimensions with Multi Valued Attributes” here. Below is an example of a bridge table:

create table dim_product (product_key int, name varchar(50))
create table dim_colour (colour_key int, colour varchar(50))
create table bridge_product_colour (product_key int, colour_key int)

insert into dim_product values (1,'Product1'), (2,'Product2')
insert into dim_colour values (1,'Red'), (2,'Yellow'), (3,'Green')
insert into bridge_product_colour values (1,1), (1,2), (2,2), (2,3)

select * from dim_product
select * from dim_colour
select * from bridge_product_colour

-- Q1. How many products contain colour Yellow?
select count(*) from bridge_product_colour F
join dim_colour C on F.colour_key = C.colour_key
where C.colour = 'Red'
Output: 1

-- Q2. How many product for each colour?
-- (total is more than 3 because 1 product can have multiple colours)
select C.colour, count(*)
from bridge_product_colour B
join dim_colour C on C.colour_key = B.colour_key
group by C.colour
Output:
Red 1
Green 1
Yellow 2

Above is an example of a simple bridge table. It only connects the product and colour.

But sometimes, we have “instances”. Each item can have multiple instances. Each instance has different colour scheme, for example instance 1 of product 1 is red and yellow, whereas instance 2 is yellow and green.

In this case we need to create an “instance degenerate dimension” column in the bridge table, and we need to set the colour key to 1, 2, 4, 8, etc, so that when we sum the colour key, we will get a unique number.

An example of this is film. A film has multiple genres. Film1 is classified as “Action, Drama”. But Film1 can also be classified as “Action, Drama, Comedy”.

The example below illustrates this case:

create table dim_film (film_key int, title varchar(50))
create table dim_genre (genre_key int, genre varchar(50))
create table bridge_film_genre (film_key int, instance_dd int, genre_key int)

insert into dim_film values (1,'Film1'), (2,'Film2')
insert into dim_genre values (1,'Action'), (2,'Drama'), (3,'Action, Drama'), (4,'Comedy'), (5,'Action, Comedy'), (6,'Drama, Comedy'), (7, 'Action, Drama, Comedy')
insert into bridge_film_genre values (1,1,1), (1,1,2), (1,2,1), (1,2,2), (1,2,4), (2,1,1),(2,2,1),(2,4,2)

select * from dim_film
select * from dim_genre
select * from bridge_film_genre
film_key instance_dd genre_key
1	1	1
1	1	2
1	2	1
1	2	2
1	2	4
2	1	1
2	2	1
2	4	2
-- Q1. How many films contain Drama genre?
select count( distinct film_key) from bridge_film_genre where genre_key = 2
Output: 2

-- Q2. List of films containing Drama genre:
select distinct F.title
from bridge_film_genre B
join dim_film F on F.film_key = B.film_key
join dim_genre G on G.genre_key = B.genre_key
where G.genre = 'Drama'
Output:
Film1
Film2

-- Q3. What are the genres for each instance of Film1?
select B.instance_dd, G.genre from
( select film_key, instance_dd, sum(genre_key) as genre_code from bridge_film_genre
  group by film_key, instance_dd
) B
join dim_genre G on G.genre_key = B.genre_code
join dim_film F on F.film_key = B.film_key
where F.title = 'Film1'
Output
1  Action, Drama
2  Action, Drama, Comedy

-- Q4. For Film1 how many are Action, Crime, and Comedy?
select count(*) from
( select film_key, instance_dd, sum(genre_key) as genre_code from bridge_film_genre
  group by film_key, instance_dd
) A
join dim_genre B on B.genre_key = A.genre_code
join dim_film C on C.film_key = A.film_key
where C.title = 'Film1' and genre = 'Action, Drama, Comedy'
Output: 1

Notice how value genre_key is set to 1,2,4, etc so that its sum is unique. But we created rows in the dim_genre for genre_key = 3,5,6, and 7 to so that we can join back to get the description.

I would like to thank Rohinton Tarapore for highlighting the problem to me.

Vincent Rainardi, 27th July 2011

Advertisements

3 Comments »

  1. […] both are dim key columns, connecting the 2 dimensions (see my previous article about Bridge Table here*). But there are some cases where we have the 3rd column, which points to the 3rd dim. And this 3rd […]

    Pingback by Bridge Table with Date « Data Warehousing and Business Intelligence — 30 July 2011 @ 3:06 am | Reply

  2. 1)What is the purpose of instance_dd column and how to populate it?

    For Example:-(1,1,1)

    First 1 denotes it corresponds to film 1 based on film_key
    Second 1 is not clear to me?
    Third 1 denotes it is action film based on genre_key

    2)Here manually you are populating bridge_film_genre table using insert statements.
    But in real data ware house project how you will populate the Bridge table?Is it based on dimension tables or what is the logic for it?

    Comment by Vamsi Krishna — 27 September 2016 @ 9:38 am | Reply

    • Hi Vamsi,
      1) The purpose of instance_dd column is to differentiate two classifications that exist for film1, i.e. this film is known/classified as a “action, drama” film and also known/classified as a “action, drama, comedy” film. The second “1” means that this is the first known classification. If the instance_dd value is 2, it means that this row is the second known classification for this film.
      2) It is based on the source system which stores which film is known/classified as what genre(s). Note that this data is time variant, i.e. it changes over time. For example, film1 on 5th Jan 2016 was known as “action, drama” and “action, drama, comedy” as above, but on 25th June 2016 it is known as “action, drama, comedy” only. In this instance on 25th June 2016 we expire the rows with instance_dd = 1.
      Hope this helps.

      Comment by Vincent Rainardi — 28 September 2016 @ 9:00 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: