Data Warehousing and Business Intelligence

8 February 2012

Parent SK in the Fact Table

Filed under: Data Warehousing — Vincent Rainardi @ 7:26 am
Tags:

This article is about customer dimension. For B2B (business to business), each customer may have a parent company. The parent in turn has another parent. I encountered this for the first time when I was working for a utility company. We tracked up to 3 levels up and put this grandparent SK in the fact table. This link was used widely throughout the BI reports.

In some industries like asset management and law they call it client dimension. In other industries it is called party dimension (insurance) or counterparty dimension (banking). In this article I’m going to call it customer dimension, a term which everybody is familiar with.

The idea is simple. We put the Parent SK in the fact table. This saves us from doing self-join on the customer dimension. Suppose we have a customer dimension like this:

And a fact table like this:

TLP means Top Level Parent. In practice, this is what the users are mostly interested in. They want to analyse the measures based on the TLP. They are less interested in the parents at intermediate level. They would like to organise the sales structure around the TLP. They may need to calculate the bonus based on the TLP. When I was designing the warehouse for a utility company, TLP was used a lot. A lot of BI analysis was done on TLP. Many measures were analysed against it. When I was working in investment banking industry, they also use TLP a lot. I believe any B2B organisation (not only companies but also government) have a common interest in TLP.

There are 2 design patterns. We can put the TLP information in a separate dimension. Or we can put the TLP information in the customer dimension. I prefer the first one. The reasons:

a)    Flexibility. There are attributes which are specifically for TLP. They are not applicable for leaf-level customers (here on I will call them LLC). For example, number of companies is only applicable for TLPs, it is not applicable for LLCs.

b)    Performance. The join to TLP dim gives faster query response times, i.e. if the query only requires TLP information, we do not need to touch the LLC dimension. LLC dim could be big whereas TLP dim is a lot smaller. 10 times smaller is common.

We always have to look from both sides of the coin. The disadvantage of this approach is that we need to add TLP_Key on many fact tables, in addition to the customer key. I say that this is a small price to pay for the huge benefit we have.

As always I welcome comments and questions at vrainardi@gmail.com

Vincent Rainardi, 8/2/2012

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: