Data Warehousing and Business Intelligence

12 July 2012

Delete All Rows in the Dimension Table

Filed under: Data Warehousing — Vincent Rainardi @ 5:32 pm
Tags:

I was very surprised to receive an email this morning from somebody saying that they delete the whole dimension table and insert all rows again, and they use surrogate keys. And they are having problem because the existing rows get new surrogate keys.

Whenever I heard “delete all rows in the dim table” a big alarm bell rang in my head. The word “delete” and “dim table” should not be in the same sentence!

This is the second time this month I encountered people doing this and I suspect there are many others out there doing the same thing, and having exactly the same problem. So I thought I have to write about this asap. So here it goes.

As always it is better to explain using an example. Say it’s a Member dimension. In the source system we have Membership table, like this:

And in the data warehouse we have Member dimension, like this:

There are 2 types of changes happening in the source Membership table:

  • Change to an existing row
  • New row created

Let’s make these 2 types of changes: change Ayane’s name to Eyane and create a new member: Agus Salim, like this:

When we truncate the Member dimension and reload all rows, this is what we get:

All members get new surrogate keys.
G48’s surrogate key changed from 1 to 3.
G49’s surrogate key changed from 2 to 4.
That is the issue about “And they are having problem because the existing rows get new surrogate keys” that I mentioned at the beginning of this article.

The problem with G48’s SK changed from 1 to 3 is: the fact table row for G48 is still referring to SK = 1. Now the fact table SKs don’t match with the dim table SKs, causing issues when you join them.

As I said above, we should not delete from the dim table. Instead, we should update changed rows, and insert new rows. After the update, the dim table should be like this:

So once again, the word “delete” and “dim table” should not be in the same sentence!

 

About these ads

4 Comments »

  1. Good and simple explanation.
    But there’s a little hick up in this sentence “As I said above, we should not delete from the fact table”

    Comment by Geoffrey — 15 August 2012 @ 6:37 am | Reply

  2. I had had cases where I deleted dim tables, but I know the consequences. By this post, can you envisage where truncating a dim table is evitable?

    Comment by abacusdotcom — 25 February 2014 @ 11:55 am | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Create a free website or blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 188 other followers

%d bloggers like this: