Data Warehousing and Business Intelligence

11 March 2010

A dimension with only one attribute

Filed under: Data Warehousing — Vincent Rainardi @ 7:10 pm
Tags:

If a dimension only has one attribute, is it worth putting the attribute in the fact table instead? Why should we bother to create a separate table for that dimension, with 2 columns: dim key and that single attribute? That will make the query slow right? Consider a simple 1 dim 1 fact scenario:
create dim1 (dim1key int, attr1 varchar(20))
create fact1 (dim1key int, measure1 money)

A typical star join query for that mart is:
select d.attr1, f.sum(measure1)
from fact1 f
inner join dim1 d on d.dim1key = f.dim1key
group by d.attr1

If we put the attribute in the fact table:
create fact1(attr1 varchar(20), measure1 money)
we could query the fact table directly, which means better performance:
select attr1, sum(measure1) from fact1 group by attr1

Well, there are 3 reasons why we should put that single attribute in its own dimension table:

  1. To keep the fact table slim (4 bytes int rather than 20 bytes varchar)
  2. If the value changes in the source system, we don’t have to update the big fact table
  3. Yes it’s only 1 attribute today, but in the future we could have another attribute

There is an exception: snapshot month. We use snapshot month column in a periodic snapshot fact table. Snapshot month in the form of an integer (201003 for March 2010) doesn’t violate the 3 point above.

  1. It is an integer, not char(6).
  2. The value never changes, March 2010 will still be March 2010 forever
  3. There will not be another attribute (snapshot year? there is no such things)

How about snapshot date (20100311)? And snapshot week (201014)? Same thing, in my opinion. They can go directly on the fact table without having a dim table.

If on the other hand it’s a varchar (or char because the length is constant), then it needs to go to its own dimension table. For example: ‘2010-03-11’ (snapshot date), ‘2010-14’ (snapshot week) and ‘2010-03’ (snapshot month).

There is one more thing I need to mention: junk dimension (Kimball’s term). We could (sometimes should) combine dimensions with one attribute into a junk dimension.

Advertisement

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 37 other followers