Data Warehousing and Business Intelligence

21 August 2015

Distinct Attributes Rows in Dimension Tables

Filed under: Analysis Services — Vincent Rainardi @ 6:05 pm

There are two cases where it is a good idea to create rows in dimension tables containing distinct attributes values. The first one is prospect rows in the customer dimension. The second one is unsecuritised rows in the security dimension.

Prospects (prospective clients) can either be put in the customer dimension, or in a separate table (prospect dimension). Sales fact only uses customers, so there is no problem there. But Sales Pipeline fact uses both prospects and clients, so if we separate them into different dimensions, we have a problem.

Sales Pipeline contains information about the status of prospective clients, until they become clients (on-boarded). If we have 2 separate dimensions (DimProspect and DimCustomer) then on FactSalesPipeline we have to create 2 dim keys.

Fact Sales Pipeline

The top row is a client, and the bottom row is a prospect. Because prospects and customers share many similar attributes, it makes sense to put prospects in the customer dimensions. The problem is, in some companies, prospects have no business keys, unlike customers which have business keys (Customer ID). This is because prospects are recorded in Sales Pipeline book/system without registering them in the main transaction system.

But in their marketing analysis or pipeline, the business needs include both prospects and customers. Because of this, it makes sense to put them in one dimension tables. And because prospects have no business keys, the only way to check if a row already exists in the dimension, is by doing select distinct on all attributes, and combine them in one table (customer dimension):

DimCustomer

The second case, unsecuritised rows in security dimension is a little bit different (for background see here), but the potential solution is the same: distinct attribute rows. Unsecurised positions will have to handled using junk dimension, i.e. we put the security attributes such as country, rating, sector into a junk dimension, because we don’t have a business key to identify them in the security dimension.

Multi-legged positions such as IRS, currency swap and CDS are tricky to handle, because different legs may have different security attributes, e.g. the legs could be in different currencies, or different risk countries, or different asset types.

By putting a combination of these attributes into security dimension and giving it a surrogate key (just like in a junk dimension) the fact table could refer to those attributes. Of course for these rows there is no SCD; if the combination is not there then simply create a new row.

The advantage of doing this (combining both real securities and attributes for unsecuritised rows in the security dimension) are:

  1. The fact table only deal with 1 dimension, rather than two, or 20 (i.e. if we put individual attributes in their own dimension, each with SK directly on the fact table).
  2. We can create hierarchies, so we can “go up and down” the hierarchies, meaning that we can aggregate the measure at different hierarchy levels.
Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: