City is an attribute of the customer dimension. Why don’t we put city in its own dimension, with separate SK on the fact table? Because we need to analyse city in the customer context.
City is also an attribute of the supplier dimension. Why don’t we put city in its own dimension, with separate SK on the fact table? Because we need to analyse city in the supplier context.
Why don’t we snowflake and put city in its own table, linked to both the customer and supplier dimension? Because if we snowflake the structure it is more complex; we need to do more joins. And because we need to analyse city with two separate contexts: customer and supplier.
It is best to leave city on both the customer dim and the supplier dim (and the broker dim, and the store dim, where applicable). They will form different geography hierarchies on each dimension. On the customer dim it could be city-county-country-area-region. On the supplier it could be city-country-region.
The organisation (data) for the geography hierarchy on each dimension could also be different. The store dimension may have smaller areas than the customer dim. On the customer dim Luxemburg may belong to Western Europe area. But on the store dim Luxemburg may belong to Benelux area.
This design gives flexibility which often required by many large corporations. Contradict to popular believes that corporations need a single, golden standard (ala MDM), I found that many corporations require more flexibility than a single standard. This is because corporations are essentially group of companies and each part of the organisation have different way of doing business.
The case of industry sector is also similar. It is a hierarchy which grouped the customers, suppliers and partner organisations. Telecom, retail, healthcare, government, manufacturing, transport, technology, etc. This is particularly important in the B2B business, and in particular for financial institutions. In the UK and US we have SIC, Standard Industrial Classification. For example: 200 different SICs are then grouped into 80 subclasses, and these subclasses are grouped into 40 classes, which is then grouped into 6 hyperclasses.
Customer rank is an attribute of a customer. We order all active customers based on their orders in the last 12 months. So customer with rank = 700 is the 700th biggest purchaser in the company. The “valuation” is done on monthly basis. So a customer which is #700 on 1st January could be #800 on 1st Feb. Should we put rank in the customer dimension as a normal attribute?
I believe the answer is no. We should be putting customer rank on its own fact table. Why? Because it changes every month, it would be sensible to create a fact table with 2 SKs: month key and customer key. So the grain is 1 row for each customer for each month. The measures that we can put in this fact table are: rank, order value (last 12 months), order value (last 6 months), number of orders (last 12 months), number of products purchased (last 12 months), number of distinct products purchased (last 12 months), avarage order value (last 12 months), etc.
Unlike the other measures I just mentioned, rank is not really a measure, I know. It is an attribute.But because it changes every month, we put it in a separate fact table. What if we only calculate the customer rank not once a month but once a quarter? We still need to put rank in a separate fact table, not in the customer dimension. What if we calculate it once a year? We still need to put rank in a separate fact table. What if we calculate the customer rank every day? Then we definitely need to put it on a daily snapshot fact table.
The above monthly customer fact table could store a lot of other useful measures and attributes. Basically any monthly evaluation or calculation that concerns customer can be stored in this fact table. Various different customer ratios for example.
So you see, even though rank is (by entity definition, dependency, etc) an attribute/property of a customer, during dimensional modelling we could choose not to put it in the customer dimension.
As always I welcome suggestions and questions at email@example.com
Vincent Rainardi, 13/2/2012