The convention amongst Kimball practitioners is to name the surrogate key columns as the dimension name + “_key”, i.e product_key, claim_key, risk_key. And this is done both in the fact table and in the dimension table. This way, when you create the BO universe, or SSAS cube, or MicroStrategy metadata/repository, it’s easier because we can say ‘link on the same name’
But recently I came across a different naming convention. In the fact table the surrogate key column is called PK_[Dim Name]_ID and in the dimension it is called FK_[Dim_Name]_ID. For example, the surrogate key of the product dimension is PK_Product_ID and in the fact table it is called FK_Product_ID. The benefit of this naming convention is so that in Toad, or in SSMS, when you type
select * from fact_daily_sales f
join dim_product dp on dp.
As soon as you type “.” the intellisense pops out displaying all the columns in dim_product and when you type “PK” there is only 1 column selected: PK_Product_ID. So it becomes:
select * from fact_daily_sales f
join dim_product dp on dp.pk_product_id = f.
And as soon as you type “.” the intellisense pops out displaying all the columns in fact_daily_sales table and when you type “FK” Toad/SSMS offers you all the surrogate keys in that fact table. Beautiful isn’t it?
Yes, but with 1 disadvantage: you can’t join on names between the fact table and the dimension table because the surrogate key columns have different names.
Name of Dimension and Fact Tables
Whilst I’m at it (naming), I might as well write about naming the dimension and fact tables. The convention amongst Kimball practitioners is to prefix the dimension table name with “dim_”, i.e. dim_[Dim Name]. And for the fact table the prefix is “fact_”, i.e. dim_[Fact Table Name]. For example: dim_product, dim_customer, fact_customer_profitability. This is because of the examples set by the Kimball Group, and by Microsoft SQL Server product team when they created Adventure Works data warehouse.
To differentiate transaction fact tables from snapshot fact tables, the data warehouse team in some companies added “_trans” or “_snapshot” suffix. And to differentiate accumulating snapshot fact table form periodic snapshot fact table, they add “_acc” suffix. For example: fact_sales_trans and fact_inventory_level_snapshot_acc. If it is a periodic snapshot, they mention the frequency, i.e. daily or monthly: fact_daily_sales_snapshot, fact_monthly_sales_snapshot.
In some companies, they shorten it further by naming the dimension table “D_” instead of “dim_” and the fact tables as “F_” instead of “fact_”. I personally prefer the D and F convention rather than “dim_” and “fact_”. Reason: we know that there are only 3 types of tables in the whole warehouse: fact, dimension and bridge.
Rather than using underscore (“_”) to separate words, I prefer to camel case it, to match Microsoft / .NET naming standard, i.e. fDailySales, dProduct, dCustomer. I prefer to type the full words rather than abbreviated them, i.e. instead of fProdOutput I type fProductionOutput.
I have yet come across data warehouse team who put the word “dim” and “fact” at the end, i.e. SalesFact, sales_fact, ProductDim or product_dim. Everyone seems to be putting the “dim” and “fact” in the front.
A data warehouse user once said to me: “Fact tables are fine, but dim tables? We want bright tables, not dim ones” 🙂