Data Warehousing and Data Science

11 January 2011

4 Sessions for SQLBits 8, Brighton, April 2011

Filed under: Event — Vincent Rainardi @ 2:54 pm

I have submitted 4 sessions for SQLBits 8, 7-9th April 2011, Brighton:

1. Testing Your Data Warehouse Using SSAS

I wrote a book about Building a Data Warehouse. One chapter of my book is about testing a data warehouse. Testing a data warehouse is an art. On the one hand we need to make sure that the figures in the warehouse tally with the source systems. On the other hand there are billions of numbers to test. And to add complexity sometimes the numbers from the source are modified, on their way to the warehouse.  One technique to test the data warehouse is to use SSAS. We build a cube and put the numbers from both the source and DW into this cube. Not only this technique saves a lot of time, but we can quickly pin point where the problems are.

2. Advanced Dimensional Modelling

This session is for anybody involved in the design of a data warehouse. Many of us know what dimensions and fact tables are. But as I explained in my blog, dimensional modelling is more than just dimensions and fact tables. This session is about advanced dimensional modelling topics such as Fact Table Primary Key, Vertical Fact Tables, Aggregate Fact Tables, SCD Type 6, Snapshotting Transaction Fact Tables, 1 or 2 Dimensions, Dealing with Currency Rates, When to Snowflake, Dimensions with Multi Valued Attributes, Transaction-Level Dimensions, Very Large Dimensions, A Dimension With Only 1 Attribute, Rapidly Changing Dimensions, Banding Dimension Rows, Stamping Dimension Rows and Real Time Fact Table. Prerequisites: You need have a basic knowledge of dimensional modelling and relational database design.

3. Partitioning Your Fact Tables

In data warehousing, partitioning is one of the top 3 things that can speed up the performance (the other two being summary tables and indexing, as I explained in my book). Partitioning your fact table can speed up the performance up to 10 times, both query and loading. Yet 9 out of 10 the fact tables are not partitioned. One of the reasons is because initially there were only 5-10 million rows. After a year, the fact table contains 1 billion rows and loading time degrades, from 2 hour to 5 hours. Queries slows down too, from 1 second to 20 seconds. In this session I’m going to explain only partitioning, but I’ll leave some time for Q&A if you want to ask about the other two. Prerequisites: Requires a little bit of data warehousing knowhow.

4. SQL Server Data Warehousing

8 years ago people raised their eyebrows when you said you’re building a data warehouse on SQL Server. In today analogy, it is probably similar to saying that you are building a data warehouse on MS Access. In those days DB2 and Oracle reigned the market.

Thanks to Microsoft, today SQL Server is a respectable platform for data warehousing. And for business intelligence. It’s completely reversed now. If we say we are building a DW-BI on DB2 or Oracle, they question “Why not SQL Server?” This session in a way celebrates this golden era of SQL Server DWBI, just like my book. As time is limited I’m going to talk about the DW side only, not the BI side (RS, IS, AS) e.g. merge, change data capture, star join query, bitmap filter, change tracking, minimally logged insert, resource governor, backup compression, data compression and indexed views.

For further info about SQLBits please look at SQLBits web site,

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Blog at

%d bloggers like this: