Data Warehousing and Data Science

18 December 2015

SQLBits XV Liverpool 4-7th May 2016

Filed under: Data Warehousing,Event,SQL Server — Vincent Rainardi @ 10:17 pm

I’ve posted 8 sessions for SQLBits XV in Liverpool. Hopefully one of them got chosen so I can to present it.

  1. Populating a Fact Table using SSIS
  2. Testing a SQL Server Data Warehouse
  3. Indexing Fact Tables in SQL Server
  4. SQL Server 2016 Data Warehousing & BI Features (deleted as it’s duplicate with another speaker)
  5. Using a SQL Server Data Warehouse for CRM
  6. Using a SQL Server Data Warehouse for BPM
  7. Loading a Dimension Table in SSIS
  8. T-SQL: Back to Basic

1. Populating a Fact Table using SSIS

In this session I will be doing a demo on my article “Populating a Fact Table using SSIS”: https://dwbi1.wordpress.com/2012/05/26/how-to-populate-a-fact-table-using-ssis-part1/ by going through it step by step on the screen. Giving the audience plenty of time to understand the mechanics and to ask questions.

When populating a fact table, people come across issues like these:

  • Where do I populate my fact table from?
  • How do I get the dimension keys to put into my fact table?
  • Where can I get the data for the measures columns?
  • With what do I populate the snapshot date column?
  • What is the primary key of my fact table?
  • The source table doesn’t have a primary key of the source table. What should I do?
  • I can’t find the row in the dimension table, what should I do?

As always, the best way to explain is by using an example. So in this session I’m going to do the following, and hopefully by the end of the session the above issues / questions in the audience’s mind will be answered.

  • Describe the background on the company and the data warehouse
  • Create the source tables and populate them
  • Create the dimension tables and populate them
  • Create the fact table (empty)
  • Build an SSIS package to populate the fact table, step by step

2. Testing a SQL Server Data Warehouse

The usual issue when we test a data warehouse is the sheer volume of data, which makes the normal testing method doesn’t work. Over the years, various ways have been developed to provide solutions to this problem, from a manual process to automated process.

In this session I will be demo-ing 3 approaches for testing SQL Server-based data warehouses:

  1. Using Manual Comparison
  2. Using Excel
  3. Using SSIS – SQL Server – SSAS

Along the way I will show the challenges, the mechanics, and the solutions.

Using Excel we will need to sort the data first. And then we use different formula for compare string columns, date column, logical column and numeric columns. We also need to incorporate the tolerance levels. Finally we can present the data quality for each column.

To verify the data in the data warehouse using the 3rd approach, first we need to match the rows to the source system. Then we verify the attribute columns. Then we verify the measure columns. SSIS facilitates the flow of data from both the source system and the warehouse into the test area, which is in SQL Server. A series of stored procedures then do the matching process in SQL Server, and compare the attribute columns and measure columns. The SSAS enables the testers to dissect the compared data, to find out which specific data areas are causing the issue.

3. Indexing Fact Tables in SQL Server

Index Fact Tables in SQL Server are different to Oracle or DB2 because of clustered index. Some people say that it is better to create a clustered index on the fact key, then non-clustered on each dimension key. Some say that it is better to use the clustered index on the snapshot date column. Of course the considerations are different between periodic snapshot fact tables, accumulative snapshot fact tables and transaction fact tables.

In this session I will go through the principles in indexing the 3 types of fact tables, including physical ordering, slimness of the clustered index, multi-column clustered index, a PK doesn’t have to be a clustered index, which dimension key column to index, when to include a column in an index, and of course partitioning, i.e. indexing partitioned fact tables. As always, it is better to explain by example than theory so I will give an example on each principles so that we can understand how it is applied in practice. For example: the performance comparison. I will also add my own “lessons learned”, i.e. mistakes I’ve made in the past so you can avoid making the same mistakes.

The title is “in SQL Server” because principles I will be explaining in this session are only applicable specifically to SQL Server data warehouses. It does not apply to Oracle, DB2 or Teradata data warehouses.

4. SQL Server 2016 Data Warehousing & BI Features

In this session I would like to go through the new features in SQL Server 2016 which are applicable to data warehousing and BI such as DB Engine: Stretch database, JSON, column-store indexes, memory-optimized tables, live query statistics, query store, temporal tables; SSAS: DBCC for SSAS, SSAS Extended event GUI, new DAX functions, tabular model partitions, roles in tabular; SSIS: custom logging in SSIS, Excel 2013 data sources, OData V4 data source, R in Execute SQL Task, HDFS connections, Auto Adjust Buffer Size, reusable control flow template; SSRS: Pin Report Items, project template for SSRS, PPTX format, PDF based printing, custom parameter pane, HMTL 5, Report Builder UI, Tree Map, Sunburst Charts, High DPI, Subscription; MDS: data compression at entity level, super user function, CRUD permissions, transaction log.

I may not be able to fit all of the above points into the session, but will try to cover as many as possible. It is essentially a session which goes through the new features in SQL 2016. Wherever possible I will try to demo the feature rather than talking about it on PowerPoint. Because by seeing we learn a lot more than by discussing them in theory. Also for each feature I will mention how they can be used in data warehousing or BI development. If a feature already exist in 2014, but it is enhanced in 2016 I will show only the improvement, and not going through the fundamentals of that feature.

5. Using a SQL Server Data Warehouse for CRM

This session is taken from my article “Using a Data Warehouse for CRM”, which I wrote with Amar Gaddam: https://dwbi1.wordpress.com/2010/01/28/using-data-warehouse-for-crm/

A data warehouse is not only for BI. It can also be used from Customer Relationship Management (CRM). In fact, a DW is the best platform for doing CRM. In this session I will show how to use a data warehouse built in SQL Server for doing core CRM activities such as: 1. Single Customer View, 2. Permission management, 3. Campaign segmentation, 4. Campaign Results, 5. Customer Analysis, 6. Personalisation, 7. Customer Insight, 8. Customer Loyalty Scheme.

If you don’t work in CRM and not familiar with the above concepts, don’t worry. I will explain them one by one during this sessions. Although it would be help a lot if you read my article above, before coming to this session.

For each point I will try to show the table design in SQL Server DW so we can all see how they are implemented, not just the theory. Due to time limit I may not be able to cover all the above points, but I promise I will try to cover as much as possible.

6. Using a SQL Server Data Warehouse for BPM

BPM = Business Performance Management. A Data Warehouse is not only used for BI. It is also used for CRM and BPM. In this session I will show how to use a Data Warehouse for BPM using Balanced Scorecard. Before the session I will have built a Data Warehouse on SQL Server, for BPM, and during the session I will show the design of this DW.

The Data Warehouse will contain Sales data, Financial data, Customer Service data, and Order Processing data. Each of this data will form part of the Balanced Scorecard. In addition to Fact and Dimension tables, a Data Warehouse used for BPM contains one additional area which stores the KPI scores. I will show how this area is designed, and how the KPI values and score are calculated.

As a take away, I hope the audience can learn how a DW is used outside BI, how the additional area are designed and built, and how it is all implemented on a SQL Server platform. I will also show how the SSIS packages that populate the Data Warehouse from the source system, and the SSRS reports which shows the KPIs, the Balanced Scorecard and Performance Scoring calculation.

7. Loading a Dimension Table in SSIS

We can load a dimension table in SSIS using a) SCD Transformation, b) Merge Join + Conditional Split, c) Change Data Capture, d) Merge command, and e) Upsert using Execute SQL. In this session I will be showing/demo-ing these 5 approaches on the screen one by one, then compare them in terms of efficiency/performance, code clarity/maintenance, and ease of build. It is based on my article: https://dwbi1.wordpress.com/2015/09/09/loading-a-dimension-table-using-ssis/

SCD Transformation and Merge Join + Conditional Split are both using row-by-row operation hence not efficient compared to Upsert. CDC is a mechanism to extract the data changes. To load a dimension table we need to read the CDC output table, and update or insert into the dimension table based on the _$Operation column. The Merge command is buggy, has concurrency issues, requires an index to support performance, and does the Insert twice.

In every data warehouse project, we need to load many dimension tables. So this is a fundamental knowledge to know, for those of us who uses SQL Server and SSIS for your warehouse.

8. T-SQL: Back to Basic

In SQL Server we usually look forward to working with the latest features, such as SQL 2016, Power BI, ML, Azure, etc. So much so that we often forget the basics such as creating/modifying a constraint, PK, FK, index, trigger, partitioned table, synonym; joins, correlated subquery, update from, output an SP into a table, cast, NULLIF, variables, temp tables, table variables, cross apply, while, case; string/date functions, row number, transaction, except, rank, find duplicate rows, etc.

This session is intended to be a refresher, i.e. we know all the above but we forget. We will go back to basic. I will cover 3 sections: a) creating database objects, b) database development, c) database administration. This session is based on my article: https://dwbi1.wordpress.com/2014/12/27/sql-server-scripts/. So this session will consists of many short SQL scripts (T-SQL) and only T-SQL, i.e. I won’t be using the GUI. I won’t be able to cover every single script in that article (there are over 250 in total!), but I will pick the important ones, and avoid the ones which are similar to the others.

29 June 2014

SQLBits 12 Telford

Filed under: Event — Vincent Rainardi @ 5:19 pm

The 12th SQLBits will be in Telford, 17th to 19th July.

On Thursday 17th July there will be 14 world class experts providing 11 one-day seminars in their specialised topics:

  1. Brent Ozar: Virtualization, SAN and Hardware for SQL Server
  2. Itzik Ben-Gan: Practical T-SQL – Efficient Solutions
  3. Jennifer Stirrup: The Data Analyst Toolkit: Microsoft Power BI & R
  4. Marco Russo: Creating a Power BI Solution in one day
  5. Brian Knight: SSIS Problem, Design, Solution
  6. Dejan Sarka: Advanced Data Modeling Topics
  7. Adam Jorgensen & John Welch: Real World Big Data – Deploy, Design and Manage Like a Pro
  8. Niko Neugebauer & Andreas Wolter: In-Memory Technologies in SQL Server 2014: CCI & XTP
  9. Allan Hirt: The A to Z of Availability Groups
  10. Dave Ballantyne & David Morrison: SQL Server Query Processing Internals
  11. Simon Sabin: SQL Server Performance Diagnosis and Query Tuning

On Friday 18th July there will be 13 DBA sessions, 10 Database Development sessions, and 12 BI sessions (plus 1 keynote speech). These 35 sessions will be conducted in 8 rooms i.e. 8 one-hour sessions running in parallel.

DBA:

  1. Denny Cherry: Using SQL Server Always On Availability Groups
  2. Brent Ozar: Watch Brent Tune Queries
  3. Klaus Aschenbrenner: Latches, Spinloc and Lock Free Data Structures
  4. Simon Sabin: Achieving PCI With SQL Server – Don’t be the next Sony
  5. Allan Hirt: Patch Management Strategies for SQL Server Deployments
  6. Ola Hallengren: Inside Ola Hallengren Maintenance Solution
  7. Hue Holleran: SAN eye for the SQL girl/guy
  8. Tim Ford: Tim’s Top Ten Templates
  9. Thomas LaRock: Cardinality Estimates in Microsoft SQL Server 2014
  10. Gail Shaw: Do’s and don’ts of database corruption
  11. Grant Fritchey: Monitoring Windows Azure SQL Server VMs and SQL Databases
  12. Gavin Payne: Designing Quality SQL Server 2014 Solutions
  13. Matan Yungman: Query Progress Tracking in SQL Server

Database Development:

  1. Dejan Sarka: Optimizing Temporal Queries
  2. Aaron Bertrand: T-SQL: Bad Habits and Best Practices
  3. Karen Lopez: Database  Design Contention Issues
  4. Benjamin Nevares: Dive into the Query Optimizer – Undocumented Insight
  5. Allan Michell: The secret to pain-free database deployments? Automation.
  6. Mladen Prajdic: Optimizing database access and diving into .Net SqlClient
  7. Christina E. Leo: Building Your T-SQL Toolkit: Window Function Fundamentals
  8. Andreas Wolter: “SQL Attack..ed” – SQL Server under attack: SQL Injection
  9. Itzik Ben-Gan: T-SQL Tips and Tricks
  10. Scott Klein: Patterns & Practices for Scaling Windows Azure SQL Database

BI:

  1. Benjamin Wright-Jones: Microsoft Analytics: the Next Wave
  2. Chris Webb: Power Query: Beyond the Basics
  3. Allan Mitchel: A Look at 3 Big Data Tools
  4. Bob Duffy: Optimizing Cube Processing
  5. Jenifer Stirrup: A One Hour Data Analyst Toolkit: using R and Power BI
  6. David Peter Hansen: SSIS Internals and Performance
  7. Marco Russo: DAX Patterns
  8. Allan Mitchell: Basket Analysis using BI Office and SSAS Tabular edition
  9. John Welch: Continuous Delivery for Data Warehouses and Marts
  10. Marco Russo: DAX Patterns
  11. Brian Knight: Performance Tuning SQL Server Integration Services
  12. Alberto Ferrari: Advanced Modelling with Analysis Services Tabular

On Saturday 19th July there will be 18 DBA sessions, 14 Database Development sessions and 16 BI sessions (plus 1 Career session), starting from 8:10 until 17:00.

DBA:

  1. David Peter Hansen: Integration Services (SSIS) for DBA
  2. Chris Adkin: Column Store Index and Batch Mode Scalability Deep Dive
  3. Richard Douglas: The Day After Tomorrow: Why You Need a Baseline
  4. Katherine Bean: Data Security in a Post Snowden World
  5. Dr Subramani Paravasivam: Advanced Reporting Technique and Managing Reports
  6. Allan Hirt: SQL Server Storage vNext: Welcome to the Future
  7. Grant Finchley: Getting Started Reading Execution Plans
  8. Rob Volk: Revenge: The SQL!
  9. Matan Yungman: Things You Can Find in Plan Cache
  10. Eduard Erwee: Hadoop: Big Data or Big Deal?
  11. Mark Broadbent: The Nicromonicon, SQL Server Book of the Dead
  12. Christian Bolton: Extending Your Data Centre to the Cloud with SQL Server 2014
  13. James Skipwith: SQL 2014 In-Memory Design Pattern #1: The Flaming Partition
  14. Brent Ozar: How to Pick SQL Server Hardware
  15. Scott Klein: Top 5 SQL Server 2014 Hybrid Features
  16. Andre Kamman: How to Write Professional PowerShell Scripts
  17. Thomas LaRock: Monitoring Databases in a Virtual Environment
  18. Neil Hambly: Effective Index Partitioning, Compression Strategy

Database Development:

  1. Benjamin Nevarez: Understanding Parameter Sniffing
  2. Itzik Ben-Gan: T-SQL Tips and Tricks
  3. Geoff Clark: T4 Templating with SSDT using SQL Server 2014
  4. Criag Ottley-Thistlethwaite: The SSDT Way or the Highway (sqlproj)
  5. Brent Ozar: How the SQL Server Engine Thinks
  6. Andrew Whettam: A Comedy of Error
  7. Denny Cherry: I’ve Got a SQL Database, Now What?
  8. Gail Shaw: Bad Plan! Sit!
  9. Klaus Aschenbrenner: The Dangerous Beauty of Bookmark Lookups
  10. Dave Ballantyne: Query Optimizer Internals: Traceflag fun
  11. Karen Lopez: Windows Azure SQL Database Design: Concepts and Trade-offs
  12. David Morrison: SQL Tips, Tricks and Misconceptions
  13. Steve Jones: Continuous Integration for Databases
  14. Bob Duffy: Migrating to the Cloud

BI:

  1. Jean-Pierre Riehl: Fasten Seatbelt and Look at the Data Steward
  2. James Rowland-Jones: Demonstrating PDW Integration with Hadoop & Polybase
  3. John Welch: Practical Unit Testing for SSIS Packages
  4. Carmel Gunn: The Irish Economic Crisis, Visualised with Power BI
  5. Mark Stacey: Is Your Data Big Enough for PDW
  6. Marco Russo: Power Query in Modern Corporate BI
  7. Peter ter Braake: Power BI, Zero to Expert
  8. Alberto Ferrari: Optimizing & Analyzing DAX Query Plan
  9. Niko Neugebauer: ETL Patterns with Clustered Columnstore Indexes
  10. Julie Koesmarno: Building Your Myth Busting Lab with Power BI
  11. Allan Mitchell: Introduction to BI Office
  12. Stephanie Locke: Intro to R
  13. Alex Whittles: MDX 101
  14. Adam Aspin: BI Tips and Tricks with SQL Server Reporting Services
  15. Gerhard Brueckl: Deep-Dive to Analysis Services Security
  16. Stacia Misner: Building a BI Performance Monitoring Solution

Career:

  1. Steve Jones: Branding Yourself for a Dream Job

Source: Retrieved from SQLBits XII Agenda on 29th June 2014 at 16:52

 

24 June 2014

SQL Server User Group Indonesia

Filed under: Event — Vincent Rainardi @ 7:44 am

I just found out today from Julie Koesmarno’s blog (link) that there is a SQL Server User Group in Indonesia. I’m from Indonesia (Surabaya) and it was a nice surprise to hear about this. They have a Yahoo Group: link, where they discussed all things SQL Server, from Configuration, SQL Programming to Data Warehousing. The moderator is Choirul Amri, link. SSUG Indonesia is listed on the SQL PASS Asia Pacific site (link), but the website it is pointing to is still not working: http://sqlserver-indo.org/.

The day after tomorrow, 26th June at 18:45 Jakarta time, Indonesian SSUG is having a gathering as Choirul Amri wrote. The agenda is:

18.15 – 19.00 Registration
19.00 – 19.45 Session 1: Azure Survival Guide for [On Premise] SQL Server DBA, by Choirul Amri, Senior Consultant – Microsoft Services
20.00 – 21.00 Session 2: Building Your Myth Busting Lab With Power BI, by Julie Koesmarno, SQL Server MVP
21.00 – 21.15 Wrap up and closing

27 December 2011

SQLBits 10 London

Filed under: Event — Vincent Rainardi @ 7:20 am

The 10th SQLBits conference will be on 29th to 31st March 2012 (Thursday to Saturday) in Novotel London West, London W6 8DR.

This time, the SQLBits will not be just a SQL Server Conference. It will also be the place where Microsoft will announce SQL Server 2012, which is known as Denali. See here [link] for new features of SQL Server 2012.

I have submitted a session for SQLBits 10, titled “Data warehousing bits from my experience” (link).

In the previous SQLBits sessions I’ve shared pretty much the theory of dimensional modelling. See my previous sessions here, here and here.

Enough with the theory now I’d like to share my observations and experience during various data warehousing projects I’ve been and from helping lots of people with issues on their DW projects who contacted me via my blog. And from consulting experience (sometimes between contracts I do consulting, 1 day here and there, advising clients on their DWs) It is very useful to learn what went wrong, how we could better it next time, etc. Be it design decision, testing methods, dimensional modelling, loading mechanism, or cube/BI there are various aspects that I’ve been through that I can share. Of course, there will be no company names, for client confidentiality reason. All that will be I presented will be real cases, but I will have to annonymise them. True that I can’t share any of their data, but I can share the principles and technique of how we addressed those issues and situations.

I take on board the feedback from previous session in Brighton that you don’t like me to “read the presentation”. You would like me to be shorter on the text on PowerPoint slides and tell you more about the slides on the microphone. OK, I will do that.

1 July 2011

SQLBits 9 Liverpool

Filed under: Event — Vincent Rainardi @ 5:33 am

The next SQLBits will be in Liverpool, on 29th September (Thursday) to 1st October (Saturday).

It will be at The Adelphi Hotel, Ranelagh Place, L3 5UL, very near Liverpool Central Rail station.

The sessions are here. The speakers are here.

Steve Jones will be presenting (if chosen). I was in contact with him when I was writing articles for SQL Server Central, of which he was (still is) the editor. SQL Server Central is the largest SQL Server forum in the world. He submitted 2 sessions: “Branding yourself for a dream job”, and “Preparation for disaster”.

Duncan Sutcliffe submitted 3 sessions: “Using  Data Capture in ETL Routines”, “Kerberos – All you need to know in 1 hour”, “Making Full Use of SharePoint for Business Intelligence”. That Kerberos session will be very interesting, as a lot of people implementing SSAS client tool / SharePoint / .NET front end app in corporations are caught up in the “Configuring Kerberos” business. After a few weeks configuring Kerberos in a client, a Kerberos expert said to me “If we can package our Kerberos knowledge  into a product, we’ll be rich!”

Marco RussoAlberto FerrariAlex Whittles and Jennifer Stirrup also submitted their sessions.

SQLBits 4 Manchester

Filed under: Event — Vincent Rainardi @ 5:24 am

In this session I will explain what a dimensional data model is, why do we use it for data warehousing, when to use and when not to use, what are the advantages and disadvantages and what are the alternatives. We will put dimensional modelling into practice by designing a data mart using a case study. That way we can get first hand experience about how a 3rd normal form transactional database is translated into fact and dimension tables in a dimensional model. Time permitting, we will also discuss some particular points in dimensional modelling such as smart date key, choosing dimensional grain, and real time fact table.

The PowerPoint presentation is here.

The SQLBits page is here.

11 January 2011

4 Sessions for SQLBits 8, Brighton, April 2011

Filed under: Event — Vincent Rainardi @ 2:54 pm
Tags:

I have submitted 4 sessions for SQLBits 8, 7-9th April 2011, Brighton:

1. Testing Your Data Warehouse Using SSAS

I wrote a book about Building a Data Warehouse. One chapter of my book is about testing a data warehouse. Testing a data warehouse is an art. On the one hand we need to make sure that the figures in the warehouse tally with the source systems. On the other hand there are billions of numbers to test. And to add complexity sometimes the numbers from the source are modified, on their way to the warehouse.  One technique to test the data warehouse is to use SSAS. We build a cube and put the numbers from both the source and DW into this cube. Not only this technique saves a lot of time, but we can quickly pin point where the problems are.

2. Advanced Dimensional Modelling

This session is for anybody involved in the design of a data warehouse. Many of us know what dimensions and fact tables are. But as I explained in my blog, dimensional modelling is more than just dimensions and fact tables. This session is about advanced dimensional modelling topics such as Fact Table Primary Key, Vertical Fact Tables, Aggregate Fact Tables, SCD Type 6, Snapshotting Transaction Fact Tables, 1 or 2 Dimensions, Dealing with Currency Rates, When to Snowflake, Dimensions with Multi Valued Attributes, Transaction-Level Dimensions, Very Large Dimensions, A Dimension With Only 1 Attribute, Rapidly Changing Dimensions, Banding Dimension Rows, Stamping Dimension Rows and Real Time Fact Table. Prerequisites: You need have a basic knowledge of dimensional modelling and relational database design.

3. Partitioning Your Fact Tables

In data warehousing, partitioning is one of the top 3 things that can speed up the performance (the other two being summary tables and indexing, as I explained in my book). Partitioning your fact table can speed up the performance up to 10 times, both query and loading. Yet 9 out of 10 the fact tables are not partitioned. One of the reasons is because initially there were only 5-10 million rows. After a year, the fact table contains 1 billion rows and loading time degrades, from 2 hour to 5 hours. Queries slows down too, from 1 second to 20 seconds. In this session I’m going to explain only partitioning, but I’ll leave some time for Q&A if you want to ask about the other two. Prerequisites: Requires a little bit of data warehousing knowhow.

4. SQL Server Data Warehousing

8 years ago people raised their eyebrows when you said you’re building a data warehouse on SQL Server. In today analogy, it is probably similar to saying that you are building a data warehouse on MS Access. In those days DB2 and Oracle reigned the market.

Thanks to Microsoft, today SQL Server is a respectable platform for data warehousing. And for business intelligence. It’s completely reversed now. If we say we are building a DW-BI on DB2 or Oracle, they question “Why not SQL Server?” This session in a way celebrates this golden era of SQL Server DWBI, just like my book. As time is limited I’m going to talk about the DW side only, not the BI side (RS, IS, AS) e.g. merge, change data capture, star join query, bitmap filter, change tracking, minimally logged insert, resource governor, backup compression, data compression and indexed views.

For further info about SQLBits please look at SQLBits web site, www.sqlbits.com.

23 June 2010

SQLBits7

Filed under: Event — Vincent Rainardi @ 8:24 pm
Tags:

SQLBits 7, the largest SQL Server conference in Europe, will take place in York, 30th Sep to 2nd Oct. Different from SQLBits 6 which was only 1 day, SQLBits 7 will take the same format as SQLBits 5. It will be 3 days:

Thursday: in-depth, full day seminar (pay)
Friday: deep-dive, advanced session conference (pay)
Saturday: community day (free)

Submit a session for SQLBits

SQL Server 2008 Data Warehousing Features

Covers data warehousing features in SQL Server 2008 such as merge, change data capture, star join query/bitmap filter, change tracking, minimally logged insert, resource governor, backup compression, data compression and indexed views.

For each point I will explain what they are, why should we use them/what can we use them for in data warehousing, and a demo so you can see it for yourselves.
This is an introductory session. I won’t go into deep technical details about the mechanism of how they work. My goal is to introduce them to you, and if you are interested, you can study them further. Many SQL Server data warehousing “shops” have not used/taken advantage of these features, which is ashame because they are very useful.

18 March 2010

SQL Server 2008 Data Warehousing Features Presentation

Filed under: Event — Vincent Rainardi @ 12:38 am
Tags:

My presentation tonight, 17th March 2010, at the SQL Server User Group meeting at Microsoft, Victoria, is here, including all the scripts. It is titled SQL Server 2008 Data Warehousing Features.

Neil Hambly presented Indexed Views and Computed Columns. Duncan Sutcliffe from Hitachi (was Edenbrook) presented BI in Office 2010 (Excel, PowerPivot and SharePoint). Both were excellent. Chris Testa-O’Neill and Ashwani Roy were the hosts, helping answering questions. Jamie Thomson was also there, helping out with the Q&A session. The meeting was organised by Tony Rogerson.

11 March 2010

SSUG BI Evening and SQLBits 6

Filed under: Event — Vincent Rainardi @ 11:21 pm

On 17th March 2010 I’ll be speaking at SSUG: SQL Server User Group BI Evening, Microsoft Victoria, London, 18:00-21:00, SQL Server Data Warehousing features. Details below.

16th April 2010: SQLBits VI, Church House conference centre, Westmister, London, my proposed session is Query and Loading Performance of a SQL Server Data Warehouse. Details below.

See you there.

SQL Server User Group Evening Meeting on Wed Mar 17, Microsoft Victoria, London

SQL 2008 Data Warehousing Features (Vincent Rainardi); Index Views and Computed Columns (Neil Hambly); Business Intelligence in Office 2010 – Excel, PowerPivot, SharePoint (Duncan Sutcliffe); finishing with a Q & A on BI with the panel

LiveMeeting Attendee URL:  Click here to join Meeting
Time: Starts (UK time) at 18:00, Finishes 21:00
Cost: Free
Organiser: UK SQL Server User Group
Address: Microsoft London, Cardinal Place, 100 Victoria Street, London, SW1E 5JL
Directions to Event
Tags: Internals for Beginners; Indexing; Database Structure; File Groups
Note: This event has a max capacity of 102 – this event is now full; we are operating a reserve list – feel free to register as I will be sending reminders out a few days before the event to make sure if you have registered you are still coming.

Come and socialise and learn from your peers; these physical meetings are great places to expand your network, get answers and find out how other people are using SQL Server and what is going on. This is the last meeting this year for London and for those who can we should do drinks after the meeting to continue the SQL chat in an even more informal environment.

If you want to twitter please make sure you use the tag #uksqlug so it is shown on the site.

Your host for the evening will be Ashwani Roy, SQL Server MVP.

Agenda

18:00 – 18:15 Meet & Greet
Meet up and socialise with your friends, meet new people, find out what other people are doing with SQL Server.

18:15 – 19:00 SQL Server 2008 Data Warehousing Features
Vincent Rainardi; http://www.datawarehouse.org.uk

In this session I will explain about 2008 Data Warehousing features such as merge, CDC, data compression, backup compression, star join query, minimally logged insert, parallel query on partition tables, partitioned aligned indexed views, resource governor, grouping sets. I will pick some of them and show implementation example in the form of a demo.

So this session will be particularly useful for people/companies who already have a data warehouse on SQL Server 2005, as it explains to them the benefits of upgrading their warehouse to 2008. It will also be useful for companies who don’t currently have a data warehouse on SQL Server (perhaps they have it on other platform), and are currently deciding whether to use SQL Server 2008 for their warehousing platform. And of course this will benefit Microsoft, as it draws people’s attention to using SQL Server 2008 for their data warehousing platform.

19:00 – 19:15 Using Indexed Views & Computed Columns for Performance
Neil Hambly, http://sqlblogcasts.com/blogs/NeilHambly

One of the key elements of any DW / OLTP system is performance, alas no matter how good it’s design or how powerful it’s hardware there are times when the performance of a Query operation / function is JUST NOT adequate, even if properly tuned these times generally we are likely to be dealing with lots of data or complex calculations SQL Server has ways to help deal with those scenarios, some of these include the use of Indexes Views or Computed Columns I will spend a few minutes explaining their usage, Pro’s & Con’s and demo some examples.

19:15 – 19:35 Break for Pizza sponsored by Microsoft

19:35 – 20:30 Business Intelligence in Office 2010
Duncan Sutcliffe, Hitachi Consulting (http://www.hitachiconsulting.co.uk)

This session will look at the improvements to self-service BI made in Excel 2010 and SQL Server 2008 R2. It will examine native Excel functionality, the PowerPivot for Excel tool, and deployment of PowerPivot in SharePoint 2010. The session will be in the form of a live demonstration using the most recent CTP and beta software.

20:30 – 21:00 Panel – Q & A session on Business Intelligence
Ashwani Roy and Chris Testa-O’Neill

SQLBits VI, 16th April 2010, Church House conference Centre, Westminster, London
Query and Loading Performance of a SQL Server Data Warehouse
This session discusses the impact of various design/modelling aspects and SQL Server features to the query and loading performance of a SQL Server data warehouse.
  • How various data warehouse design/dimensional modelling aspects like fact table primary key, foreign keys, snowflaking, fact grain, transaction dimension, slowly changing dimension type 2, periodic snapshot, incremental extract, accumulated snapshot and fact normalisation affects the data warehouse query performance and loading performance.
  • How SQL Server specific features affect (can be used to improve) the query and loading performance of a SQL Server data warehouse, for example: partitioning, clustered index, indexed views, filtered indexes, bitmap indexes, change data capture, data compression, merge statement, incremental delete, query/table hints, joins, data types, minimally logged insert, data/log files, filegroups, autogrow, tempdb, recovery model, and scalable shared database.

For a complete list of sessions click here. SQLBits home page is here.
Update: I won’t be speaking at SQLBits VI. My proposed session above was not accepted.

Next Page »

Blog at WordPress.com.