Data Warehousing and Data Science

28 January 2010

Using a Data Warehouse for CRM

Filed under: Data Warehousing — Vincent Rainardi @ 8:13 am

Usually a data warehouse is used for Business Intelligence. In this article we will describe another usage of a data warehouse: for Customer Relationship Management (CRM).

Customer Relationship Management is a set of activities performed by a company or organization (business and non-business) to manage and analyse information about their customers, to keep in contact and communicate with their customers, to attract and win new customers, to market products/services and conduct transactions with their customers and to service and support their customers. For non-business organisations the word “customers” can be substituted with “citizens”, “users”, “stake holders”, “applicants”, “visitors”, “constituents”, “interested parties”, “students” or similar, as long as the term describes the people that the organization serves.

A CRM system is a package of applications that supports the above activities. Among various functionalities of a CRM system, below are functionalities that are ideally supported by a data warehouse or utilising the data from a data warehouse. Other functionalities may best be served by an Operational Data Store or front end applications. Please see the table at the end of this article for the details about which tool is best for which functionalities.

  • Single customer view
  • Permission management
  • Campaign segmentation
  • Manage deliverability
  • Customer services/support
  • Customer analysis
  • Personalisation
  • Customer loyalty scheme

Let’s discuss these functionalities one by one.

Single Customer View

One of the most important things in CRM data management is the concept of Single Customer View. This concept was raised because customers could be defined differently depending on the context and to which department we are talking to. For example, a customer could be defined as follows:

  • A customer is anybody who has purchased from us.
  • A customer is anybody who has registered with us.
  • A customer is anybody who has paid us.
  • A customer is anybody who is currently subscribed to our newsletters.

And on top of that we have to deal with variations and complications such as:

  • Order cancellations: what if the customer has ordered (purchased) from us but before we deliver the goods he cancelled his order? Do we count him as a customer? Hmm, may be not.
  • Contract termination: what if a customer signed a contract for a particular service from us for a year then the following year she did not renew the contract? Do we still count her as a customer? Perhaps not.
  • Unsubscriptions: the customer has subscribed to our newsletter and then unsubscribed from that newsletter. Do we count him as a customer? May be not.
  • Order life cycle: order fulfillment process consists of many stages: quotation produced, contract signed, account opened, order placed, order dispatched (for physical goods), order fulfilled/delivered, consumed (for services), invoiced, paid, returned, credit note raised, refunded, account closed. So, at what stage do we recognized them as a customer? Different industries have different order stages.
  • Even tough it is not technically a customer yet (technically it may be a prospect), some departments such as marketing need us to store the prospect data.
  • When does a customer stop becoming a customer? When they no longer consume our products or services? In some industries such as waste management, the process of ‘discharging’ a customer is done in stages, for example: stopping the collections, removal of bins, settlement of accounts, deactivate customer status. In some industries there is not concept of stop becoming a customer. Once they purchase something they become customers forever.

So, practically speaking potentially we may need to store subscribers, registered users, purchaser and prospects as customers. What we do in the data warehouse is to build the customer dimension based on several different sources in the operational systems: order data, subscription and permission data, registration data and marketing data. We can use overlay technique I described in this article: Upsert Dimension Table. Yes it’s true that this will mean that we will have many attributes on the customer dimensions. And yes we do need to deduplicate the data, for example based on customer name, date of birth and address, or email address.

Single customer view simply means that we need to build a customer dimension that is solid, i.e. no duplication of data, complete (no missing customers) and correct. Deduplicate is not always easy, for example name can change when women get married, address changes when they move houses and email address changes over time too (a hard bounce could be an indication). Hence we need to use other measures such as time frame or data age (e.g. we can use the data if it is no more than 1 year old) or using other criteria such as Social Security Number, date of birth, etc. MDM and CDI vendors have a lot experience in this area, as well as data quality and data profiling software such as Trillium.

Permission Management

Regulations differ from country to country, but the basic principle is we can only send campaign to customers who have already given us their permissions to send it to them. Based on the scope, there are 2 kinds of permissions: subscription-based and general permission.

In a subscription-based scenario, we receive requests from customers for sending them a particular communication. For example, say we have 3 communications: weekly newsletter, low price alert and special offers. Customers can subscribe to these communications, either only to 1 communication, 2 communications or all 3 communications. In this case, if the customer subscribes to the low price alert, we can only send them low price alert, we can not send them other communications. If we have a new communication, we can not send it to subscribers of other communication. Instead, we need to build a new subscriber base for the new communication. Subscriber base is created by getting end users to subscribe to particular communication through website or promotions.

In the general permission type, customers do not subscribe to a particular communication. Instead, they give permission for us to send them a particular type of communication. For example, say we have 2 communication types: promotional items and subscription items. In this case the subscription items cover everything that is regular and promotional covers everything that is ad hoc. Another example: we can have 3 communication types: transactional (such as order confirmation emails and e-tickets), marketing (such as promotional campaign) and third party (such as campaign from our sister companies).

Permission is also categorized based on the communication channel or media, for example: by email, by SMS/text, by post, by telephone and by RSS feed. There is also a permission for sending (or not sending) any kinds of communication to customers. For example, a customer could call or email us asking to be excluded from any forms of communications.

For a multinational company, the permission is could be per locale. It is not necessarily country based, for example: Benelux, Nordic and Scandinavia are often mentioned as one locale, even though they consist of several countries. In those cases 1 office serving more than 1 country. It is possible that each locale has more than 1 language. We could also have several brands or names. For example, we could be operating in a certain country using 3 different company names, each having their own monthly newsletter or promotional campaign. Permissions can also have a validity period, e.g. only for 1 year. We need to take locale, language, brand and validity period into account when constructing the permission fact table and communication dimension.

Let us discuss the design. Permission data is ideally stored in a fact table, with all the above items as the dimension, plus the customer key, date the permission was given, source key and the permission ID as degenerate dimension if applicable. The measures or facts are number of subscribers, subscription price, There are 2 possible fact tables: subscription fact table if you use a subscription based permission, or general permission fact table if you use the general permission approach described above. For an example let us discuss the design for subscription based type.

Fact table name: fact_subscription
Grain: each time a customer subscribes or unsubscribes to a communication.
Type: snapshot accumulative
Table creation script (ignoring partition for the time being):
create table fact_subscription (
customer_key int not null, — who the customer is
communication_key int not null, — what communication the customer is (un) subscribing to
channel_key int not null, — what media will be used (email, post, text, RSS)
promotion_source_key int not null, — which promotion is the source or cause of this subscription
brand_key int not null, — which brand managing this (un) subscription
locale_key int not null, — which locale managing this (un) subscription
language_key int not null, — which language this (un) subscription event originated from
expiry_date_key int not null, — the date when this subscription is valid until
subscription_period_key int not null,– how long the subscription is valid for e.g. 3 months, 1 year
permission_id varchar(20) not null, — degenerate dimension from front end CRM system if applicable
subscription_price money null, — how much this subscription costs
number_of_subscriptions int null, — 1 for a normal subscription, 0 for dummy
subscribed_dt datetime null, — date and time the customer subscribed
unsubscribed_dt datetime null, — date and time the customer unsubscribed, NULL if still subscribed
is_active_flag tinyint not null, — 1 if the subscription is active, 0 if it is expired or unsubscribed
created_dt datetime not null, — system date & time when this fact table record was created
last_updated_dt datetime not null — system date & time when this record was last updated
constraint pk_fact_subscription primary key clustered (customer_key, communication_key, channel_key, subscribed_dt))

To get the subscribers’ email address and customer name of Ivory weekly email campaign:

select email_address, customer_name from fact_subscription sub
join dim_customer cus on cus.customer_key = sub.customer_key
join dim_communication com on com.communication_key = sub.communication_key
join dim_channel ch on ch.channel_key = sub.channel_key
where com.communication_name = “Ivory Weekly”
and ch.channel_name = “Email”
and sub.active_flag = 1

We could store the date the permission is given as a dimension, but we would recommend storing the timestamp on the fact table for 2 reasons: a) we don’t loose the time of day element and b) it is easier to retrieve the timestamp data for campaign segmentation. It is not advisable to store the permission in the customer dimension because it would limit the grain to be per customer basis, rather than per customer, communication and date.

So permission management is the capability of a CRM data warehouse to store the permission, based on all of the items described above. And to always keep them up to date. The permission data needs to be made available to the campaign management system to support the campaign segmentation process. It is frequently used, i.e. every time the users create a campaign.

Campaign Segmentation

When creating a campaign, we need to have a list of customers to whom we are going to send it to. These end users are known as campaign target audience. Campaign segmentation process produces this list. Most CRM software has this capability. This is where the SCV play an important role. The richer the customer dimension, the more flexible we can create the segmentation. Segmentation criteria that are commonly used are:

  • Permission
  • Demographics
  • Order data
  • Campaign delivery
  • Campaign response
  • Customer loyalty score
  • Customer profitability

We will give an example on each of the above items so we are clear about what they are. Permission: all customers who subscribed to Norwegian weekly newsletter in the last 3 months. Demographics: all female customers age 20 to 40 who live in Milan. Order data: top 1000 customers (by order value, excluding VAT) who have purchased electronic products from us in the last 12 months. Campaign delivery: exclude customers who had more than 3 hard bounces more in the last 8 weeks. Campaign response: include all customers who have opened the last campaign. Customer loyalty score: include customers from the top tier with more than 500 loyalty points. Customer profitability: include all customers from band A with annual order value > $30,000.

Campaign Results

What we meant by campaign results are:

  • Campaign delivery data, i.e. whether the campaign successfully reaches the target audience. For example, say we have an email campaign with 100,000 target recipients. Because of invalid email addresses, we only sent 99k and did not send 1k. Out of these 99k that went out, 96k were delivered to the target recipients’ mail boxes and 5k were bounced. All this information is called campaign delivery data.
  • Campaign response data, i.e. reactions from the customers receiving the campaign, perhaps by clicking on a link if it is an email campaign, or calling customer service center if it is a postal campaign.
  • Orders resulting from the campaign, i.e. out of the customers who responded to the campaign, how many actually placed their orders, what did they purchase and what are their order values.

Let’s go through these 3 points one by one.

Once campaign segmentation is ready, CRM system executes a campaign and sends it to target audience. Data about to which customers the campaign were successfully delivered and to whom it was not delivered, along with the reason why it was not delivered, should be fed back by the CRM system to the data warehouse. We are not only talking about email campaign here, but also by post, by telephone, by text messages and by home page customisation. This campaign delivery data (i.e. sent, not sent, bounced and delivered) will be useful for future campaigns. One possible design for storing campaign delivery data in the data warehouse is a factless fact table, with customers, communication, date, channel, delivery status and reason as the dimensional keys.

Specific to email campaigns, when the campaign reaches the target audience email box, end user may open that campaign email and perhaps click on any particular offer in that campaign. These open and click through events are fed back to underlying data warehouse. No mechanism is 100% reliable, but one mechanism for logging open events is a transparent 1 pixel image inside the body of the email, with a query string containing a customer identifier on the image tag. The web log of this image is then processed daily and the hit of this image, along with the customer identifier and the timestamp, is stored in the campaign response database. A mechanism for logging click-through events is redirection, i.e. the link on the campaign email hits a landing page with the destination page URL and a customer identifier in the query string. A script behind the landing page then records the time of the event, the URL of requested page and the customer identifier into the campaign response database before redirect the user to the real page. The campaign response database is then fed back by the ETL to the data warehouse to be used for future campaign as additional criteria when doing segmentation.

Open event could be stored on the same fact table as the campaign delivery data. An example of design for a campaign delivery data that contains the open data is below.

Fact table name: campaign delivery
Fact table type: snapshot accumulative
Grain: 1 row for each communication sent to each customer
Creation script:
create table fact_campaign_delivery (
customer_key int not null, — who the customer is
communication_key int not null, — what communication is delivered to the customer
channel_key int not null, — what media is used (email, post, text, RSS)
delivery_status_key tinyint not null, — 1 if successfully delivered, 0 if failed
reason_key int not null, — a positive integer containing failure reason, 0 if successful
open_status tinyint not null, — 1 if opened and 0 if not opened
number_of_opens int not null, — normally 1 if opened but can be more than 1
sent_dt datetime not null, — date and time the communication was sent to this customer
delivered_dt datetime null, — date and time the delivered was logged
bounced_dt datetime null, — date and time bounce event was logged
first_opened_dt datetime null, — date and time the message was first opened, NULL if not opened
created_dt datetime not null, — system date & time when this fact table record was created
last_updated_dt datetime not null, — system date & time when this record was last updated
constraint pk_fact_campaign_delivery primary key clustered (customer_key, communication_key, channel_key, sent_dt))

Because number of opens can be more than 1, if you want to record the date and time of the second and subsequent opens, you will need to put open in its own fact table, separate from campaign delivery. Normally it does not really matter that we don’t get the timestamp of the 2nd and subsequent opens.

Click-through can not be stored on the above fact table because the grain is different. The grain of click-trough is 1 row for each link clicked on a communication. An example of design is given below.

Fact table name: clickthrough
Fact table type: snapshot accumulative
Grain: 1 row for each link/URL on a communication, whether clicked or not clicked.
Creation script:
create table fact_clickthrough (
customer_key int not null, — who the customer is
communication_key int not null, — what communication is delivered to the customer
channel_key int not null, — what media is used (email, post, text, RSS)
URL_key int not null, — which URL on the communication
click_status tinyint not null, — 1 if it is clicked, 0 if not.
number_of_clicks — how many times the URL was clicked by the same customer
first_clicked_dt datetime null, — date and time the URL was first clicked, NULL if not clicked
created_dt datetime not null, — system date & time when this fact table record was created
last_updated_dt datetime not null, — system date & time when this record was last updated
constraint pk_fact_campaign_delivery primary key clustered (customer_key, communication_key, channel_key, URL_key))

If you want to record the time the 2nd click happened, put each click on separate row and make the fact table type transactional. But normally it does not really matter, as long as we know the number of clicks. It is preferable to put each URL (whether it was clicked or not) in 1 row like above, with the number of clicks as a measure.

Some of the customers who responded to the campaign might place their orders. These orders are tracked using promotional code if it is a postal campaign, or using a identifier on the offer link if it is an email campaign, or using a standard software package such as Omniture SiteCatalyst if the order is placed online. Data that are normally fed back into the data warehouse to be used in future campaign are who the customer is, which campaign it is resulting from, and the usual order attributes such as product type, quantity and values. This way it would enable us to analyse campaign effectiveness, analyse customer behaviour and to monitor how much of the company revenue is generated from CRM activities, which could be used for ROI calculations or backing the proposal for future projects.

Customer Analysis

Various types of customer analysis could be performed in the data warehouse. To give you some ideas, below are some examples.

  • Purchase pattern analysis, i.e. what kind of products or services does a particular group of customers purchase. The groupings could be based on demographics or campaign results. Based on the patterns we could try to forecast future purchases and relate it with inventory management, in particular the reorder level and purchasing lead time.
  • Price sensitivity analysis, i.e. identifying changes in shopping and purchasing behaviour if the price changes. In this case we also group the customers for analysis, not individual customers. We try to identify if there are certain patterns which would be useful for setting future marketing strategies and operational directives.
  • Shopping behaviour (especially for online businesses), i.e. identifying the factors associated with site traffic to measure the effectiveness of site design, checkout process design and help increase conversion rates. Shopping behaviour analysis is also conducted to gather the customer interests (which pages on the online store the individuals are more interested with), to be included as a factor when doing personalisation exercise such as site personalisation and personalised offers.
  • Customer attrition analysis, or customer churn analysis, i.e. to answer the questions such as how many customers defected from us each week or month, how many new customers we are getting each week or month, what kind of customers we are loosing (in terms of profitability and demographics) and what kind of new customers we are gaining (in terms of product or service range and demographics). Also included in this kind of analysis is changes in the type of service or product that the customer is having (this does not apply for supermarket but it is applicable for health care and financial services, for example the type of account).
  • Customer profitability analysis, i.e. revenue that we receive from the customer minus the costs associated to that customer, over a certain period (say weekly or annually). We want to know which customers we are loosing money from, and which customers are making money from. The formula to calculate the revenue side is normally not difficult but allocating the cost to each customer activity is technically and politically not easy.
  • Fraud detection, i.e. large increase in credit card purchases which deviate significantly from the individual or group normal pattern (for financial service industry); unusual returns of goods by the same customer (identified by name, post code and customer card number) within short period of time, compared with the daily and seasonal behaviour of the product line of suspected product code (this one is for retail industry); spiky account balances and unusual withdrawals/deposits (for banks), drops in recent invoice values not accompanied with lower usage activity (for telecom industry). Another method is to use 2 (or more) groups of samples, one containing the fraudulent transactions and the other representing good transactions. These groups are then fed into the mining model for training.

Each type of analysis requires different data model, and different industry requires different data model. For example, customer profitability fact table in utility sector (gas and electricity in particular) could be an incremental snapshot type, containing monthly snapshot of all accounts monthly revenues (calculated based on service types, rates and usage) and proportionate cost structure for the same period of time. The revenue may be per kwh but the base cost may be by weight (tons of coal) which makes the equation non-linear hence for some customers we could be making a loss and for others we are making handsome profit.

Although dimensional model can do a lot of analysis, in some cases we have to use multidimensional models, i.e. cubes. Many types of customer analysis especially those that involve predictive analytics, behaviour recognition, statistical analysis, non-linear estimation, cluster analysis and patterns finding, would require data mining running on multidimensional database, sometimes more than 1 MDB/cube. Some analysis would require building applications running multidimensional queries on cubes.


What we meant by personalization is tailoring our web site, products, services, campaigns and offers for a particular customer or a group of customers. There are large categories of personalisation: 1) we ask the customer what their preferences are, or 2) we guess their preferences based on their shopping behaviour, purchase history and demographic attributes. Once we know (or we think we know) the customer preferences, we offer them our products and services which we think would suit their preferences. Examples of personalisation are:

  • Price and product alerts, i.e. we let the customer know if what they like appears in our data warehouse. Price alerts are notification to the customers when there are special offers (lower price) on certain products or services that satisfy their criteria, for example if they would like to fly to certain cities or purchase certain type of digital camera. Product alerts are notification to the customers when a certain product appears in our database. For example: they declare their favourite singer or musical preferences, then we notify the customers when a certain album or single that suit those preferences appear in our database. The basic working principle is matching: on the one hand we have many suppliers supplying us with thousands of products and services every day and on the other hand we have a lot of customers with certain preferences. All we have to do is to match them automatically.
  • Personalised offers, i.e. we offer our customers certain products or services that we think match their needs or profile. There are 2 broad categories on how to choose the products or services: a) based on their past purchases (or shopping/browsing if it is an online store), and b) based on the customer attributes. Example of past purchases: because a customer purchased Canon S300 ink jet printer 3 months ago, they may need BCI-24 colour ink cartridge today. Example of customer demographic attributes: the customer had a 3 months old baby so she may need baby products. For online stores and online services, customers could be identified by using cookie or asking them to login and once identified we could track their shopping behaviour, i.e. which product or service category they are spending a lot of time on, etc.
  • Recommendations, which is basically the same as personalised offer. But this term is normally used when the customer is still shopping on the web site (for online businesses), unlike the term ‘offers’ that normally used when they are not shopping, i.e. via email or post. Recommendation tends to be targeted to one customer, where as personalised offers can be targeted to a group of customers that satisfies certain criteria, for example, those in certain age range or live in certain cities.
  • Site personalisation (specific to online businesses), i.e. the web site contains different products and services (and prices) depending on who the customer is. There are 2 methods which are widely used to identify the customer: login and cookie. Login is the most (if not the only) certain way of identifying who the customer is, i.e. by supplying credentials, such as user ID and password. Serving the same purpose as login are: bio metric ID (such as finger print), challenge response device (such as a device that displays different response numbers every time it is activated, based on certain seed number which has been planted into the device) and security token (such as security card). Using cookie is probably 50-60% at best, never achieve 80% certainty. Some people disabled cookies on their browser, some installed certain plug-in on their browser which prevents cookies, some people regularly cleaned their temporary Internet files including cookies and of course some people don’t use their own, permanent computer, i.e. Internet café, a friend’s house, a shared home computer, an office or campus computer, library’s PC, etc.

The content of site personalisation may be generated by a CRM system (as an XML), by setting up a campaign that is executed once a day. The logic behind this campaign does a data mining on a multidimensional data warehouse or, if we prefer a simpler way, by running a rule-based logic stored as metadata against the dimensional data warehouse. These rules are conditional rules, e.g. similar to IF … THEN … statement but with a lot of IFs. Price and product alert do not need a dimensional data warehouse. They can run on a 3rd normal form ODS. Or even on the front end CRM system.

One of the logic behind personalised offer (and recommendation) is ‘what similar people are interested in’. ‘Similar people’ can be quite a challenging term to implement. Some of the most popular classification techniques are nearest neighbour, neural networks and classification trees. Nearest neighbour is classification of customers based on their position in multidimensional space. Imagine that each dimensional attribute or each distinguishing factor that contributes to the grouping is a vector or arrow. The direction of the arrow is determined by the value of the attributes. A customer is defined by joining all the arrows by putting the beginning of the next arrow at the end of the previous arrow. This way a customer consists of all their dimensional attributes. Customers that are close to each other are classified as “similar”. Close or far is defined as multidimensional distance, i.e. square root of sums of all components’ squares. The difficult thing to do here is assigning numeric values to the dimensional attributes. As we all know dimensional attributes are mostly non-numeric. If the dimension has a hierarchy (such as city or location) then the numeric score depends on whether they have the same parent or grand parent.

Classification trees method is using a diagram where a branch has 2 sub branches. At each branch whether we go to sub branch 1 or sub branch 2 depends on the value of the attributes which is compared to certain criteria (normally a constant). Starting at the trunk, after following all the branches and sub branches we will arrive at the leaves. Now if we bring say 1 million customers to through these paths, some of them will end up at leaf 1, some at leaf 2, some at leaf 3, etc. The leaves are what we call classes. A customer is said to be “similar” if they are in the same class, or a near by class, which is defined by the number of levels.

Customer Insight

Customer information is not useful without its intelligent analysis. Analysis is always evolving and finding new ways to increase revenues through customer insight. Customer data plays vital role to build customer insight. Customer insight is a model to view available customer data and to analyze customer behaviour over period of time.

Using a data warehouse one can create rich customer dimension and use it to create customer insight. Business analyst can analyze complete customer data set in following ways:

Customer shopping Analysis:
Using historic order data, customer-shopping behaviour is analyzed. For example, business analyst can find answers to all the following questions by doing customer data analysis:

  • How many times customer has purchased from us?
  • What is time gap between two consecutive purchases?
  • What is the purchase pattern?
  • What product has he purchased most?

By answering above questions business user can understand customer-shopping behaviour and can design future marketing strategy to retain that customer.

Customer permissions analysis:
Permissions play vital part in defining customer data. Every enterprise has different set of rules to define these permissions. These permissions are stored in data warehouse for future analysis of customer permissions. For example, as described above in permission management section customer/subscriber can subscribe to different communications types over time period and all these historical subscription events are stored in subscription fact table. Business user can use CRM tools to do subscription analysis for different subscribers over period of time.

Customer Loyalty Scheme

Customer loyalty scheme is the way to reward high valued customers and build loyalty among customer bases. Many enterprises use customer scoring/point based system to build loyalty-based program. Customers are scored based on their previous shopping behaviour and points are calculated accordingly. Customer scores can be stored in customer dimension. CRM system uses these customer scores to design campaigns and group customers as per their loyalty points. Different customers are offered different promotions as per their scores.

Below we illustrate which tool is best to serve various CRM functionalities: data warehouse (DW), online transaction processing system (OLTP, i.e. front office application) or Operational Data Store (back office integrated operational database in 3rd normal form). If the cell does not have a ‘Yes’ in it, it does not mean that the tool can not do the functionality. It may be possible, but it’s not the best tool to serve that purpose.

Customer Support

Customer support is one of the important aspects of CRM industry. Many companies use various CRM tools to build customer support systems. Support system helps to solve customer queries, provide them promotional updates etc. For e.g. Customer call center to update billing address or phone number etc.

Many companies use ODS to store latest customer’s data to provide quick and efficient search capability to fetch up customer’s information. The underlying ODS database can be populated from OLTP databases or from data warehouse (in rare cases) for latest customer information. Many CRM vendors provide tools and techniques to transfer data between ODS and data warehouse. ODS can be populated from data warehouse/ Data Marts for customer specific data which is not persisted in OLTP databases.

Functionality OLTP DW ODS
Single Customer View
– Subscribers
– Bookers
– Registered users
– Customer matching
Yes Yes
Permission management
– Subscription based
– Tactical campaigns
– ISP feedback loop
– Communication preferences
Yes Yes Yes
– Order data
– Demographic data
– Campaign delivery
– Campaign response
– Customer loyalty score
– Customer profitability
Yes Yes
Campaign Content
– Promotional Offers
– Routine Newsletter
– Purchaser Lifecycle
– Subscriber Lifecycle
– Cross-selling
Campaign Results
– Delivery rates
– Open rates
– Click through rates
– Conversion rates
Yes Yes
Customer Support
– Complaint Handling
– Cross selling
– Pre-consumption support
– Consumption support
– Emergency support
– Post consumption
Yes Yes
Customer Analysis
– Purchase pattern
– Price sensitivity analysis
– Shopping behaviour
– Customer attrition analysis
– Customer profitability analysis
– Fraud detection
– Alerts/Notification
– Special Offers
– Recommendations
Yes Yes
Customer Loyalty Scheme
– Scheme Administration
– Customer scoring
– Classification
– Satisfaction survey
Yes Yes
Order processing
– Quotation
– Registration
– Custom pricing
– Placing orders
– Contract management
– Order confirmation
– Invoicing
– Payments
– Refunds
– Arrears
– Account management

Vincent Rainardi and Amar Gaddam
28th December 2006

(This is a repost from SQLServerCentral)


  1. Are there specific open source software programmes that are used in data warehouses?

    Comment by Clay Vickers — 5 August 2013 @ 10:17 pm | Reply

  2. Hi

    I have doubt on fact subscription, If i want to generate a report to display all active customers who has subscribed for particular news letter. Offcourse, its simple query select Customersids from fact_subscriptions where activity=newsletter. It fetches all customers but if the customer unsubscribes next minute, there will be two rows in fact one for active and second when it is unsubscribed. How do we retrieve only active customers?

    Comment by bi — 19 February 2014 @ 1:05 am | Reply

    • subscribed_dt datetime null, — date and time the customer subscribed
      unsubscribed_dt datetime null, — date and time the customer unsubscribed, NULL if still subscribed
      is_active_flag tinyint not null, — 1 if the subscription is active, 0 if it is expired or unsubscribed

      Hi the is_active_flag column indicates whether the subscription is stil active, or the customer has unsubscribed.

      If a customer subscribed then unsubscribed, we don’t have 2 rows. We only have 1 row, with the unsubscribed_dt populated, and is_active_flag = 0.

      Comment by Vincent Rainardi — 19 February 2014 @ 7:40 pm | Reply

      • Which means we are going to update fact table aren’t we?

        Comment by bi — 24 February 2014 @ 12:18 am

  3. Hi thre, you seems bit busy can you pelase let me know if we are updating the fact table?

    Comment by bi — 7 March 2014 @ 5:10 pm | Reply

    • Hi, what do you mean?

      On 7 Mar 2014, at 17:10, Data Warehousing and Business Intelligence wrote:

      Comment by Vincent Rainardi — 7 March 2014 @ 6:53 pm | Reply

  4. I mean, Are we going to update fact_subscription fact table?

    You said, there will be only one record when Customer subscribe and unsubscribe, “If a customer subscribed then unsubscribed, we don’t have 2 rows. We only have 1 row, with the unsubscribed_dt populated, and is_active_flag = 0”

    Thanks in advance

    Comment by BI — 10 March 2014 @ 10:09 am | Reply

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Blog at

%d bloggers like this: