Data Warehousing and Data Science

26 May 2011

Many to Many Is Not Always Right

Filed under: Analysis Services — Vincent Rainardi @ 7:10 am

Suppose you have an account balance table:

Account    Balance
Account1    £10
Account2    £10
Account3    £10

And you have “Which customer owns which account” table:

Customer    Account
Customer1    Account1
Customer2    Account1
Customer1    Account2
Customer2    Account2

In other words, Account1 are owned by customer 1 and 2. Same with Account2.

How much money does customer 1 have?
How much money does customer 2 have?

Assuming that customer1 doesn’t have money in another bank or at home or in his wallet, then the correct answer is: we don’t know. The £10 in Account1 could be split £3-£7 between customer1 and customer2. Or it could be £2.50-£7.50. We don’t know.

If we do this as M2M in SSAS, what does SSAS think?

Cube structure:

Why £20 for customer1? Well customer1 owns both account1 and account2, and there’s £10 on each account hence £20.

Why is the total £30? Hmm, that’s where I can’t understand. SSAS sums all the accounts, 3 x £10 = £30. SSAS should be summing account1 and account2 only (2x£10=£20) or summing what on display, i.e. customer1 & customer2 (2x£20=£40). But no, it displays £30.

To proof that it sums all 3 accounts, I’ll change the balance of account3 from £10 to £7, and reprocess the cube.

The total is now 27.

What does account3 have to do with customer1 and 2? I don’t know. Hence the title of this article, Many to Many Is Not Always Right.

If anybody can explain this, I’d be glad to hear from them. In the mean time I’ll also try to find out.

One more thing, if it’s not many-to-one (many customers own one account), but one-to-many instead (one customer owns many account), then the balance for each customer is correct. Of course it is! There’s no dispute about how to proportion the balance of an account to 2 people! For example:

Customer    Account
Customer1    Account1
Customer1    Account2
Customer2    Account3

In this case SSAS displays:

But, as you notice, the total is still wrong! Well it doesn’t make sense that the numbers above it are 20 and 10 and the total is 27, does it?

Vincent Rainardi, 26/5/11, vrainardi@gmail.com

PS. The script to create and populate the tables is below.

```use Test
go

if exists (select * from sys.schemas where name = 'M2M')
drop schema M2M
go

create schema M2M
go

-- Create account balances fact table
if object_id('M2M.Fact_Account_Balances') is not null
drop table M2M.Fact_Account_Balances
create table M2M.Fact_Account_Balances
( Account_Key int,
Balance money
)
go

insert into M2M.Fact_Account_Balances values (1, 10)
insert into M2M.Fact_Account_Balances values (2, 10)
insert into M2M.Fact_Account_Balances values (3, 10)
go

-- Create account dimension table
if object_id('M2M.Dim_Account') is not null
drop table M2M.Dim_Account
create table M2M.Dim_Account
( Account_Key int,
Account_Number varchar(20)
)
go

insert into M2M.Dim_Account values (1, 'Account1')
insert into M2M.Dim_Account values (2, 'Account2')
insert into M2M.Dim_Account values (3, 'Account3')
go

-- Create customer-account factless fact table
if object_id('Fact_Customer_Account') is not null
drop table M2M.Fact_Customer_Account
create table M2M.Fact_Customer_Account
( Customer_Key int,
Account_Key int
)
go

insert into M2M.Fact_Customer_Account (Customer_Key, Account_Key) values (1, 1)
insert into M2M.Fact_Customer_Account (Customer_Key, Account_Key) values (1, 2)
insert into M2M.Fact_Customer_Account (Customer_Key, Account_Key) values (2, 1)
insert into M2M.Fact_Customer_Account (Customer_Key, Account_Key) values (2, 2)
go

-- Create customer dimension table
if object_id('M2M.Dim_Customer') is not null
drop table M2M.Dim_Customer
create table M2M.Dim_Customer
( Customer_Key int,
Customer_Name varchar(20)
)
go

insert into M2M.Dim_Customer values (1, 'Customer1')
insert into M2M.Dim_Customer values (2, 'Customer2')
go

select * from M2M.Dim_Customer
select * from M2M.Dim_Account
select * from M2M.Fact_Customer_Account
select * from M2M.Fact_Account_Balances
update M2M.Fact_Account_Balances set Balance = 7 where Account_Key = 3
```

1. Rainard,

as Microsoft says, “it is by design”
When you don’t have a filter on Customer, you are looking at the total without filters. This is the consequence of the technique PivotTable use to display the result.
I don’t remember whether the display visual total would work or not, but maybe it doesn’t work well in this case with a M2M.
In case you want to see the value of the group formed by Customer1 and Customer2, you have to force a selection in the resulting MDX query (for example by creating a set or a calculated member in the customer dimension). The problem is that, with large amount of data, there is a big performance impact.

Marco

Comment by Marco Russo — 26 May 2011 @ 10:59 am

2. Thanks for this post, it made me stop and question my design on a project I’m working on. A couple of thoughts…

The balance for each customer is correct, since each customer is an owner of any account they are on then they are each entitled to the full balance… This is how the banks look at it. The only dispute comes in to play when they each want to withdraw the full balance. At that point other business rules come in to play, such as first come-first served. The important thing is to have the right balance at the grand total, which it does…

I believe the confusion comes in because of a design problem. Because the M2M bridge dimension is missing a relationship… You have an account with no customer, but no “unknown” customer record. In a normal dim-measure relationship SSAS will provide an unknown entry, but not for a M2M. Redesign your bridge table. I just ran into this myself and had to provide an unknown entry in the faceless table to account for this. Hope this helps!

Comment by John — 26 May 2011 @ 9:39 pm

• This is absolutely right, it’s exactly what I wanted to write too. In the ETL process, you need to make sure that every account has an owner, whether real or fake/unknown owner.

Comment by Miky Schreiber — 13 June 2011 @ 1:30 pm

3. […] before I start, 2 things. 1) This post is related this post I wrote 2 days ago: Many to Many Is Not Always Right. 2) Thank you to Robert [last name] for […]

Pingback by Many-to-Many: Which Dimension is Used? « Data Warehousing and Business Intelligence — 13 September 2012 @ 6:13 pm

4. As I understand it, this is the feature that SSAS provides. It is not double-counting, it is giving the correct total. If I did this through a standard sql join I would get \$40 total for all accounts, which is incorrect. This is what many other OLAP tools (like Cognos) do not handle. SSAS is being smart for us here.

Comment by Michael — 19 February 2016 @ 4:44 pm

• In other words, customer 1 is associated with \$20, and customer 2 is associated with \$20. The total dollars for all accounts associated with customers 1 and 2 is \$20. The grand total for all accounts is \$30.

If you filter your MDX for just those two customers you should see a grand total of \$20, and each customer has \$20. Which would look strange in a report at first glance, but is correct.

Comment by Michael — 19 February 2016 @ 4:49 pm

Blog at WordPress.com.