Data Platform and Data Science

2 April 2010

Multi Language Cubes

Filed under: Analysis Services — Vincent Rainardi @ 6:14 pm
Tags:

Say you need to build a cube for Germany, France, Spanish and UK users. Would it be better to a) Create 4 cube translations (well 3 actually, as 1 is a “default language”), or b) Building the cube on the fly and set the attribute & measure names in a specific language?

If a specific cube is for a specific country and that country only uses one language (i.e. Germany, France, UK), and the cube is already built on-the-fly, then yes you can set the attribute & measure names in a specific language. Also, if some of the attributes & measures are not known in advance (as in, they are known only at run time), then yes, set the attribute & measure names at run time, possibly from a “translation table” stored in the database.

For all other cases, which is probably 99% of the time, use cube translation. The advantages are:

1) The same cube is simultaneously available in multiple languages.

This is useful in 2 ways: 1) some countries like Switzerland and Belgium are multi language, and more importantly, 2) many companies have offices in many countries. If your company is one of these international companies, then the cube needs to be viewable in different languages, depending on the user location.

2) You only have to create one single cube, not 1 cube for each country. This means that maintenance is a lot simpler. Imagine if you have 4 cubes, one in Spanish, one in French, one in English and one in German. Every time you want to make a change, you would have to make the same change 4 times. Unless the cube is built programmatically, it would cause a lot of head ache to change it many times over manually.

The advantage of storing the attribute and measure names in the database and use AMO C# to “make a copy of the cube and rename all the attributes and measures” is, obviously, the flexibility. Translation phrases are maintained in 1 central place (master data management) and used to build the cube. If you need to change a phrase, you correct the data in the MDM, press a button and the cube attribute names are changed. The other advantage is consistency, because it’s maintained in 1 central place.

But, this, in practice, is easier said than done. Because once a cube is in production, to change the dimension/attribute/measure name we’ll need to generate a change XMLA script, and schedule it to be executed just before the daily processing. And, this is the show stopper, all the reports needs to be changed too. Doesn’t matter if you use NovaView, ProClarity, Tableau, Excel 2007, Analyzer, or SSRS Report Builder (no, QlikView does not read cubes) to create the reports, you need to update them. And practically speaking you could have tens or hundreds of reports in the company. Because, unlike “canned reporting” which is created by IT, cube reports are usually created by the users. Which means, you (or your users) have a lot of reports to update. An activity which is both time consuming and tedious.

This is so important I’ll repeat it in it’s own paragraph:

Practically speaking, you can’t change the dimension/measure/attribute names of a cube which has been in production for a long time, because you have a lot of reports to update.

The issue of building a multi language cube is usually not the attribute and measure names. Most of the time the issue is the data, which needs to be available in several languages. This issue is that the data is usually only available in 1 language. Granted, translations for month names and day names are probably easily obtainable. Browse the net for 2 minutes and you’ll find them. But business specific data such as location groupings, risk classes, product descriptions, sizes, billing types, department names and company structures are difficult to get. What usually happens is the development team being pragmatic and just use whatever data is available, which means that the cube only contains data in 1 language. Which means, a single language cube.

As a side note, building a cube on-the-fly (as in, at run time using AMO) only works if the cube is relatively simple. If you have to create many-to-many relationships, partition slices, DSV which looks like a spider web, complex calculated measures and dynamic sets, various aggregations, then it’s better to build the cube on BIDS first, then customise it at run time using C#. Ditto with building SSIS packages on-the-fly (as in, at run time using C#). It only works well if the package is simple. If it a straight forward loading of a text file into a table with the same structure (as in, similar to bcp loading) then it’s doable. But if it’s a complex package with lots of scary-looking workflows then it would be better of building it manually on BIDS then use C# to customise it.

As usual, I welcome comments and questions at vrainardi@gmail.com.

Blog at WordPress.com.