Data Warehousing and Business Intelligence

14 November 2011

WhereScape RED

Filed under: Data Warehousing — Vincent Rainardi @ 7:41 am

About 4 years ago I had an idea (well, more of a dream really) of “building a data warehouse at the press of a button”. This means that everything is automated, including the development of the reports/cubes and the development of the ETL. I understand that we can build SSIS, SSRS, and SSAS programmatically (and Informatica PowerCenter mappings too). I also understand that it is a painfully slow and tedious process. But we can make the process easier by creating a few templates, then at run time choosing the nearest matching template and modifying it.

This works for ETL, reports and cubes. But how about tables? Corresponding dim and fact tables could be generated on the fly by analysing the source tables based on their data types: numerical columns become measures and textual columns become attributes. I admit it is a difficult process and definitely requires human involvement, but it is still easier than doing it from scratch. We present the “machine made” fact & dim tables to the designer, who then “fixes” the tables. And the “which source column goes where” mapping would automatically be fed to the ETL creation module.
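
To illustrate the idea, here is a minimal T-SQL sketch of how the source metadata could be analysed. The src schema name and the type-to-role rules are hypothetical; they only show the principle, not how any particular tool does it.

    -- Sketch: propose a role for each source column based on its data type.
    -- Schema name 'src' and the classification rules are illustrative only.
    SELECT
        c.TABLE_NAME,
        c.COLUMN_NAME,
        CASE
            WHEN c.DATA_TYPE IN ('int','bigint','decimal','numeric','money','float','real')
                THEN 'candidate measure'
            WHEN c.DATA_TYPE IN ('char','varchar','nchar','nvarchar')
                THEN 'candidate attribute'
            WHEN c.DATA_TYPE IN ('date','datetime','datetime2','smalldatetime')
                THEN 'candidate date dimension key'
            ELSE 'review manually'
        END AS proposed_role
    FROM INFORMATION_SCHEMA.COLUMNS c
    WHERE c.TABLE_SCHEMA = 'src'
    ORDER BY c.TABLE_NAME, c.ORDINAL_POSITION;

The designer then reviews this proposal, which is exactly the “fix the tables” step described above.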

This is, I think, what Kalido does. But it does not create the cubes and reports; it only creates the ETL. If I’m mistaken here I’d be happy to be corrected. It does feel strange to me that someone would automate the DW build but not the BI part.

WhereScape RED

Like Kalido (and Insource Data Academy), WhereScape RED creates the ETL automatically. It also minimizes the time for designing the warehouse tables: dimensions can be built by dragging and dropping the source table, and SCD type 2 handling is automated. RED also builds the cube and the MicroStrategy project automatically, and the documentation is automated too.
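
To give an idea of what the generated ELT code boils down to, here is a minimal hand-written sketch of an SCD type 2 load in T-SQL. The tables (stage.customer, dbo.dim_customer) and columns are hypothetical names of mine, not objects that RED itself produces.

    -- Sketch of an SCD type 2 load. All object names are hypothetical.
    -- 1. Close off current dimension rows whose tracked attributes have changed.
    UPDATE d
    SET    d.current_flag = 'N',
           d.end_date     = GETDATE()
    FROM   dbo.dim_customer d
    JOIN   stage.customer   s ON s.customer_code = d.customer_code
    WHERE  d.current_flag = 'Y'
      AND  (d.customer_name <> s.customer_name OR d.city <> s.city);

    -- 2. Insert a new current row for changed customers and for brand new customers
    --    (both groups have no current row after step 1).
    INSERT INTO dbo.dim_customer
           (customer_code, customer_name, city, start_date, end_date, current_flag)
    SELECT s.customer_code, s.customer_name, s.city, GETDATE(), '9999-12-31', 'Y'
    FROM   stage.customer s
    LEFT JOIN dbo.dim_customer d
           ON d.customer_code = s.customer_code AND d.current_flag = 'Y'
    WHERE  d.customer_code IS NULL;

The point is not the SQL itself but that the tool writes and rewrites this kind of code for you whenever the dimension definition changes.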

This article is about WhereScape RED, so I’m going to go into a little bit more detail about how it is done.

At a high level, the steps are:

  1. Load source tables
  2. Create dimensions
  3. Create dimension views
  4. Create staging tables
  5. Create fact tables
  6. Create SSAS cubes
  7. Create MicroStrategy reports

Having used it myself (evaluation copy), I found it easy to use. A step-by-step tutorial with screenshots is provided, so it’s easy to do steps 1 to 6 above (I have not tried number 7 as it requires the MicroStrategy extension to be installed, which is not included in the evaluation copy).

This is how it looks:

Lots of operations are done by dragging and dropping and filling in properties. It’s user friendly, easy to use, and quite simple to understand (note: you need to understand data warehousing, of course).

You can use SQL Server, Oracle, DB2 or Teradata for your warehouse platform. These four are very popular platforms for data warehousing. The next release will add Netezza. No, RED does not do Informix, Sybase or MySQL data warehouses.

The question that some people have about RED is: “Is it only suitable for simple DWs?” No, it isn’t. Whether it’s a complex fact table or a complex dimension, we can use RED to build it. It can build aggregate fact tables, SCDs, hierarchies and KPIs. We can also modify the generated SQL scripts if we need to do any custom work, define indexes, and create SQL Agent jobs/schedules.
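
To make “aggregate fact table” and “define indexes” concrete, here is a minimal T-SQL sketch of the kind of code involved. All table, column and index names are hypothetical examples of mine.

    -- Sketch: rebuild a monthly aggregate of a detailed sales fact table.
    -- All object names are hypothetical.
    TRUNCATE TABLE dbo.fact_sales_monthly;

    INSERT INTO dbo.fact_sales_monthly (month_key, product_key, sales_amount, sales_qty)
    SELECT d.month_key,
           f.product_key,
           SUM(f.sales_amount),
           SUM(f.sales_qty)
    FROM   dbo.fact_sales f
    JOIN   dbo.dim_date   d ON d.date_key = f.date_key
    GROUP  BY d.month_key, f.product_key;

    -- A supporting index, the kind of thing you can also define in the tool.
    CREATE NONCLUSTERED INDEX ix_fact_sales_monthly_product
        ON dbo.fact_sales_monthly (product_key) INCLUDE (sales_amount);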

The other questions are about the platform: can it do BO? Informatica? Cognos reports? No, unfortunately. Any reporting tool such as BO/Cognos/RS can be used, but not automatically; you have to build all the reports manually. The ETL is SQL script (T-SQL on SQL Server, PL/SQL on Oracle) or SSIS packages. Yes, RED can generate SSIS packages to load the data from source to stage/load tables, but not from stage to fact/dim tables. It doesn’t generate code for any other ETL tool (Ab Initio, OWB, ODI, PowerCenter, etc.); only SSIS.

The main disadvantage of RED (this is what people often ask about when evaluating) is that it is ELT. It uses SQL Server T-SQL/Oracle PL/SQL to load data from the load tables into the DW/dimensional model (SSIS only loads to the stage/“load” tables). It doesn’t do ETL, only ELT. Celinio explains the difference between ETL and ELT (link). Rob Davenport from Insource/Data Academy provides an excellent white paper explaining the advantages and disadvantages of ETL and ELT at great length.
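
To make the ELT point concrete: the data already sits in a load table inside the warehouse database, and the transform into the dimensional model is a set-based SQL statement running on the database server. A minimal sketch in T-SQL, with all object names hypothetical:

    -- ELT sketch: transform from a load table into a fact table inside the database,
    -- looking up dimension surrogate keys as part of the same statement.
    -- All object names are hypothetical.
    INSERT INTO dbo.fact_sales (date_key, customer_key, product_key, sales_amount, sales_qty)
    SELECT dd.date_key,
           dc.customer_key,
           dp.product_key,
           l.sales_amount,
           l.sales_qty
    FROM   load.sales_transactions l
    JOIN   dbo.dim_date     dd ON dd.calendar_date = l.transaction_date
    JOIN   dbo.dim_customer dc ON dc.customer_code = l.customer_code
                              AND dc.current_flag  = 'Y'
    JOIN   dbo.dim_product  dp ON dp.product_code  = l.product_code;

In an ETL tool the same lookup and transformation work would happen on a separate server before the data is written to the fact table.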

ELT has its own advantages: it is flexible, easy to maintain programmatically*, simpler, and lower cost (it uses the DB server, not a separate server). *It is difficult to maintain manually because it’s a lot of SQL code. But of course once RED generates the SSIS package you can add all the transformations you like.

The second disadvantage is obviously the platform. It does not do Informix or Sybase data warehouses, nor Ingres, MySQL or Greenplum. RED will be able to do Netezza DWs in the next release.

The main advantage of RED is fast development time. It surely cuts a lot of development time. My estimate is that a warehouse which takes 3 months to build using custom development (say using SQL Server + SSIS or Oracle + Informatica) can be built in 1 month using RED. And after it is built, changing or expanding it is also quick.

The second advantage is ease of use. It’s a mature product that has been around for years, so it’s easy to use. I found that by following the tutorial I was able to use the product to build a DW within a day. That is a very short training time. Proper ETL tools such as SSIS/Informatica require weeks of basic training, plus weeks of on-the-job training (or you buy the skills at £400–£500 / day).

The third advantage is that once it’s built you can leave RED and continue manually, maintaining the generated tables and ELT scripts or SSIS packages yourself, expanding them as you need to, and building the reports using your tool of choice, e.g. RS, Cognos or BO.

The fourth advantage, and this is what I like best as I work quite a lot with cubes: RED generates the SSAS cube. RED also builds the MicroStrategy project, enabling excellent and very flexible reporting capability.

Conclusion

In my opinion WhereScape lives up to its reputation. We can use it to build a data warehouse in a much shorter amount of time. And this, in my experience, is what every BI consultant needs. Reports only take a month to build, but the DW takes 6 months* to build. If you can cut that down to 2 months it is a big advantage, money-wise and time/PM-wise too. *That’s a small one; a big one is 18 months. The biggest plus point is that we can expand it later on: further develop and customise the SQL scripts/SSIS packages, the SSAS cube, and the MicroStrategy projects for reporting. Unfortunately it does not generate Informatica or SSRS/BO/Cognos, some of the most popular tools in DWBI.

As always I’ll be glad to receive comments or corrections.
Vincent Rainardi 14/11/2011


11 Comments »

  1. Hi Vincent,
    Nice blog, thanks for taking the time to have a look at WhereScape RED. You are right when you say we will be supporting Netezza; I can also say that we are looking at supporting some of the other appliances such as Greenplum.
    We also need to get you a copy of the latest version for your screenshot! We have done a lot of work on the look and feel – it is still very much aimed at the developer, but we think it now looks a lot nicer.
    Regards,
    Michael Whitehead, Founder and CEO,
    WhereScape Software

    Comment by Michael Whitehead — 16 November 2011 @ 4:20 am | Reply

  2. Vincent,
    Kalido can in fact automatically create the BI metadata from the Kalido warehouse – Business Objects Universes, Cognos models and SSAS cubes are all built automatically via a wizard. You simply select the measures you want, select the rollup path (you may have more than one) and the as-of date and Kalido builds the BI metadata. Kalido also generates QlikView files for loading into memory and Excel reports. The page you linked to only discusses some of the net new capabilities in Kalido Information Engine version 9. I invite you to check out the rest of the site to learn more about Kalido.

    Comment by John Evans — 21 November 2011 @ 2:37 pm | Reply

  3. Nice post, Vincent. You may also want to take a look at LeapFrogBI.com. This is a revolutionary SaaS based product launched in early 2012 that streamlines data mart development. LeapFrogBI deploys to SQL Server and provides a platform that enables developers to speed development while also enabling unhindered customization abilities in native tools.

    Comment by Paul — 22 August 2012 @ 3:46 pm | Reply

  4. Can someone tell me how exactly it is beneficial compared to ETL tools, especially Informatica? As we know, we don’t need a separate ETL server since it is ELT, and we get automatic document generation. What else do we have in terms of cost/resource savings?

    Comment by Uma — 24 August 2014 @ 6:00 am | Reply

    • Hi Uma, RED (and other automation tools) mesh together what are normally separate development tasks into a single task, reducing the time and effort to deploy a project. For example, when you define a table in RED (the structure), at the same time you define the processing rules for loading it (source to target / transforms) and the descriptive information (documentation). When you click a button the table gets altered or created, the code to populate it gets generated, and the documentation gets updated. That is a productivity gain.
      It’s more than that, though: when you need to make a change, you regenerate all of those development tasks (structure, code, documentation) in the same step. And all projects need to incorporate change as the dev team and users react to the data / data quality exposed during development. This is a huge productivity gain.
      When using an ETL tool you often need specialists to get the most from it. They want requirements defined (source-to-target definitions) that they can work to, matched to the table(s) under development. If a change is required, the ETL team wants updated requirements documentation (to match the changing table or rule definitions). This change takes time and project documentation, so initial development and subsequent changes are much, much slower. Then you get to the deployment and user/support documentation.

      Comment by Doug — 8 September 2014 @ 4:40 am | Reply

  5. Reblogged this on xdatam.

    Comment by jamesx — 28 January 2015 @ 12:38 am | Reply

  6. I have used WhereScape RED in one of my projects and it’s quite good for medium-size data warehouse development. Fast and easy, but it’s not optimal for big, complex enterprise-level projects. Also, it cannot be taken as a complete solution for DW development and deployment.

    Regards,
    Gogoi

    Comment by gainintel — 9 July 2015 @ 8:32 am | Reply

    • I know that WhereScape software is definitely used for some very large data warehouses in terms of data and complexity; there is no upper limit in the software. However, using RED in those larger developments does need more project discipline and governance of standards from the beginning, otherwise the environment can get harder to manage.

      Comment by Oldboy — 9 July 2015 @ 5:36 pm | Reply

  7. Can someone provide useful links where we can learn WhereScape? I do not see many useful links for WhereScape tutorials. It would be very helpful if I could get one.

    Thanks for the help in advance.

    Comment by Nikita Kothari — 29 September 2016 @ 6:14 am | Reply

    • Contact info or sales at wherescape dot com to get access to online training.

      Comment by Mike — 29 September 2016 @ 8:30 am | Reply

