Data Platform and Data Science

3 September 2010

Standardising Entity and Column Names

Filed under: Analysis Services — Vincent Rainardi @ 12:48 pm

I was in a meeting this week about the standardisation of attribute names throughout the enterprise. I realised that one of the most difficult duties of a data architect (or data architecture team) is to maintain a standard for entity and column names. This includes, but is not limited to, the data warehouses and data marts.

This is a very difficult task that nobody is eager to do, because it is laborious and tedious, and because it involves a lot of business discussions, over and over. It is also close to impossible to implement: after you have standardised 500 column names, changing many different systems throughout the enterprise is a multi-year project, costing a few million, with no ROI. It would probably take only 2 minutes for the board to veto a project like that, mainly because of the ROI.

Because nobody wants to do it, this task usually falls on the shoulders of a newly formed body called the Data Governance team (or “council”). Some companies call it the data quality team, which is not really the same thing. It is not that difficult to pitch the importance of having a data governance team, especially in industries that are heavily dependent on data, such as financial services. Almost by definition, the members of this team should be quite senior. Amongst other things, its main duty is to define the formal terminology and business rules used by the business, and therefore used in IT. This team has authority over everything related to data. They understand the business meaning, create data standards, and enforce compliance with these standards.

Because of this “enforcing the compliance” business, you need very senior people to be on it. You also need people who are very sound in terms of business knowledge, aka the “Business Architect”. And you need senior people in the DG team because they will have to sell the idea of the “projects without ROI” I described earlier. They have to convince other members of the board that there is a huge benefit in enforcing naming and rule conformance across all systems.

For large enterprises, data governance is a must. It is not a luxury; it’s a necessity. International banking groups such as Citi, BOA and RBS, for example, definitely need data governance. If they don’t have it, they are in trouble. For small companies, this responsibility falls on the shoulders of the data architecture team. Their job is not just designing databases, creating ERDs, etc., but also maintaining naming standards.

Changing column names will certainly break the applications that access the renamed columns. These applications need to be changed. Impact analysis, however detailed, will not be 100% accurate. There will always be columns that we missed in the impact analysis. E.g. we think app1 accesses 100 columns, but actually it accesses 120 columns. That’s why the testing must be thorough.
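
To make the “100 vs 120 columns” gap a bit more concrete, here is a minimal sketch in Python of one way to cross-check an impact analysis: scan an application’s source text for known column names and list the ones that the documented analysis did not mention. Everything in it is an assumption for illustration only: the column names, the sample query, and the helper names `find_referenced_columns` and `impact_gap` are all made up, and a plain text scan like this will still miss dynamic SQL, `SELECT *`, views and ORM mappings, which is one more reason the testing has to be thorough.

```python
import re


def find_referenced_columns(source_text, known_columns):
    """Return the known column names that appear as whole words in source_text."""
    found = set()
    for column in known_columns:
        # \b word boundaries stop 'cust_id' from matching inside 'cust_id_old',
        # because the underscore counts as a word character.
        pattern = r"\b" + re.escape(column) + r"\b"
        if re.search(pattern, source_text, re.IGNORECASE):
            found.add(column)
    return found


def impact_gap(documented_columns, source_text, known_columns):
    """Columns the source text references but the impact analysis did not list."""
    actual = find_referenced_columns(source_text, known_columns)
    return sorted(actual - set(documented_columns))


# Hypothetical data: column names to be renamed, and what the impact
# analysis says app1 uses.
known_columns = {"cust_id", "cust_name", "acct_no", "trade_dt"}
documented_for_app1 = {"cust_id", "acct_no"}

# Hypothetical SQL found in app1's source code.
app1_source = (
    "SELECT cust_id, acct_no, trade_dt "
    "FROM dw.trades WHERE cust_id_old IS NULL"
)

missed = impact_gap(documented_for_app1, app1_source, known_columns)
print(missed)
```

In this made-up case the scan reports `trade_dt` as a column that app1 uses but the impact analysis missed. A real check would need to run over every code base, report definition and ETL job that touches the database, and even then it only narrows the gap rather than closing it.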

Yesterday was the fifth time I came across a “naming standardisation” initiative. And based on past experience, I’m a bit sceptical. If it’s only changing 1 or 2 apps, fine. If it’s only renaming the “front app”, e.g. BO / Cognos / RS reports, that’s fine. But if it’s 500 columns in 2 warehouses, 8 marts and 5 source systems, used by 22 apps, hmmm… is it worth it?

As usual I welcome any comments and discussion at vrainardi@gmail.com
