October | 2012 | Data Platform and Data Science

3 October 2012

SSIS: Automating DDL Changes

Filed under: SSIS — Vincent Rainardi @ 5:58 pm
Tags: SSIS

See Preeti’s comment of 3/10/12 on the article “SSIS: Importing a File with Dynamic Columns”:

Will elaborate my question ?I have columns dynamically changing on source side like adding,deleting which is SQL ,have my target as Oracle ,how to handle dynamically the source changes to be in sync with target (Oracle) structure before loading.I knw in any ETL process this has to be manually done like refreshing the source then updating the target table structure.Cannot this be automated?We have java transformation in ETL so cant we make use of it to automate the process ?If so then how?If not,are there any workarounds to do this?

This article addresses the question above.

—————————————————————

This can be done in theory, but in practice it is usually banned in production.

First, we obtain the new structure of the source table(s) by querying either the information schema or the system views/tables. For example, if your source is SQL Server, we can query the system catalog views like this:

-- Column metadata for one source table (max_length is in bytes)
select c.name as column_name, dt.name as data_type,
       c.max_length, c.precision, c.scale, c.is_nullable
from sys.columns c
join sys.tables t on c.object_id = t.object_id
join sys.schemas s on t.schema_id = s.schema_id
join sys.types dt on c.user_type_id = dt.user_type_id
where t.name = 'Table1' and s.name = 'Schema1'
order by c.column_id

We then obtain the structure of the target table(s) in the same way, by querying either the information schema or the system views/tables. For example, if your target is Oracle, we can query ALL_TAB_COLUMNS like this:

-- Oracle stores unquoted identifiers in upper case
SELECT COLUMN_NAME, DATA_TYPE, DATA_LENGTH, DATA_PRECISION, DATA_SCALE, NULLABLE
FROM ALL_TAB_COLUMNS
WHERE TABLE_NAME = 'TABLE1' AND OWNER = 'OWNER1'
ORDER BY COLUMN_ID

Both queries are run with an Execute SQL Task in SSIS, and each result set is stored in a table (you create the table).

In the example used here, SubType and Value3Y don’t exist in the target table, and Value5Y doesn’t exist in the source table. The precision of Value1Y and Value5Y is also different.
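To make the comparison step concrete, here is a minimal Python sketch of how the two stored column lists could be diffed. The `Column` tuple, the `diff_columns` function and the sample rows are all hypothetical: the column names follow the example above, but the data types and precisions are invented for illustration.

```python
from typing import Dict, List, NamedTuple, Optional


class Column(NamedTuple):
    name: str
    data_type: str
    precision: Optional[int] = None
    scale: Optional[int] = None


def diff_columns(source: List[Column], target: List[Column]) -> Dict[str, List[str]]:
    """Compare two column lists, matching names case-insensitively."""
    src = {c.name.upper(): c for c in source}
    tgt = {c.name.upper(): c for c in target}
    return {
        # In the source but not yet in the target.
        "missing_in_target": [src[n].name for n in src if n not in tgt],
        # In the target but no longer in the source.
        "missing_in_source": [tgt[n].name for n in tgt if n not in src],
        # Present on both sides, but precision or scale differs.
        "precision_mismatch": [
            src[n].name
            for n in src
            if n in tgt
            and (src[n].precision, src[n].scale) != (tgt[n].precision, tgt[n].scale)
        ],
    }


# Hypothetical metadata rows; types and precisions are invented.
source_cols = [
    Column("Id", "int", 10, 0),
    Column("SubType", "varchar"),
    Column("Value1Y", "decimal", 18, 4),
    Column("Value3Y", "decimal", 18, 4),
]
target_cols = [
    Column("ID", "NUMBER", 10, 0),
    Column("VALUE1Y", "NUMBER", 20, 4),
    Column("VALUE5Y", "NUMBER", 20, 4),
]

result = diff_columns(source_cols, target_cols)
print(result)
```

Names are matched case-insensitively because Oracle stores unquoted identifiers in upper case, while SQL Server keeps the case used when the table was created.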

You can compare the source and target and then use ALTER TABLE on the target to create the missing columns, but this is risky and complex, and it causes headaches. A simpler, more robust solution is to recreate the target table.
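As a rough illustration of the "recreate the target table" option, the sketch below builds the DROP and CREATE statements from a list of columns. The function name `build_recreate_ddl` and the sample columns are hypothetical, and the sketch assumes the type strings have already been translated to Oracle types and that the names are trusted (nothing is quoted or validated).

```python
from typing import List, Tuple


def build_recreate_ddl(
    owner: str, table: str, columns: List[Tuple[str, str, bool]]
) -> List[str]:
    """Return DROP and CREATE statements for an Oracle target table.

    Each column is (name, oracle_type, nullable). The type string is
    assumed to be an Oracle type already; identifiers are upper-cased
    but not validated.
    """
    if not columns:
        raise ValueError("refusing to recreate a table with no columns")
    col_lines = [
        f"  {name.upper()} {oracle_type}{'' if nullable else ' NOT NULL'}"
        for name, oracle_type, nullable in columns
    ]
    qualified = f"{owner.upper()}.{table.upper()}"
    return [
        f"DROP TABLE {qualified}",
        f"CREATE TABLE {qualified} (\n" + ",\n".join(col_lines) + "\n)",
    ]


# Hypothetical columns, for illustration only.
statements = build_recreate_ddl(
    "OWNER1",
    "TABLE1",
    [
        ("Id", "NUMBER(10)", False),
        ("SubType", "VARCHAR2(20)", True),
        ("Value1Y", "NUMBER(18,4)", True),
    ],
)
for stmt in statements:
    print(stmt)
```

Dropping the table discards its existing rows along with its indexes and grants, so this sketch only suits a target that is fully reloaded on every run.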

If the source and target use different RDBMSs, you will need to make a “data type conversion table”, e.g. varchar in SQL Server becomes varchar2 in Oracle.
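A data type conversion table can be as simple as a lookup keyed on the source type name. The sketch below is one possible shape, not a definitive mapping. Only varchar to VARCHAR2 comes from this article; the other entries, and the names `TYPE_MAP` and `to_oracle_type`, are assumptions that should be reviewed for each project.

```python
from typing import Optional

# Hypothetical conversion table: SQL Server type -> Oracle type template.
# Only varchar -> VARCHAR2 is taken from the article; the rest are
# common choices, not an authoritative mapping.
TYPE_MAP = {
    "varchar": "VARCHAR2({length})",
    "char": "CHAR({length})",
    "int": "NUMBER(10)",
    "bigint": "NUMBER(19)",
    "smallint": "NUMBER(5)",
    "bit": "NUMBER(1)",
    "decimal": "NUMBER({precision},{scale})",
    "numeric": "NUMBER({precision},{scale})",
    "datetime": "TIMESTAMP(3)",
    "date": "DATE",
}


def to_oracle_type(
    sql_server_type: str,
    length: Optional[int] = None,
    precision: Optional[int] = None,
    scale: Optional[int] = None,
) -> str:
    """Translate one SQL Server column type into an Oracle type string."""
    key = sql_server_type.lower()
    if key not in TYPE_MAP:
        # Fail loudly rather than guess a type for an unmapped column.
        raise ValueError(f"no Oracle mapping defined for {sql_server_type!r}")
    template = TYPE_MAP[key]
    # sys.columns reports max_length = -1 for varchar(max); this sketch
    # does not map that case and does not check Oracle's length limits.
    if "{length}" in template and (length is None or length < 1):
        raise ValueError(f"{sql_server_type} needs a positive length")
    if "{precision}" in template and (precision is None or scale is None):
        raise ValueError(f"{sql_server_type} needs precision and scale")
    return template.format(length=length, precision=precision, scale=scale)


print(to_oracle_type("varchar", length=50))
print(to_oracle_type("decimal", precision=18, scale=4))
```

Raising an error for an unmapped type is a deliberate choice here: a load that stops is easier to support than a target column silently created with the wrong type.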

But as I said, this is usually banned in production because the Prod Support Manager can’t risk the instability the approach causes.