Archives of the TeradataForum
Message Posted: Thu, 18 Jan 2007 @ 11:27:22 GMT
Subj: | | Re: Multiload Upsert Performance. |
|
From: | | Victor Sokovin |
| I have a general question about Multiload Upsert Performance. | |
| We are considering utilizing Multiload Upserts for the load strategy into many tables in a new subject area of the data warehouse. | |
| We only consider this due to a reduction in complexity for Audits and Control processes. Basically to cheat on the development time up
front In order to shorten our time to production. | |
| My reservation is performance degradation over time as the tables grow (some will grow rapidly, 100's of thousands of records per
day). | |
Eric, I am under the impression that in your question you include both Atomic (our hero Atomic, as discussed during the festive period,
performs two actions in one single pass through the table) and more prosaic but still useful cases of UPDATE and INSERT proper.
The thing to consider if you intend to use the WHERE clause in UPDATE and UPSERT is that you have to specify all the PI columns in it. This
requirement limits your options (if you intend to make the strategy described your sole ETL strategy, definitely spend some time on its
feasibility study!) but ensures the fastest possible access to the data so performance degradation should be exactly in line with the overall
performance degradation expected for the fast-growing tables.
As far as the performance itself (not the rate of its deterioration) is concerned I am not sure the bespoke strategy will be the champion under
all conditions. More traditional DML operations based on staging tables are serious competitors. But if your prime concern is the uniformness of
ETL processes and the ease of their auditing and the logic of your ETL processes suits the strategy then that is not an issue.
Just arrange for some testing. It should not be difficult to "age" the table very fast by repeating the loads and have an estimate of your ETL
times for the next couple of years.
Regards,
Victor
|