Archives of the TeradataForum
Message Posted: Mon, 28 Mar 2005 @ 20:51:35 GMT
Subj: | | Re: Purging Data from tables |
|
From: | | Howard Bradley |
Christian wrote :
| "we purge about 30.000.000 rows (one week data) once a week out of a 15 month transaction table (about 2 Billion rows)(CDRs). | |
| We use Multiload delete and the where clause is "where start_date <= date'YYYY-MM-DD'. start_date IS NOT the PI." | |
I have been following this thread with interest hoping one of the many suggestions would help us with a particular scenario that we have. From
the statement above I am guessing that the weeks worth of data removed from Christian's table is the oldest week, thus maintaining a rolling 15
month transaction table.
We have a "large" table which contains 36 months worth of data and daily we need to DELETE and re-INSERT the MOST RECENT 3 months. The approach
we use is to populate an empty temp table with the "up to date" 3 months worth of data, then to populate a second empty table containing a full 36
months worth of data using a multi-statement insert from the "3 month" temp table combined with the previous 33 months worth of data from the
production table. We then do a full delete of the production table and re populate from the second temp table. This process avoids any use of the
transient journal and has served us well (luckily we have the permspace to use this approach and just about have the batch window).
Now, we recently introduced PPI (months) on the date column of the table and, as well as providing great benefits on the query side of things,
we were hoping this would cut down significantly on the batch process described above, as we hoped to be able to DELETE (drop) the most recent 3
months partitions directly from the production table then re-create these with the up to date data. Our testing has shown that we can pretty much
avoid transient journal use during the DELETE phase but when we Re-Insert the 3 recent months data the transient journal is used with the expected
degradation in performance and the risk that a rollback is initiated. We were hoping that, despite the fact that the table already contains data,
the system would recognise that the 3 latest partitions were empty and therefore wouldn't journal during the Insert.
Does anyone know if this is feasible / possible or indeed planned for future releases. We are currently on V2R5.0
Many thanks
Howard
|