
Archives of the TeradataForum

Message Posted: Mon, 12 Feb 2001 @ 11:45:41 GMT


Subj:   Re: Data Warehouse's Data
From:   David Wellman

Hi Amer,

Big questions! To try to put some ideas in place around this lot, I'd start with the following thoughts (no doubt others will disagree, based on their own experiences).

1. How long to keep data? This is driven primarily by the business needs and use of the data. If the business can derive value from data that's 20 years old (and that value pays the cost of keeping the data for 20 years!) then keep the data for 20 years. This, however, seems a little unlikely. Most customers that I work with keep their data for 2-3 years, and at the end of that period they simply purge it (i.e. delete it such that they cannot retrieve it).

Some customers purge the detail data at the end of its life, but keep a summarised version of it available. Yes, they know that they can't recreate the detail, but that policy meets their business requirements.
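As a rough illustration of the "summarise, then purge" policy, here is a minimal Python sketch. It uses SQLite purely as a stand-in database, and the table and column names (call_detail, call_summary, call_date, duration_seconds) are hypothetical, not taken from any real site. The point is only that the roll-up and the delete run in one transaction, so the detail is never removed without its summary being written.

```python
import sqlite3


def summarise_then_purge(conn, cutoff_date):
    """Roll detail rows older than cutoff_date into a daily summary, then delete them.

    cutoff_date is an ISO 'YYYY-MM-DD' string; ISO dates compare correctly as text.
    Table and column names are hypothetical.
    """
    with conn:  # one transaction: both statements commit together or roll back
        conn.execute(
            """
            INSERT INTO call_summary (call_date, call_count, total_seconds)
            SELECT call_date, COUNT(*), SUM(duration_seconds)
            FROM call_detail
            WHERE call_date < ?
            GROUP BY call_date
            """,
            (cutoff_date,),
        )
        deleted = conn.execute(
            "DELETE FROM call_detail WHERE call_date < ?", (cutoff_date,)
        ).rowcount
    return deleted
```

Once this has run, the detail older than the cutoff is gone for good and only the daily totals remain, which is exactly the trade-off described above.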

Always remember that the retention policy from a business perspective is likely to change and so any technical solution should be able to change. Given a sensible design, changing the retention period should mean just changing a value in a lookup/control table or file. Nothing more complex than that.
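To make the control-table idea concrete, here is a minimal Python sketch, again using SQLite only as a stand-in. The retention_control table and its columns (table_name, date_column, retention_days) are assumptions made for illustration. The purge routine itself contains no retention period; it reads one per table from the control table each time it runs.

```python
import sqlite3
from datetime import date, timedelta


def purge_expired(conn, today):
    """Delete rows older than each table's retention period.

    Retention periods are read from a (hypothetical) retention_control table,
    so changing the policy means updating one row, not changing this code.
    """
    purged = {}
    rules = conn.execute(
        "SELECT table_name, date_column, retention_days FROM retention_control"
    ).fetchall()
    with conn:  # commit all purges together, or roll back on error
        for table_name, date_column, retention_days in rules:
            cutoff = (today - timedelta(days=retention_days)).isoformat()
            # Identifiers come from our own control table, not from user input.
            cur = conn.execute(
                f'DELETE FROM "{table_name}" WHERE "{date_column}" < ?',
                (cutoff,),
            )
            purged[table_name] = cur.rowcount
    return purged
```

Moving from a two-year to a three-year policy would then be a single UPDATE of retention_days in the control table, with the purge job left untouched.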

2. I've never heard of any customer who "never deleted data" from their warehouse - nice from a salesman's point of view if you find one!

3. Refreshing data completely from the source system. This tends to depend on the type of data, the data volume and (again) the business requirements. It's unlikely that you'll refresh transaction data (CDRs or whatever in your case): the volume is too high to reload it all every time in the window allowed. Also, this type of data is the core data for which you want to maintain and build up history - that's where a lot of the value of a data warehouse comes from.

Small reference data is one data type that could possibly be completely refreshed every time (I know at least one site that does this). But think carefully about doing this. If you're keeping your transaction data for 3 years and users are running queries which join the transaction data to the reference data, then you possibly want to keep your reference data for 3 years as well. This probably isn't a data volume issue, as this stuff tends to be relatively low volume, but it will affect how you design and write the loading scripts.
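One common way to keep reference data history instead of overwriting it is to put validity dates on each reference row. The Python sketch below (SQLite as a stand-in; the tariff_ref and call_detail tables and their columns are hypothetical) shows the effect on a loading script: a changed row closes the old version and inserts a new one, and queries join each transaction to the version that was valid on the transaction date. This is one possible design, not necessarily what any particular site does.

```python
import sqlite3


def load_reference_row(conn, code, description, load_date):
    """Load one reference row, keeping history (hypothetical tariff_ref table).

    If the description changed, close the current version and insert a new
    one valid from load_date. Returns True if a new version was written.
    """
    with conn:
        current = conn.execute(
            "SELECT description FROM tariff_ref "
            "WHERE tariff_code = ? AND valid_to IS NULL",
            (code,),
        ).fetchone()
        if current is not None and current[0] == description:
            return False  # unchanged: nothing to do
        conn.execute(
            "UPDATE tariff_ref SET valid_to = ? "
            "WHERE tariff_code = ? AND valid_to IS NULL",
            (load_date, code),
        )
        conn.execute(
            "INSERT INTO tariff_ref (tariff_code, description, valid_from, valid_to) "
            "VALUES (?, ?, ?, NULL)",
            (code, description, load_date),
        )
    return True


# Join each transaction to the reference version valid on its own date.
AS_OF_JOIN = """
SELECT d.call_date, r.description
FROM call_detail d
JOIN tariff_ref r
  ON r.tariff_code = d.tariff_code
 AND r.valid_from <= d.call_date
 AND (r.valid_to IS NULL OR d.call_date < r.valid_to)
ORDER BY d.call_date
"""
```

With a full refresh, by contrast, a query over three-year-old transactions would see only today's descriptions, which may or may not be what the business wants.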

I hope that's useful info.



Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023