
Archives of the TeradataForum

Message Posted: Fri, 07 Sep 2001 @ 10:33:51 GMT


Subj:   Re: Data in DWH --- Issues
From:   David Wellman

Hi Amer,

Who comes up with a particular solution is again largely site/application dependent. It is probably in your vendor's best interest to help you with this subject, even if what they do is to provide an architecture for solving data quality issues (it is unlikely that they will know your source systems well enough to be able to give a detailed design to solve it).

I'm not sure about your final comment "And at the end if DWH doesn't show result then BLAME can't be on applications". If I've understood this statement correctly, then I don't agree.

The DWH can only provide results based on the data that it contains. If the data is 'dirty' then that may affect the results produced by queries -- joins may fail, GROUP BY clauses may not combine the detail rows correctly, and so on. The cause of the dirty data is typically the source system. Once you have acknowledged that you have dirty data to deal with, a decision has to be taken as to how to handle it. Until it is handled (i.e. cleaned), your DWH will never produce the quality or accuracy of results that you are expecting from it.
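To make the join and GROUP BY point concrete, here is a minimal sketch in Python rather than SQL. The rows, the customer codes and the `clean` rule are all made up for illustration; real cleaning rules depend entirely on your source systems.

```python
from collections import defaultdict

# Hypothetical detail rows as they might arrive from a source system:
# the same customer appears with inconsistent case and a trailing space.
sales = [
    ("ACME", 100),
    ("acme ", 50),
    ("Acme", 25),
    ("BOLT", 10),
]

# Hypothetical reference table keyed on the clean customer code.
customers = {"ACME": "Acme Corp", "BOLT": "Bolt Ltd"}


def clean(code):
    """Assumed cleaning rule: trim whitespace and upper-case the code."""
    return code.strip().upper()


def total_by_customer(rows, normalise=None):
    """Sum amounts per customer code, like a GROUP BY on the code column."""
    totals = defaultdict(int)
    for code, amount in rows:
        key = normalise(code) if normalise else code
        totals[key] += amount
    return dict(totals)


def unmatched(rows, reference, normalise=None):
    """Return the codes that find no partner in the reference table (a failed join)."""
    result = []
    for code, _ in rows:
        key = normalise(code) if normalise else code
        if key not in reference:
            result.append(code)
    return result


dirty_totals = total_by_customer(sales)          # one customer split over three groups
clean_totals = total_by_customer(sales, clean)   # one group per real customer
dirty_unmatched = unmatched(sales, customers)    # rows an inner join would drop
clean_unmatched = unmatched(sales, customers, clean)

print(dirty_totals)
print(clean_totals)
print(dirty_unmatched)
print(clean_unmatched)
```

On the uncleaned rows the one customer is reported as three separate groups and two of its rows find no match in the reference table; after the assumed cleaning rule is applied, both problems go away. Neither query is "wrong" in either case, which is why the results can only be as good as the data.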

Whether the fault is with the DWH or the source system(s) is (I think) largely irrelevant. What matters is that it gets fixed, and before that can happen someone ('management') needs to take the decision to do it. That requires someone ('management') to commit to spending some money. If you think that your vendor should be doing this then (unfortunately) you have to go back to the contract. According to your contract:

- who is responsible for ensuring that the DWH data is clean?

- who is responsible for cleaning dirty data?

If the contract doesn't contain anything about responsibility for cleaning or transforming data (the two are subtly different but very similar) then that's a gap in the contract. Anyone who claims to be experienced in building a DWH should know that there are ALWAYS data issues, however clean the source system data is meant to be, and that this area takes the largest amount of man-time to sort out.

I hope that is useful.



Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023