Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Fri, 04 Jan 2008 @ 22:29:54 GMT


     
  <Prev Next>   <<First <Prev
Next>
Last>>
 


Subj:   Re: Model Timestamp or (DATE and TIME)
 
From:   Victor Sokovin

  I need to capture timestamp information - call this column x.  


  Should I model it as TIMESTAMP(n) or use 2 columns : DATE and TIME(n)  


  What are the pros and cons?  


This question and and the rest of the posting is a very good set of questions.

I think it would be fair to say that they are constantly chasing all of us when we design databases and the answers tend to vary according to the type of database we design and to the time we live in and design those databases. There were times when timestamps did not exist as data types and there were times when partitioning did not exist and then partitioning was introduces but it was limited, then it became less limited etc etc etc. So, I am afraid I won't be able to provide definitive answers but I can resist leaving a few comments.


  Here are some usage considerations :  


  Only the date part, x (date), may be used in joins between tables.  


This consideration does not seem very important to me. Why? Because I don't see many joins of this type for *large* fact tables (the topic is not really interesting for small tables, right?). Why would we join two large fact tables on the date part, to create nasty Cartesian joins? If you join a fact table to some small reference table (to see whether the date is the third Tuesday of the month or something) then I would not bother and I would not let this type of joins dictate the database design considerations.


  This column will be used in WHERE clauses. The WHERE clause can be in either format  


  1) WHERE x (date) between 2 date literals  


  2) WHERE x (timestamp) between 2 timestamp literals  


This I will probably pass on for now but may return later. Maybe my next comment will take care about this one anyway.


  This column will also be used in PPI (with intervals of 1 day or 1 month).  


I suggest you invest more time in the analysis of this one. To save me time I will just quote a very good thread on this subject:

www.teradataforum.com/teradata/20060221_191231.htm

I think it explains very well why you should not use timestamps in PI (and PPI).

PI is the best method of accessing data in Teradata so PI should be designed in such a way that it fetches the data for the most used business queries. There are always exceptions of course but in general queries with specific timestamps in WHERE clauses or join conditions are not the most valuable queries.

Despite the simplicity of the arguments in the thread I will add that I had a few chances to come across professional NCR consultants (not to mention many other groups of non-NCR consultants, professional or not) recommending the use of timestamps or even randomly generated urrogate keys in PI. Their argument was that timestamps will provide perfect distribution of data and everything will fly. Usually it is indeed the case in the beginning of the database but when the database starts growing it becomes evident that a perfect data distribution does not make a perfect database. By that time consultants are usually gone to their next client so they never have a chance to learn.

I'll just add that I personally don't like dates in PI either, especially when there is an option of partitioning on the date, because there is usually a more important combination of business keys. Not many businesses are about dates, they are about customers, products, orders, cash flows etc.


  Is there any case when TIMESTAMP(n) will perform better than DATE+TIME(n)? I see none. Irrespective of usage patterns, I am leaning towards always modeling it as DATE+TIME(n), and in the view I would join these 2 to expose as TIMESTAMP(n) (in addition to DATE and TIME(0)).  


My concern with TIME(n) is that a TIME(n)-column by itself is useless. What can you do with "times" alone? They are always connected to their dates? Whenever you calculate intervals or compare records you'll have to use both columns instead of one. Is this convenient for users? I don't think so, and users are always right. Views you described can help but one thing about views I learned is that views will always let you down when you most need them. I would never rely on views in the foundations of the database design.

I think Oracle never even offered TIME(n). Is this correct? I don't want to go check this now. Hopefully, memory serves me on this occasion. I am pretty sure that I've never seen TIME(n) columns in any Oracle database I ever used. And Oracle never seemed to have any problems without that data type. Something to think about.


Regards,

Victor



     
  <Prev Next>   <<First <Prev
Next>
Last>>
 
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 27 Dec 2016