Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Thu, 09 Oct 2003 @ 20:29:30 GMT


     
  <Prev Next>   <<First <Prev
Next>
Last>>
 


Subj:   Re: Guru needed: OLAP function quandary
 
From:   Dieter Noeth

Jeremy Christiansen wrote:

  I'm having problems with OLAP functions again. I've written to the list before about this issue but never got anywhere. It seems that no one really knows what goes on inside Teradata when it comes to exactly how OLAP functions accomplish their tasks. Maybe a Teradata guru out there could provide some insight. (Hint, Hint.) Here is the problem:  


  OLAP FUNCTIONS USE WAY TOO MUCH SPOOL!  



I'm not a guru and i don't know about the internal processing of OLAP functions, but here's what i guess (based on lots of tests):

There's always two spools within a STAT FUNCTION step (I think this is somehow similar to the way FastExport is ordering a result set using two vertical/horizontal redistributions).

In the first spool the data is sorted according to partition/order and during the second distribution the values are calculated.

So for two spools you'll need twice the number of bytes (15GB volatile table plus 2 * 15GB spool = 45).

The columns used for partition/order are probably kept in spool similar to Teradata's tag sort, so again you need more spool.


  Select a.*, csum(1,primary_index,more_fields)  


will result in a spool row like that:

a.*, 1, primary_index,more_fields

So if you have all columns in your partition/order this will result in a spool row twice as big as before. And there's probably some bytes overhead per spool row.

In worst case this is close to 5 times the original size carried in spool.

Just my two Euro cents...


Dieter



     
  <Prev Next>   <<First <Prev
Next>
Last>>
 
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023