Home Page for the TeradataForum
 

Archives of the TeradataForum

Message Posted: Sun, 28 Apr 2003 @ 03:51:08 GMT


     
  <Prev Next>   <<First <Prev
Next>
Last>>
 


Subj:   Re: Select * from mytable
 
From:   Walter, Todd A

Rank and Sample were both added to Teradata in V2R3.

In V2R4.0, both got performance enhancements.

SAMPLE got special support in the file system to allow sampling at the file system level rather than actually doing a database read of every row. While TD still has to access a fair portion of the table to get a statistically respectable sample, it does so as efficiently as possible. How much of the source has to be read depends on the size of the sample of course.

RANK with Qualify got a significant performance enhancement. When RANK is done with a less than or greater than Qualify against the rank (first N or last N), TD throws away entries that cannot make the N qualification as early as possible. While the source still has to be created (joins, aggregations,..) and then scanned to find the right rows, the pass over that source cuts the size as the pass is executed, minimizing the spool and processing of the ranking operation.

We have heard from many of you that there is a need for a quick facility to get N rows from a table. This facility has been put on the priority list for implementation.

As currently envisioned, this implementation would favor efficiency and performance over repeatability. Ie we would not be providing an ordered return or guaranteeing that repeating the request would get the same N rows. The most frequent request we have is for a very efficient "browse" function.

I note in some of these threads that people suggest that they wish to use a "first N" facility and then return for "next N" with another query. As currently envisioned, the implementation would not address that requirement because that would require repeatability of the entire ordering of the source set, a far more resource intensive operation. If the desire is to retrieve N at a time, it is better to run the whole query once, then process N at a time out of the result; much better on the database execution side as well.

Feedback on this topic welcome before this implementation becomes concrete...



     
  <Prev Next>   <<First <Prev
Next>
Last>>
 
 
 
 
 
 
 
 
 
  
  Top Home Privacy Feedback  
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023