Archives of the TeradataForum
Message Posted: Tue, 24 Feb 2004 @ 13:56:30 GMT
The difference between the SELECT and the INSERT/SELECT on the 10 million rows is that, after the SELECT, those rows are redistributed across the AMPs based on the primary index of the target table - hence the 10 hours!
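To confirm that the redistribution is really where the time goes, you could run an EXPLAIN on the INSERT/SELECT and look at the step that feeds the insert. This is only a minimal sketch: target_tbl, source_tbl, pi_col and col_a are hypothetical names, not taken from the original question.

```sql
-- Hypothetical table and column names; substitute your own.
-- EXPLAIN shows the plan without running the statement.
EXPLAIN
INSERT INTO target_tbl (pi_col, col_a)
SELECT pi_col, col_a
FROM   source_tbl;
```

If the plan text says the spool is redistributed by hash code to all AMPs (the exact wording varies by release), the rows are being moved between AMPs before the insert.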
You could try the following, in case any of it helps.
1. Eliminate duplicate rows explicitly by using DISTINCT in the SELECT - otherwise the duplicates will be removed implicitly during the insert, since you have a SET table.
2. Since the SELECT takes 10 minutes, you could EXPORT these rows (again, DISTINCT rows) to a file and then IMPORT and INSERT from that file into your target table. That way the input rows are hash-distributed directly to their target AMPs, which may avoid the inter-AMP traffic caused by hash redistribution.
3. Rewrite the query, if you can, so that the redistribution does not have to move any rows. When you perform the SELECT, a spool is built on all AMPs (or on a group of AMPs). If the rows in that spool are already hashed on the same value as the PRIMARY INDEX of the target table, no rows need to move between AMPs, because the redistribution hash would point to the same AMP that already holds the spool rows. Hope that is not confusing.
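Points 1 and 3 could be sketched roughly as follows. The names target_tbl, source_tbl, pi_col and col_a are hypothetical, and the sketch assumes that pi_col is the primary index of the target table.

```sql
-- Point 1: remove duplicates explicitly in the SELECT rather than
-- leaving it to the SET table's duplicate-row check.
INSERT INTO target_tbl (pi_col, col_a)
SELECT DISTINCT pi_col, col_a
FROM   source_tbl;

-- Point 3: see which AMP each row would hash to under the target's
-- primary index (assumed here to be pi_col).
SELECT HASHAMP(HASHBUCKET(HASHROW(pi_col))) AS amp_no,
       COUNT(*)                             AS row_cnt
FROM   source_tbl
GROUP  BY 1
ORDER  BY 2 DESC;
```

If the rows feeding the insert are already hashed on pi_col (for example, because the source table has the same primary index as the target), they should already sit on the AMPs reported by the second query, and the insert can stay AMP-local.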
|Last Modified: 23 Jun 2019|