Archives of the TeradataForum
Message Posted: Fri, 11 Apr 2003 @ 20:15:14 GMT
Two examples of Parallelism and Sequence:
1. Without Parallelism
Think of a massive process (one that takes 2 hours to run and runs daily).
Whenever the process begins, it writes an entry to the log table. The Teradata system assigns a unique sequence number to identify this run. At the end of the process, the log table entry is updated with the relevant statistics and metadata. In this case, there is only one entry in the log table for the whole process each day. There can be a child table to the main log table that keeps track of the components or sub-processes. The child table uses the same sequence number as the main log table. The log tables can be used to observe the trend and behaviour of the process and the system over time. In this case, parallelism happens within the process and not for the log entries.
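The logging pattern above can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Teradata, with an AUTOINCREMENT key playing the role of the system-assigned sequence number. The table names (process_log, process_step_log), the column names and the helper functions are all made up for the example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Teradata (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE process_log (              -- hypothetical main log table
    run_id         INTEGER PRIMARY KEY AUTOINCREMENT,  -- system-assigned sequence
    process_name   TEXT NOT NULL,
    started_at     TEXT NOT NULL,
    ended_at       TEXT,
    rows_processed INTEGER
);
CREATE TABLE process_step_log (         -- hypothetical child table
    run_id         INTEGER NOT NULL REFERENCES process_log(run_id),
    step_name      TEXT NOT NULL,
    rows_processed INTEGER
);
""")

def start_run(name, started_at):
    """Insert the single log entry for this run and return its sequence number."""
    cur = conn.execute(
        "INSERT INTO process_log (process_name, started_at) VALUES (?, ?)",
        (name, started_at),
    )
    return cur.lastrowid

def log_step(run_id, step_name, rows):
    """Record a sub-process under the same sequence number as the parent run."""
    conn.execute(
        "INSERT INTO process_step_log (run_id, step_name, rows_processed) "
        "VALUES (?, ?, ?)",
        (run_id, step_name, rows),
    )

def end_run(run_id, ended_at):
    """Update the existing log entry with end time and summary statistics."""
    total = conn.execute(
        "SELECT COALESCE(SUM(rows_processed), 0) FROM process_step_log "
        "WHERE run_id = ?",
        (run_id,),
    ).fetchone()[0]
    conn.execute(
        "UPDATE process_log SET ended_at = ?, rows_processed = ? WHERE run_id = ?",
        (ended_at, total, run_id),
    )

# One daily run: one row in the main log, several rows in the child table.
run_id = start_run("daily_load", "2003-04-11 02:00")
log_step(run_id, "extract", 1000)
log_step(run_id, "transform", 950)
end_run(run_id, "2003-04-11 04:00")

log_rows = conn.execute(
    "SELECT run_id, process_name, rows_processed FROM process_log"
).fetchall()
step_rows = conn.execute(
    "SELECT run_id, step_name FROM process_step_log"
).fetchall()
```

In Teradata itself the sequence would more likely come from an identity column than from AUTOINCREMENT; the point of the sketch is only that the main log gets one row per run and the child rows reuse its number.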
2. With Parallelism
Think of a massive file with a billion records and many fields as a source feed. Let us assume that our data model requires the file to be split vertically (field-wise) into three or four tables. Let us also assume that there is no inherent relationship within the data that would let us split it up and later join it again to get the original layout. In this case, we can load the file into a staging table and Teradata will assign a unique sequence number to each record (the staging table will have an additional column for the sequence number, which will be populated by Teradata). We can then easily split the staging table vertically and put the rows into the three or four tables, carrying the sequence number along so the pieces can be joined back together. Of course, we have the overhead of the sequence number column in each of the three or four tables. But here, parallelism works along with the sequence number.
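A small sketch of the staging-and-split idea, again using Python's sqlite3 module as a stand-in for Teradata. The table names (staging, t_name, t_city, t_amount), the columns and the sample feed are invented for illustration. The feed deliberately contains two identical records to show that nothing in the data itself could serve as a join key.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Teradata (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staging (                  -- hypothetical staging table
    seq_no    INTEGER PRIMARY KEY AUTOINCREMENT,  -- populated by the database
    cust_name TEXT,
    city      TEXT,
    amount    REAL
);
-- Hypothetical target tables: each keeps the sequence number plus its own fields.
CREATE TABLE t_name   (seq_no INTEGER PRIMARY KEY, cust_name TEXT);
CREATE TABLE t_city   (seq_no INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE t_amount (seq_no INTEGER PRIMARY KEY, amount REAL);
""")

# Sample feed; the first and third records are identical, so no natural key exists.
feed = [
    ("Ann", "Pune", 10.5),
    ("Bob", "Oslo", 20.0),
    ("Ann", "Pune", 10.5),
]

# Load the feed; the sequence number column is filled in automatically.
conn.executemany(
    "INSERT INTO staging (cust_name, city, amount) VALUES (?, ?, ?)", feed
)

# Vertical split: every target table carries seq_no along with its fields.
conn.execute("INSERT INTO t_name   SELECT seq_no, cust_name FROM staging")
conn.execute("INSERT INTO t_city   SELECT seq_no, city      FROM staging")
conn.execute("INSERT INTO t_amount SELECT seq_no, amount    FROM staging")

# Join the pieces back on seq_no to recover the original records.
rebuilt = conn.execute("""
    SELECT n.cust_name, c.city, a.amount
    FROM t_name n
    JOIN t_city c   ON c.seq_no = n.seq_no
    JOIN t_amount a ON a.seq_no = n.seq_no
    ORDER BY n.seq_no
""").fetchall()
```

In Teradata the seq_no column would probably be an identity column. As far as I know, identity values there are guaranteed unique but not necessarily consecutive or in load order, which is enough for this purpose since the number is only used as a join key.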
Last Modified: 27 Dec 2016