Re: SQL to Normalize Columns to Rows [From Stegelmann, Rolf: Thu, 08 Jun 2006 @ 09:47 GMT]

Message Posted: Thu, 08 Jun 2006 @ 09:47:10 GMT

Subj:

Re: SQL to Normalize Columns to Rows

From:

Stegelmann, Rolf

If are running on 6.0 you can write a table function which is a special case of a UDF that produces a table on output. It would go something like this:

CREATE FUNCTION expand(INTEGER, DECIMAL(9,2), DECIMAL(9,2),...) RETURNS TABLE (Customer_Number INTEGER, Type_ID SMALLINT, Sales_Amount DECIMAL(9,2) ) LANGUAGE C NO SQL PARAMETER STYLE SQL EXTERNAL...;

A UDF can only have 128 input parameters so you can only expand part of the columns for each pass. Assuming the UDF was set up to handle 125 columns at a time the query would look as follows:

INSERT Customer_Number, Type_ID, Sales_Amount INTO T3 SELECT Customer_Number, Type_ID, Sales_Amount FROM TABLE( expand(t1.Customer_number, t1.col001,... t1.col125)) AS T2 ;INSERT Customer_Number, Type_ID, Sales_Amount INTO T3 SELECT Customer_Number, Type_ID, Sales_Amount FROM TABLE( expand(t1.Customer_number, t1.col126, ... t1.col250)) AS T2;

By making it a multi-statement request if T3 is empty to begin with it will only generate one transient journal row.

The UDF itself is presented with the set of columns for each row. It would test each input parameter for NULL value and if not NULL it would build the required output row (fill out column parameters). It would do this for each "column" parameter. The whole process is a little too involving to explain here in detail but basically it goes something like this:

void expand (INTEGER *In_Customer_Number, DECIMAL2 *Col1, .... ) { switch (phase) case init: i = 0; /* save in table function scratch mem */ case build: for (j = i; j

The Table UDF is called repeatedly by the database for each output row you build. Each time the same input data for the current input row is presented until the UDF sets the EOF return. At that point the database will call the UDF again with the next input row. In this case the init logic is executed again. The database logic will always specify the current phase on each UDF iteration.

Of course you need to know how to program in C. Once the UDF is debugged run it in non-protected mode and it should be very fast.

This is all done in parallel where a copy of the UDF is run on all AMPs.


	<Prev	Next>		<<First	<Prev	Next>	Last>>

Attachments

Library

Quick Reference

Archives of the TeradataForum

Message Posted: Thu, 08 Jun 2006 @ 09:47:10 GMT