|
|
Archives of the TeradataForum
Message Posted: Fri, 25 Apr 2003 @ 12:13:24 GMT
Subj: | | Idea for the implementation of a view needed |
|
From: | | Anomy Anom |
<-- Anonymously Posted: Friday, April 25, 2003 08:06 -->
Hello,
we're looking for a good idea for the implementation of the security level. There are two tables in our DWH: member_activity and
customer_restricted. Both tables have cust_id as NUPI with a good distribution.
The problem is to create a quick view, that supply rows from the table member_activity, but doesn't show customer_id for customers from
the table customer_restricted.
The aggrigations, which access this view, must still work properly. Our first idea was to create such a view:
REPLACE VIEW member_activity_restricted AS
SELECT
a.activity,
CASE WHEN b.cust_id IS NOT NULL THEN 0 ELSE a.cust_id END cust_id
FROM
member_activity a
LEFT JOIN
customer_restricted b
ON a.cust_id = b.cust_id
;
But this view doesn't make PI-Access!
For example:
sel * from member_activity_restricted
where cust_id = 222333
;
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct db_scratch."pseudo table" for read on a RowHash to prevent global deadlock for db_scratch.a.
| |
| 2) | Next, we lock a distinct db_scratch."pseudo table" for read on a RowHash to prevent global deadlock for db_scratch.b.
| |
| 3) | We lock db_scratch.a for read, and we lock db_scratch.b for read.
| |
| 4) | We do an all-AMPs JOIN step from db_scratch.b by way of a RowHash match scan with no residual conditions, which is joined to
db_scratch.a. db_scratch.b and db_scratch.a are right outer joined using a merge join, with a join condition of ("db_scratch.a.CUST_ID =
db_scratch.b.CUST_ID"). The input table db_scratch.a will not be cached in memory. The result goes into Spool 2, which is built locally on
the AMPs. The result spool file will not be cached in memory. The size of Spool 2 is estimated with low confidence to be 163,805,210 rows.
The estimated time for this step is 2.90 seconds.
| |
| 5) | We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of an all-rows scan with a condition of ("(CASE WHEN NOT (CUST_ID IS
NULL )THEN (0) ELSE (CUST_ID) END )= 222333") into Spool 1, which is built locally on the AMPs. The result spool file will not be cached in
memory. The size of Spool 1 is estimated with low confidence to be 163,805,210 rows. The estimated time for this step is 6 minutes and 19
seconds.
| |
| 6) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 6 minutes and 21
seconds.
| |
The next try was this kind of view:
REPLACE VIEW member_activity_restricted
AS
SELECT
a.activity,
cust_id
FROM member_activity a
WHERE cust_id NOT IN
(SELECT cust_id FROM customer_restricted)
UNION ALL
SELECT
a.activity,
0 AS cust_id
FROM member_activity a
WHERE cust_id IN
(SELECT cust_id FROM customer_restricted)
;
It was already better. For the same example we had a PI-Access:
sel * from member_activity_restricted
where cust_id = 222333
;
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct db_scratch."pseudo table" for read on a RowHash to prevent global deadlock for db_scratch.a.
| |
| 2) | Next, we lock a distinct db_scratch."pseudo table" for read on a RowHash to prevent global deadlock for
db_scratch.customer_restricted_st.
| |
| 3) | We lock db_scratch.a for read, and we lock db_scratch.customer_restricted_st for read.
| |
| 4) | We do an all-AMPs RETRIEVE step from db_scratch.customer_restricted_st by way of an all-rows scan with no residual conditions into
Spool 3, which is built locally on the AMPs. Then we do a SORT to order Spool 3 by row hash and the sort key in spool field1 eliminating
duplicate rows. The size of Spool 3 is estimated to be 6,257 rows.
| |
| 5) | We do a SUM step to aggregate from Spool 3 by way of an all-rows scan. Aggregate Intermediate Results are computed globally, then
placed in Spool 4.
| |
| 6) | We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of an all-rows scan into Spool 2, which is duplicated on all AMPs.
| |
| 7) | We do a single-AMP JOIN step from db_scratch.a by way of the primary index "db_scratch.a.CUST_ID = 222333" with no residual
conditions, which is joined to Spool 3. db_scratch.a and Spool 3 are joined using an exclusion merge join, with a join condition of
("db_scratch.a.CUST_ID = Spool_3.CUST_ID"), and null value information in Spool 2. Skip this join step if null exists. The input table
db_scratch.a will not be cached in memory. The result goes into Spool 1, which is built locally on that AMP. The size of Spool 1 is
estimated with index join confidence to be 11 rows. The estimated time for this step is 0.21 seconds.
| |
| 8) | We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 6, which is built locally on the AMPs.
Then we do a SORT to order Spool 6 by row hash, and null value information in Spool 2. Skip this retrieve step if there is no null. The
size of Spool 6 is estimated to be 6,257 rows.
| |
| 9) | We do an all-AMPs JOIN step from db_scratch.a by way of the primary index "db_scratch.a.CUST_ID = 222333" with no residual
conditions, which is joined to Spool 6 (Last Use). db_scratch.a and Spool 6 are joined using an exclusion merge join, with a join condition
of ("db_scratch.a.CUST_ID = Spool_6.CUST_ID"), and null value information in Spool 2 (Last Use). Skip this join step if there is no null.
The input table db_scratch.a will not be cached in memory. The result goes into Spool 1, which is built locally on that AMP. The size of
Spool 1 is estimated with index join confidence to be 11 rows. The estimated time for this step is 0.21 seconds.
| |
| 10) | We do an all-AMPs JOIN step from db_scratch.a by way of an all-rows scan with a condition of ("0 = 222333"), which is joined to
db_scratch.customer_restricted_st. db_scratch.a and db_scratch.customer_restricted_st are joined using an inclusion merge join, with a join
condition of ("db_scratch.a.CUST_ID = db_scratch.customer_restricted_st.CUST_ID"). The input table db_scratch.a will not be cached in
memory. The result goes into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 17,372
rows. The estimated time for this step is 12 minutes and 6 seconds.
| |
| 11) | We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of an all-rows scan with a condition of ("Spool_1.CUST_ID = 222333")
into Spool 7, which is built locally on the AMPs. The size of Spool 7 is estimated with no confidence to be 17,372 rows. The estimated
time for this step is 0.18 seconds.
12) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 7 are sent back to the user as the result of statement 1.
| |
But!!! it doesn't make a PI-Access in the join with another table (I think, because of the UNION in the View ):
For example:
sel * from
(sel 222333 cust_id) c
inner join
db_scratch.test_view_st v
on v.cust_id = c.cust_id
;
Explanation -------------------------------------------------- | |
| 1) | First, we lock a distinct db_scratch."pseudo table" for read on a RowHash to prevent global deadlock for db_scratch.a.
| |
| 2) | Next, we lock a distinct db_scratch."pseudo table" for read on a RowHash to prevent global deadlock for
db_scratch.customer_restricted_st.
| |
| 3) | We lock db_scratch.a for read, and we lock db_scratch.customer_restricted_st for read.
| |
| 4) | We do an INSERT into Spool 3.
| |
| 5) | We execute the following steps in parallel.
| |
| |
| 1) | We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 2, which is redistributed by hash
code to all AMPs. The size of Spool 2 is estimated with high confidence to be 1 row. The estimated time for this step is 0.17 seconds.
| | |
| |
| 2) | We do an all-AMPs RETRIEVE step from db_scratch.customer_restricted_st by way of an all-rows scan with no residual conditions into
Spool 6, which is built locally on the AMPs. Then we do a SORT to order Spool 6 by row hash and the sort key in spool field1 eliminating
duplicate rows. The size of Spool 6 is estimated to be 6,257 rows.
| | |
| 6) | We do a SUM step to aggregate from Spool 6 by way of an all-rows scan. Aggregate Intermediate Results are computed globally, then
placed in Spool 7.
| |
| 7) | We do an all-AMPs RETRIEVE step from Spool 7 (Last Use) by way of an all-rows scan into Spool 5, which is duplicated on all AMPs.
| |
| 8) | We do an all-AMPs JOIN step from db_scratch.a by way of an all-rows scan with no residual conditions, which is joined to Spool 6.
db_scratch.a and Spool 6 are joined using an exclusion merge join, with a join condition of ("db_scratch.a.CUST_ID = Spool_6.CUST_ID"), and
null value information in Spool 5. Skip this join step if null exists. The input table db_scratch.a will not be cached in memory. The
result goes into Spool 1, which is built locally on the AMPs. The result spool file will not be cached in memory. The size of Spool 1 is
estimated with index join confidence to be 163,662,236 rows. The estimated time for this step is 14 minutes and 36 seconds.
| |
| 9) | We do an all-AMPs RETRIEVE step from Spool 6 (Last Use) by way of an all-rows scan into Spool 9, which is built locally on the AMPs.
Then we do a SORT to order Spool 9 by row hash, and null value information in Spool 5. Skip this retrieve step if there is no null. The
size of Spool 9 is estimated to be 6,257 rows.
10) We do an all-AMPs JOIN step from db_scratch.a by way of an all-rows scan with no residual conditions, which is joined to Spool 9
(Last Use). db_scratch.a and Spool 9 are joined using an exclusion merge join, with a join condition of ("db_scratch.a.CUST_ID =
Spool_9.CUST_ID"), and null value information in Spool 5 (Last Use). Skip this join step if there is no null. The input table db_scratch.a
will not be cached in memory. The result goes into Spool 1, which is built locally on the AMPs. The result spool file will not be cached
in memory. The size of Spool 1 is estimated with index join confidence to be 163,662,236 rows. The estimated time for this step is 14
minutes and 36 seconds.
| |
| 11) | We execute the following steps in parallel.
| |
| |
| 1) | We do an all-AMPs JOIN step from db_scratch.a by way of an all-rows scan with no residual conditions, which is joined to
db_scratch.customer_restricted_st. db_scratch.a and db_scratch.customer_restricted_st are joined using an inclusion merge join, with a join
condition of ("db_scratch.a.CUST_ID = db_scratch.customer_restricted_st.CUST_ID"). The input table db_scratch.a will not be cached in
memory. The result goes into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be
163,835,841 rows. The estimated time for this step is 12 minutes and 6 seconds.
| | |
| |
| 2) | We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of an all-rows scan into Spool 11, which is duplicated on all AMPs.
The size of Spool 11 is estimated with high confidence to be 64 rows. The estimated time for this step is 0.17 seconds.
| | |
| 12) | We do an all-AMPs JOIN step from Spool 11 (Last Use) by way of an all-rows scan, which is joined to Spool 1 (Last Use). Spool 11
and Spool 1 are joined using a product join, with a join condition of ("Spool_1.CUST_ID = Spool_11.CUST_ID"). The result goes into Spool
10, which is built locally on the AMPs. The size of Spool 10 is estimated with no confidence to be 102,399 to 173,605 rows. The estimated
time for this step is 1 minute and 49 seconds to 1 minute and 49 seconds.
| |
| 13) | Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
| |
| -> | The contents of Spool 10 are sent back to the user as the result of statement 1.
| |
Does anybody have another idea, how this view can be implemented?
Thanks in advance!
Slavik Taubkin
| |