Archives of the TeradataForum

Message Posted: Mon, 19 Jan 2004 @ 15:20:52 GMT



Subj:		Re: V2R5 Statistics

From:		Victor Sokovin

Craig,

Mathematically these are two quite different things. Let me explain the difference using the simplest example.

Table T has two columns, A and B, each of which can only take the values 0 and 1. T has 100 rows, and 0 and 1 are distributed evenly in A and B, i.e., there are exactly 50 0s and 50 1s in both A and B.

Now, take all possible values of the pair (A,B):

(0,0)
(0,1)
(1,0)
(1,1)

I can have quite different tables of this kind:

T1 has only (0,0) and (1,1), which occur 50 times each;

T2 has all 4 combinations (0,0), (0,1), (1,0), (1,1), 25 times each.

Obviously, T1 and T2 are quite different tables, but their one-dimensional projections are the same (50 0s and 50 1s in 100 rows). When you collect multi-column statistics you should get more accurate (multidimensional) information (at least let's hope so!). Single-column statistics describe the projections only, so they can miss something important: how the values of A and B occur together.
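
To make the example concrete, here is a small Python sketch (not Teradata code) that builds T1 and T2 as lists of (A,B) pairs and compares what per-column counts and pair counts can see. The function names are made up for illustration. The last function assumes an estimator that simply multiplies the single-column selectivities; that is a textbook simplification used here for the sake of argument, not a claim about what the Teradata optimizer actually does.

```python
from collections import Counter

# T1: only (0,0) and (1,1), 50 rows each.
t1 = [(0, 0)] * 50 + [(1, 1)] * 50
# T2: all four combinations, 25 rows each.
t2 = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25


def single_column_counts(rows):
    """Value counts of A and of B separately (the one-dimensional projections)."""
    return Counter(a for a, _ in rows), Counter(b for _, b in rows)


def pair_counts(rows):
    """Value counts of the pair (A,B) (the two-dimensional picture)."""
    return Counter(rows)


def independence_estimate(rows, a, b):
    """Estimated rows with A=a AND B=b, ASSUMING A and B are independent.

    Hypothetical estimator for illustration only.
    """
    n = len(rows)
    col_a, col_b = single_column_counts(rows)
    return n * (col_a[a] / n) * (col_b[b] / n)


# The projections cannot tell T1 and T2 apart, the pair counts can.
print(single_column_counts(t1) == single_column_counts(t2))  # True
print(pair_counts(t1) == pair_counts(t2))                    # False

# Same estimate for both tables, but very different actual row counts.
estimate = independence_estimate(t1, 0, 1)
print(estimate, pair_counts(t1)[(0, 1)], pair_counts(t2)[(0, 1)])  # 25.0 0 25
```

For the condition A=0 AND B=1, the per-column counts lead to the same guess of 25 rows for both tables. That happens to be right for T2 and is off by 25 rows for T1, where no such row exists.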

Regards,

Victor