Archives of the TeradataForum

Message Posted: Wed, 04 Aug 2004 @ 07:22:26 GMT



Subj:		Re: Sampling of Pages

From:		Stephan Ewen

Hi,

I am not quite sure about the terminology. Maybe "pages" is the wrong term; I am quite new to Teradata and come from a DB2 background. By pages I meant the unit that is read from disk in one I/O operation, which usually maps to one sector on the disk.

Large tables naturally fill many pages. To get a statistically valid sample, the sampling would have to consider more or less all pages, since the data might be clustered on the pages. That is why obtaining samples that are statistically valid (i.e. robust to chi-square and spectral tests) does not perform very well. A great reduction in I/Os is possible if you sample whole pages instead, but the sample obtained is not always representative. Does this correspond to proportional sampling? What was described for the AMPs sounds to me like query parallelism, rather than something at the system or physical level.
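To make the trade-off concrete, here is a minimal Python sketch, not Teradata or DB2 code. It assumes a toy table stored as a list of in-memory "pages", with the data fully clustered so that every row on a page holds the same value. The names `build_pages`, `row_sample` and `page_sample` are made up for this illustration.

```python
import random

def build_pages(n_pages, rows_per_page):
    # Assumed worst case: data is clustered, every row on page p holds value p.
    return [[p] * rows_per_page for p in range(n_pages)]

def row_sample(pages, k, rng):
    # Row-level sampling: draw k rows uniformly from the whole table.
    rows_per_page = len(pages[0])
    total_rows = len(pages) * rows_per_page
    picks = rng.sample(range(total_rows), k)
    # Every distinct page holding a picked row costs one I/O.
    pages_read = len({i // rows_per_page for i in picks})
    values = [pages[i // rows_per_page][i % rows_per_page] for i in picks]
    return values, pages_read

def page_sample(pages, k, rng):
    # Page-level sampling: draw whole pages until about k rows are collected.
    rows_per_page = len(pages[0])
    n_pick = max(1, k // rows_per_page)
    chosen = rng.sample(range(len(pages)), n_pick)
    values = [v for p in chosen for v in pages[p]]
    return values, n_pick

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    pages = build_pages(n_pages=1000, rows_per_page=100)
    for name, sampler in (("row", row_sample), ("page", page_sample)):
        values, pages_read = sampler(pages, 1000, rng)
        print(name, "pages read:", pages_read,
              "distinct values:", len(set(values)))
```

Under these assumptions the page-level sample reads far fewer pages, but it only sees as many distinct values as pages it read, which is the loss of representativeness described above. On data that is not clustered, the two samples would look much more alike.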

Please correct me, if I am wrong, as I mentioned, I am new ;)

Thanks,

Stephan