Archives of the TeradataForum

Message Posted: Fri, 09 Dec 2005 @ 10:26:19 GMT



Subj:		Re: SKEWING Issue

From:		McCall, Glenn David

At a high level:

Fallback keeps a second copy of the data. Fallback is implemented in the database itself - unlike RAID-1 (mirroring), which is implemented in the disk subsystem. So if you had RAID-1 and fallback, you would have 4 copies of the data.

Acquisition phase is a term that the load utilities (e.g. MultiLoad) use. It means that the utility is acquiring data and staging it in the database. It then goes to an Apply phase, which means the utility is applying the staged data to the target tables.
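
As a toy illustration of the two phases only (this is not how the utility is actually implemented, and the function names, the work table and the target table are all made up for the example), staging and then applying might be sketched in Python like this:

```python
def acquire(source_records, work_table):
    """Acquisition phase: copy incoming records into a staging (work) table."""
    for record in source_records:
        work_table.append(record)
    return len(work_table)


def apply_staged(work_table, target_table):
    """Apply phase: move staged (key, value) records into the target table."""
    applied = 0
    while work_table:
        key, value = work_table.pop(0)
        target_table[key] = value  # update an existing key or add a new one
        applied += 1
    return applied


# Hypothetical data: one existing row in the target, two incoming records.
work, target = [], {1: "old"}
staged = acquire([(1, "new"), (2, "added")], work)
applied = apply_staged(work, target)
```

Nothing touches the target table until the whole input has been staged, which is the point of separating the two phases.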

Skewing is where one AMP (a thing that manages data) has more than its fair share of the data - and as a result has to do more than its fair share of the work. The most common cause of skewing is poor Primary Index selection, but it can also occur while a query is running.

Example of skewing (and null effect) - consider a paper-based system that stores names of people. There are 27 buckets: A-Z plus "name unknown". If there are 27 people looking after the buckets and you tell the Z person to go and retrieve names that start with "Zaphod", then he/she will respond quickly because there aren't many Z's to scan through. On the other hand, if you asked the A person to look for people with names starting with "Arthur", the response time will be somewhat slower because there will be many more A's to scan through.

If you ask the "system" to look for people with names starting with "Arthur" or "Zaphod", the response time will be determined by the time it takes the "A" guy to do his scan, even though the "Z" guy will be sitting around doing nothing because his scan was real quick.

This gives rise to another term, "Hot AMP", which basically means that one AMP (the "A" guy) is doing all the work while the rest of them are twiddling their thumbs.
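
To put rough numbers on the bucket analogy, here is a small Python sketch. The sample names, the helper names and the "skew ratio" measure (busiest bucket divided by the average bucket) are my own assumptions for illustration, not anything the database itself reports.

```python
from collections import Counter

# Hypothetical sample data; None stands for "name unknown".
names = ["Arthur", "Alice", "Adam", "Anna", "Andrew", "Amy",
         "Ford", "Trillian", "Zaphod", None]


def bucket_for(name):
    """Return the bucket label: first letter, or 'unknown' for a missing name."""
    return name[0].upper() if name else "unknown"


def bucket_loads(all_names):
    """Count how many names each bucket keeper has to look after."""
    return Counter(bucket_for(n) for n in all_names)


def skew_ratio(loads, n_buckets=27):
    """Busiest bucket's count divided by the average count over all buckets."""
    average = sum(loads.values()) / n_buckets
    return max(loads.values()) / average


loads = bucket_loads(names)

# The A and Z keepers scan in parallel, so the query finishes
# only when the slower of the two has finished.
scan_time = max(loads["A"], loads["Z"])
```

With this sample the "A" keeper holds 6 of the 10 names, so the combined "Arthur or Zaphod" search costs 6 units of scanning even though the "Z" keeper is done after 1.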

As for nulls, use the above example. If you have a lot of "name unknown" people (i.e. a null name) then you will have skewing towards the null bucket. If you have very few "name unknown" people, then you will not. So the answer is that it depends upon how many nulls you have.

Note that the skewing is determined by the uniqueness of all of the columns in the Primary Index - not just the one that has nulls.
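
A rough Python sketch of that last point follows. It assumes a made-up 4-AMP system and uses SHA-256 purely as a stand-in for the database's real row hash; the rows and column names are hypothetical too.

```python
import hashlib
from collections import Counter

N_AMPS = 4  # hypothetical number of AMPs


def amp_for(index_values):
    """Map a tuple of Primary Index values to an AMP number.

    SHA-256 is only a stand-in here; the real system has its own hash.
    """
    digest = hashlib.sha256(repr(index_values).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % N_AMPS


# Hypothetical rows of (name, customer_id) where every name is null.
rows = [(None, customer_id) for customer_id in range(1000)]

# Primary Index on name alone: every row hashes to the same value.
by_name = Counter(amp_for((name,)) for name, _ in rows)

# Primary Index on (name, customer_id): the second column makes rows distinct.
by_name_and_id = Counter(amp_for((name, cid)) for name, cid in rows)
```

With the index on name alone, all 1000 rows land on a single AMP. Adding the second column spreads them out even though the name is still null in every row.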

I hope this helps

Glenn Mc


