Archives of the TeradataForum

Message Posted: Fri, 26 May 2000 @ 19:12:16 GMT



Subj:		Re: Surrogate keys

From:		Dieter Nöth

The hash algorithm is unchanged (there's a new version for Unicode now) but the creation of the rowid changed.

Before R3 the uniqueness value assigned to a new row was the lowest unused value for that row hash:

Assume five synonym/duplicate records with uniqueness values from 1 to 5. If record 2 is deleted and a new record is inserted, it will get value 2 again. So if you insert record 231, the system has to check records 1 to 230 (probably across multiple datablocks) to find an unused value.
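To make the cost of that search concrete, here is a minimal Python sketch of the pre-R3 behaviour as described above. It is not Teradata code: the function name lowest_unused and the idea of modelling one row hash as a plain set of used uniqueness values are assumptions made only for illustration.

```python
def lowest_unused(used):
    """Return (value, checks) for one row hash.

    value  -- the lowest uniqueness value not present in `used`
    checks -- how many occupied values had to be looked at first
    """
    candidate = 1
    checks = 0
    while candidate in used:
        checks += 1
        candidate += 1
    return candidate, checks


# Five synonym/duplicate records with uniqueness values 1 to 5.
used = {1, 2, 3, 4, 5}
used.discard(2)                 # record 2 is deleted
value, checks = lowest_unused(used)
print(value, checks)            # 2 1  (the freed value is reused)

# Inserting record 231 when values 1 to 230 are all taken.
print(lowest_unused(set(range(1, 231))))   # (231, 230)
```

The second call shows why long synonym chains were expensive: every existing value is examined before a free one is found.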

In R3 they changed that algorithm to a new one with much less overhead: the value assigned to a new record is max(value) + 1, so the value is simply incremented. The uniqueness value is a 4-byte value, so there might be an overflow if you insert/delete 2^31 or 2^32 times (probably it just switches back to the old algorithm).
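A minimal Python sketch of the R3 rule, again only a model and not Teradata code. The name next_uniqueness is hypothetical, and treating the 4-byte value as unsigned (limit 2^32 - 1) is an assumption. What the system really does at the limit is not known here, so the sketch raises an error instead of guessing.

```python
# Assumption: the 4-byte uniqueness value is treated as unsigned.
MAX_UNIQUENESS = 2**32 - 1


def next_uniqueness(current_max):
    """Return max(value) + 1 for one row hash, without scanning for gaps."""
    if current_max >= MAX_UNIQUENESS:
        # Real behaviour at this point is unknown; fail loudly in the sketch.
        raise OverflowError("4-byte uniqueness value exhausted")
    return current_max + 1


# Same five records as before, with record 2 deleted.
values = [1, 3, 4, 5]
print(next_uniqueness(max(values)))   # 6  (the gap at 2 is not reused)
```

Only the current maximum is needed, so the cost no longer grows with the number of synonym/duplicate records, but deleted values are never handed out again.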

So now you can store more records per value, but the duplicate row check on SET tables still happens during MLoads. Using MULTISET tables skips that check as well, so you end up with faster inserts, but updates using the PK may be slower because there are more datablocks to search for the right record.

BTW, if you want to see the new algorithm at work, create a small test table and use the ROWID function to display the row hash and uniqueness value (low/high bytes are switched, just compare it to HashRow(PI)). But never try to update the rowid: you're able to do it, but you'll crash the system (I just tried that on a demo version ;-))))

Dieter Nöth

