Archives of the TeradataForum
Message Posted: Wed, 17 Nov 2010 @ 21:20:08 GMT
Faible Mou wrote:
Have a look at the Database Design manual; the only proprietary stuff is the actual hashing function. See SECTION 3 Physical Database Design, Part 1: Indexing. The basic concept is quite simple, at least compared to the complexity of the optimizer stuff :-)
It's not actual hashing; it uses a hash algorithm to create a hash value, which serves as a kind of surrogate key. The big advantage: it's always 4 bytes regardless of the actual data size of the PI. Of course md5 hashes "better" and has fewer collisions; TD's hashing is like an advanced CRC32 (it's only 4 bytes). As collisions can't be avoided, TD adds a second 4-byte value (which is actually a sequence value per hash value). The combination "row hash" plus "uniqueness value" results in an 8-byte unique id for each row in a table. And this RowID is used for distribution across AMPs and sorting within AMPs.

Dieter
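To make the RowID idea concrete, here is a rough Python sketch. It is not Teradata's algorithm: the real hash function is proprietary, so `zlib.crc32` is swapped in as a stand-in 4-byte hash. The names `row_hash`, `row_id_bytes` and `Table` are hypothetical, picking the AMP with a plain modulo is a simplifying assumption made for illustration, and starting the uniqueness counter at 1 is an assumption as well.

```python
import bisect
import zlib
from collections import defaultdict


def row_hash(pi_value: str) -> int:
    """Stand-in 4-byte hash of the PI value (CRC32, NOT Teradata's function)."""
    return zlib.crc32(pi_value.encode("utf-8")) & 0xFFFFFFFF


def row_id_bytes(row_id: int) -> bytes:
    """Pack a RowID into 8 bytes: 4 bytes row hash, then 4 bytes uniqueness."""
    return row_id.to_bytes(8, "big")


class Table:
    """Toy table: rows are spread over AMPs and kept sorted by RowID."""

    def __init__(self, num_amps: int = 4):  # AMP count is an arbitrary example
        self.num_amps = num_amps
        # One sequence counter per row hash value (the "uniqueness value").
        self.next_uniq = defaultdict(int)
        # One sorted list of (row_id, pi_value, payload) per AMP.
        self.amps = [[] for _ in range(num_amps)]

    def insert(self, pi_value: str, payload: str):
        h = row_hash(pi_value)
        self.next_uniq[h] += 1          # assumed to start at 1
        uniq = self.next_uniq[h]
        row_id = (h << 32) | uniq       # 8-byte id: row hash + uniqueness
        amp = h % self.num_amps         # assumption: modulo picks the AMP
        bisect.insort(self.amps[amp], (row_id, pi_value, payload))
        return row_id, amp


if __name__ == "__main__":
    t = Table()
    for name in ["Smith", "Jones", "Smith", "Miller"]:
        rid, amp = t.insert(name, "some row data")
        print(f"{name:8s} AMP {amp}  RowID 0x{row_id_bytes(rid).hex()}")
```

Two rows with the same PI value get the same row hash and land on the same AMP, but their uniqueness values differ, so their RowIDs stay distinct. The same would hold for two different PI values that happen to collide on the 4-byte hash.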
Copyright 2016 - All Rights Reserved
Last Modified: 15 Jun 2023