Archives of the TeradataForum

Message Posted: Tue, 21 Nov 2006 @ 14:01:23 GMT



Subj:		Re: UTF-8, Fastload and linux

From:		Victor Sokovin

Just thought I'd close this thread off by reporting back on what we found and what our conclusions on the subject are.

Dave, thank you for taking the time and posting the summary.

Certainly NCR confirmed that we can't use Fixed width definitions for UTF-8 in Fastload, but at least I can also say that it's not possible in Oracle SqlLoader either (thank goodness for that).

If you ask me, it is already good news that we can use CSV UTF-8 data with FastLoad (on a Linux/Unix client OS?), and I believe this is something you confirm, right?

I have a few remarks, though. Forty thousand records sounds too few to support any comparison. As you confirmed, the difference between fixed-length and variable-length formats is small in your case, but it could be more noticeable for others. I think the suggestion to look at the fixed-length format is still valid in general.

I am not sure about your SQL*Loader remark. Oracle make the following recommendation for SQL*Loader:

"In general, loading shift-sensitive character data can be much slower than loading simple ASCII or EBCDIC data. The fastest way to load shift-sensitive character data is to use fixed-position fields without delimiters."

As they started this discussion many years ago, they are well ahead in providing solutions for this philosophical problem of loading fixed-position variable-byte data. Oracle make a clear distinction in their data type definitions between a fixed number of bytes and a fixed number of characters, and they support a wide range of character sets, including UTF-8. If I have time, I'll try to load fixed-position UTF-8 data into an Oracle table to see what happens.
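To illustrate why the bytes-versus-characters distinction matters for fixed-position UTF-8 data, here is a minimal Python sketch. It is not FastLoad or SQL*Loader code; the helper names pad_chars and pad_bytes and the sample values are made up for the illustration. It only shows that a field padded to a fixed number of characters does not occupy a fixed number of bytes, and the other way round.

```python
def pad_chars(value: str, width: int) -> str:
    """Pad a value with spaces to a fixed number of characters (hypothetical helper)."""
    return value.ljust(width)


def pad_bytes(value: str, width: int) -> bytes:
    """Pad a value with spaces to a fixed number of UTF-8 bytes (hypothetical helper)."""
    raw = value.encode("utf-8")
    if len(raw) > width:
        raise ValueError(f"{value!r} needs {len(raw)} bytes, field holds {width}")
    return raw + b" " * (width - len(raw))


# Sample values (assumed for illustration): plain ASCII, one 2-byte
# character (u-umlaut), and two 3-byte characters (Tokyo in kanji).
names = ["Zurich", "Z\u00fcrich", "\u6771\u4eac"]

# Fixed width in characters: the byte length varies per record.
char_padded = [pad_chars(n, 8) for n in names]
byte_lengths = [len(s.encode("utf-8")) for s in char_padded]

# Fixed width in bytes: the character count varies per record.
byte_padded = [pad_bytes(n, 12) for n in names]
char_counts = [len(b.decode("utf-8")) for b in byte_padded]

print(byte_lengths)  # [8, 9, 12]
print(char_counts)   # [12, 11, 8]
```

A loader that locates fields by byte offset can only read the second layout reliably, while a loader that counts characters has to decode each record before it can find the field boundaries.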

Regards,

Victor


