Archives of the TeradataForum
Message Posted: Mon, 12 Feb 2007 @ 09:50:47 GMT
Subj: Re: Accent Insensitive Matches
From: Mario Galante
> Your situation is not unusual but the phrase "I am an end-user and cannot make decisions regarding new installations." strikes me as too
> pessimistic. The database is for end-users so they should decide what they want to do with it, no?
Victor, I totally agree with you. I didn't express myself correctly. I didn't mean to say that my support group would not be receptive to my
suggestions. It's just that the kind of solution proposed by Etienne would not be feasible in my situation without spending time presenting and
(probably) defending it, quite possibly to several people in IT. Since I'm a contractor, I don't have that kind of time on my hands: I have to work with the tools available, focus on the most important problems, and obtain results quickly. Dealing with accented characters is only one of several challenges I have to overcome.
That said, I will definitely escalate the issue, though without holding my breath. To meet my deadline, however, I might need to fall back on an alternate solution such as spooling out all the records and needed fields, then finishing the processing with an available client-side tool such as SAS.
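If it comes to that, the extract itself should be the easy part. Something like the following BTEQ sketch is what I have in mind (the system name, credentials, database, table and column names are all placeholders, and the field list would obviously be tailored to what I actually need):

    .LOGON mysystem/myuser,mypassword
    .EXPORT REPORT FILE = name_extract.txt

    SELECT cust_id
         , cust_name
    FROM   mydb.src_table            /* placeholder database/table   */
    WHERE  cust_name IS NOT NULL;    /* only the fields/rows I need  */

    .EXPORT RESET
    .LOGOFF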
> Are there any other solutions not involving admin changes
> to the Teradata server?
> Yes. But before I describe them I'd like to mention that you can customize collation rules and define your own collation. Once you choose
> it for your session you can effortlessly compare strings according to your rules. I'd say it is the most natural solution in this situation but it
> does require some admin effort. Perhaps more effort than with UDF as customized collations are exotic.
This looks like an interesting alternative, but I'm not quite sure that I understand what you mean. The multinational collation already groups
accented characters together, but I guess that internally these characters still have a distinct rank (they are just ordered differently than in,
say, the ASCII collation).
Are you suggesting that I create my own collation and give an accented character the same rank as its unaccented counterpart? Is that possible? Even if it is, would I then be able to change the collation depending on the query being run? For example, my own collation would be useful when I compare strings, but the multinational collation would be better when I want to order them.
In other words, when I compare strings, it's okay for "e acute" to be equal to "e", but when I order them, I definitely want "e acute" to
always come before or after "e".
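To make my question concrete, the kind of thing I picture is below. I'm using ASCII purely as a stand-in for the custom collation (I don't know which keyword a site-defined one would be selected with), the table and column names are made up, and the equality match in the second query is the behaviour I'm hoping for, not something I've verified:

    /* Ordering: keep the multinational ranking, where é has its own rank */
    SET SESSION COLLATION MULTINATIONAL;

    SELECT cust_name
    FROM   mydb.src_table
    ORDER  BY cust_name;

    /* Comparing: switch collations for the rest of the session.
       ASCII is only a placeholder for whatever the custom collation would be. */
    SET SESSION COLLATION ASCII;

    SELECT cust_name
    FROM   mydb.src_table
    WHERE  cust_name = 'Helene';   /* the hope: this would also return 'Hélène'
                                      if é and e shared the same rank */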
> Non-admin method would be to use the brute force and convert your data character-by-character and store it somewhere for further
> analysis. SUBSTRING and CASE should do the business if you give them enough time. Not elegant but doable.
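Just to picture what that would mean, here is the kind of statement I imagine for a single column, using a recursive query to walk each string one character at a time (table and column names are made up, only a handful of accent mappings are shown, and I'm assuming the values fit in VARCHAR(200) and that my session character set passes the accented literals through intact):

    WITH RECURSIVE strip_accents (cust_id, rest, clean_name) AS (
        SELECT cust_id
             , CAST(TRIM(cust_name) AS VARCHAR(200))    /* what is left to scan   */
             , CAST(''              AS VARCHAR(200))    /* what has been rebuilt  */
        FROM   mydb.src_table
        UNION ALL
        SELECT cust_id
             , SUBSTRING(rest FROM 2)                   /* drop the handled character  */
             , clean_name ||
               CASE SUBSTRING(rest FROM 1 FOR 1)        /* map one character at a time */
                    WHEN 'é' THEN 'e'
                    WHEN 'è' THEN 'e'
                    WHEN 'ê' THEN 'e'
                    WHEN 'à' THEN 'a'
                    WHEN 'ç' THEN 'c'
                    /* ...one WHEN per accented character... */
                    ELSE SUBSTRING(rest FROM 1 FOR 1)
               END
        FROM   strip_accents
        WHERE  CHARACTER_LENGTH(rest) > 0               /* stop once the string is consumed */
    )
    SELECT cust_id
         , clean_name
    FROM   strip_accents
    WHERE  CHARACTER_LENGTH(rest) = 0;                  /* keep only the fully rebuilt rows */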
This is kind of scary! :) It would involve writing an extremely complex set of queries that could take forever to run. I guess that spooling out the data and finishing with a client-side process is, at this point, the only practical solution in my situation. However, I will keep you posted. Maybe my support group will respond in a timely manner and allow me to use my own UDFs or collations, so that a more efficient server-side solution becomes possible.
Thanks for your input!