Manipulate string spaces [From Beverly White: Fri, 12 Mar 2004 @ 22:34 GMT]

Message Posted: Fri, 12 Mar 2004 @ 22:34:38 GMT

Subj:

Manipulate string spaces

From:

Beverly White

I have a large table containing address data - some of the addresses have been incorrectly entered with multiple spaces between words.

I am working through the process of cleaning the data and I am stumped at how to locate the second space in a string when 2 (or more) spaces are side by side.

In the following code fragment I have concatenated the two street address lines into one column and set indexes based on the first space found after each character string.

CREATE TABLE mytab.address as (Select DISTINCT (LN_1_STR_ADD || ' ' || LN_2_STR_ADD) AS STR_ADDRESS, index (Str_Address,' ') as Index_1, (character_length (str_address) - index_1) as rem_add_1, index (trim(substr(Str_Address, index_1+1)), ' ') + Index_1 As Index_2, (character_length (str_address) - index_2) as rem_add_2, index (trim(substr(Str_Address, index_2+1)), ' ') + Index_2 As Index_3, (character_length (str_address) - index_3) as rem_add_3, index (trim(substr(Str_Address, index_3+1)), ' ') + Index_3 As Index_4, (character_length (str_address) - index_4) as rem_add_4, index (trim(substr(Str_Address, index_4+1)), ' ') + Index_4 As Index_5, (character_length (str_address) - index_5) as rem_add_5, index (trim(substr(Str_Address, index_5+1)), ' ') + Index_5 As Index_6, (character_length (str_address) - index_6) as rem_add_6, index (trim(substr(Str_Address, index_6+1)), ' ') + Index_6 As Index_7, (character_length (str_address) - index_7) as rem_add_7, index (trim(substr(Str_Address, index_7+1)), ' ') + Index_7 As Index_8, (character_length (str_address) - index_8) as rem_add_8 from mastertab.address_line) with data;

This code works great except when more than one space separates the words in the string. When that happens the value of index 1 repeats through the rest of the indexes. I can't figure out why the next index doesn't start counting at index_1+1 as it should. In the rows where there is only one space between words the index values and the remaining character length values I am calculating are correct.

Can someone explain this to me (and perhaps offer a solution)?

Thanks

Attachments

Library

Quick Reference

Archives of the TeradataForum

Message Posted: Fri, 12 Mar 2004 @ 22:34:38 GMT