Home Page for the TeradataForum

Archives of the TeradataForum

Message Posted: Fri, 12 Mar 2004 @ 22:34:38 GMT

  <Prev Next>  
Next> Last>>  

Subj:   Manipulate string spaces
From:   Beverly White

I have a large table containing address data - some of the addresses have been incorrectly entered with multiple spaces between words.

I am working through the process of cleaning the data and I am stumped at how to locate the second space in a string when 2 (or more) spaces are side by side.

In the following code fragment I have concatenated the two street address lines into one column and set indexes based on the first space found after each character string.

CREATE TABLE mytab.address as

      index (Str_Address,' ') as Index_1,
      (character_length (str_address) - index_1) as rem_add_1,
      index (trim(substr(Str_Address, index_1+1)), ' ') + Index_1 As Index_2,
      (character_length (str_address) - index_2) as rem_add_2,
      index (trim(substr(Str_Address, index_2+1)), ' ') + Index_2 As Index_3,
      (character_length (str_address) - index_3) as rem_add_3,
      index (trim(substr(Str_Address, index_3+1)), ' ') + Index_3 As Index_4,
      (character_length (str_address) - index_4) as rem_add_4,
      index (trim(substr(Str_Address, index_4+1)), ' ') + Index_4 As Index_5,
      (character_length (str_address) - index_5) as rem_add_5,
      index (trim(substr(Str_Address, index_5+1)), ' ') + Index_5 As Index_6,
      (character_length (str_address) - index_6) as rem_add_6,
      index (trim(substr(Str_Address, index_6+1)), ' ') + Index_6 As Index_7,
      (character_length (str_address) - index_7) as rem_add_7,
      index (trim(substr(Str_Address, index_7+1)), ' ') + Index_7 As Index_8,
      (character_length (str_address) - index_8) as rem_add_8

from mastertab.address_line) with data;

This code works great except when more than one space separates the words in the string. When that happens the value of index 1 repeats through the rest of the indexes. I can't figure out why the next index doesn't start counting at index_1+1 as it should. In the rows where there is only one space between words the index values and the remaining character length values I am calculating are correct.

Can someone explain this to me (and perhaps offer a solution)?


  <Prev Next>  
Next> Last>>  
  Top Home Privacy Feedback  
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023