4 Functions to Remove Blanks
January 29, 2008
1. Remove leading and trailing blanks using the STRIP function
Understand the old SAS programs and the TRIM and LEFT functions: up to SAS 8.2, removing the leading blanks from a string meant first moving them to the end of the string with the LEFT function and then removing those trailing blanks with the TRIM function. This has several disadvantages:
- Two functions had to be used (TRIM and LEFT),
- The order in which the functions appear had to be respected. Otherwise the trailing blanks would remain.
- When this notation was used with an empty string, the result was a string of length 1 instead of 0. The TRIMN function, which solves this issue, only appeared with SAS 9.
Making the most of the new syntax, the STRIP function: since SAS 9, the STRIP function resolves the three issues. This function is equivalent to TRIMN(LEFT()).
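To make the difference visible, here is a minimal sketch that puts brackets around each result. The variable names and sample values are invented for the illustration.

```sas
data _null_;
  x     = '   ABC   ';   /* hypothetical sample value */
  blank = ' ';           /* an empty (blank) string   */

  /* Old notation: LEFT first, then TRIM */
  old_way   = '[' || trim(left(x))     || ']';
  old_blank = '[' || trim(left(blank)) || ']';   /* one blank is left between the brackets */

  /* SAS 9 notation: a single function */
  new_way   = '[' || strip(x)     || ']';
  new_blank = '[' || strip(blank) || ']';        /* nothing is left between the brackets */

  put old_way= new_way= old_blank= new_blank=;
run;
```

With a non-empty value both notations should give the same bracketed result. Only the blank string shows the one-character difference described above.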
2. Keep a single blank between words using the COMPBL function: usually a single blank between the words of your strings is needed. To remove any unnecessary blanks, you can use the COMPBL function. If there are many trailing blanks, only one will remain.
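A short sketch of COMPBL on a made-up value with several consecutive blanks between words:

```sas
data _null_;
  x = 'ABC    DEF   GHI';   /* hypothetical sample value */
  y = compbl(x);            /* each run of blanks is reduced to one blank */
  put y=;                   /* y=ABC DEF GHI */
run;
```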
3. Remove all the blanks using the COMPRESS function: by default the COMPRESS function removes all the blanks from a string. To remove any other type of character, you can add a second parameter. But this is beyond today’s topic.
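As a quick illustration of the default behaviour, again with a made-up value:

```sas
data _null_;
  x = 'ABC  DEF GHI';   /* hypothetical sample value */
  y = compress(x);      /* no second parameter: only blanks are removed */
  put y=;               /* y=ABCDEFGHI */
run;
```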
4. Remove the first x blanks using the RXPARSE function and the CALL RXCHANGE statement: the following notation can replace any text with another one. Here I shall restrict the discussion to blanks. First you have to define the value before and after in the RXPARSE function. Then you define the three or even four parameters of the CALL RXCHANGE statement, i.e.:
- refer to the pattern defined with RXPARSE,
- specify the number of times to carry out the update,
- specify the variable to update,
- by default the original variable is updated. To create a new variable, a fourth parameter can be added to CALL RXCHANGE. The length of the new variable will be 200 unless it has been specifically defined beforehand.
Tip: Given that there cannot be more changes than characters in the string, the length of the string can be used to define the number of repetitions. This length, which includes trailing blanks and resolves to 0 when no character is available, can be found using the LENGTHC function.
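To see why LENGTHC rather than LENGTH is the safe upper bound, here is a small sketch comparing the two on a made-up variable of length 10:

```sas
data _null_;
  length x $10;
  x = 'AB';             /* hypothetical sample value, padded with 8 trailing blanks */
  len  = length(x);     /* position of the last non-blank character: 2 */
  lenc = lengthc(x);    /* trailing blanks included: 10 */
  put len= lenc=;
run;
```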
Example: here the variable y, of length 30, receives the value of the x variable with its six blanks removed.
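data _null_;
  length y $30;
  x = ' ZZZ ABCD AB ';
  rx = rxparse("' ' to ");               /* pattern: change a blank to nothing */
  call rxchange(rx, lengthc(x), x, y);   /* at most lengthc(x) changes, result stored in y */
  put y=;
run;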