String manipulation in T-SQL has always felt somewhat cumbersome. That might be saying something, coming from an old C programmer, but having used many modern languages, I find T-SQL makes string manipulation more difficult than it should be. However, that is slowly changing as more functions are added to the language over time.
In this article, I want to examine the new TRIM() function that was added in SQL Server 2017 and see how it can be used to make your code a little cleaner and easier to write.
TRIM() Syntax
One of the new functions in SQL Server 2017 is the TRIM() function. The documentation notes this is a string function that takes two parameters, but not in the way most other functions do. If you look at the syntax, this shows:
TRIM ( [ characters FROM ] string )
In other words, I'd write something like
SELECT TRIM('m' from 'medium')
This will actually return "ediu" as the output.
In its simplest form, I use TRIM() around a string just as I'd use RTRIM() or LTRIM(). However, if I need to remove something other than spaces, I can put the characters to remove, followed by FROM, in front of the string inside the parentheses. I don't know why they made this decision instead of structuring things like this:
SELECT TRIM('string', 'expression')
That would fit with most other functions, but consistency has never been one of Microsoft's strengths.
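As a quick sketch of the two forms side by side, here is a query using literal strings of my own choosing (they are illustrations, not taken from the documentation). The brackets are only there to make any leftover spaces visible.

```sql
-- Requires SQL Server 2017 or later.
SELECT
    DefaultForm = '[' + TRIM('  medium  ') + ']',             -- no FROM clause: spaces are removed from both ends
    OneChar     = '[' + TRIM('m' FROM 'medium') + ']',         -- the letter m is removed from both ends
    CharSet     = '[' + TRIM('xy' FROM 'xyxmediumyyx') + ']';  -- any mix of x and y is removed from both ends
```

The first and third columns come back as [medium] and the second as [ediu]. The third column shows that the characters before FROM are treated as a set, not as a fixed substring.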
Default TRIM
The default character to remove is the space character. This is ASCII CHAR(32), and it often appears at the start or end of a string. Let's see how this works in practice. First, I can declare a few strings and then run TRIM() against them. I'll wrap the results in brackets so that we can determine where spaces exist. These are my examples:
SELECT Result = '[' + TRIM(n) + ']'
     , FirstLetter = ASCII(SUBSTRING(TRIM(n), 1, 1))
FROM (VALUES
        (' ')
      , (' ' )
      , ('Steve ')
      , ('Steve Jones')
      , (' Steve Jones')
      , (' Steve Jones ')
      , ('Steve Jones a k a way0utwest ')
     ) AS a (n);
GO
The results are:
As you can see, leading and trailing spaces are removed, but spaces in the middle of the string remain. This is the same behavior as RTRIM(LTRIM()), which is what we'd write prior to SQL 2017. We can see that here:
If this were all that TRIM did, it would be a small improvement, and a welcome one. TRIM() is much cleaner than RTRIM(LTRIM()) and easier to read.
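For comparison, here is a minimal sketch that runs both approaches over the same rows. The sample strings are my own, and the Same column compares the bracketed values because a plain = comparison in SQL Server ignores trailing spaces.

```sql
-- Requires SQL Server 2017 or later for TRIM(); RTRIM(LTRIM()) works on older versions.
SELECT
    NewWay = '[' + TRIM(n) + ']',
    OldWay = '[' + RTRIM(LTRIM(n)) + ']',
    -- Compare the bracketed strings so that trailing spaces would count as a difference
    Same   = CASE WHEN '[' + TRIM(n) + ']' = '[' + RTRIM(LTRIM(n)) + ']'
                  THEN 'yes' ELSE 'no' END
FROM (VALUES
        ('Steve   ')
      , ('   Steve Jones   ')
      , ('   ')
     ) AS a (n);
```

Every row should report yes, since both expressions strip only CHAR(32) from the ends and leave interior spaces alone.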
Choosing What is Trimmed
TRIM() does more than just remove spaces. The syntax allowing for a set of characters to remove is an interesting extension to the function. Essentially, I can tell the system to remove a specific character, or set of characters. Let's see how this works in practice with a few examples.
First, let's look at simple trimming. One of the things people may do is get a list of items and separate them by a comma. This is a string of values that's easy to produce in T-SQL, though often this results in a comma at the end. If I had a set of these strings, perhaps some would have a trailing comma, and some wouldn't. Writing code to check for the comma and then remove it is cumbersome. TRIM() makes this easy. Here's some code:
SELECT Result = '[' + TRIM(',' FROM n) + ']'
     -- First character after the commas have been trimmed
     , FirstLetter = ASCII(SUBSTRING(TRIM(',' FROM n), 1, 1))
     -- SUBSTRING() is 1-based, so LEN() is the last position; LEN() does not count trailing spaces
     , LastLetter = ASCII(SUBSTRING(TRIM(',' FROM n), LEN(TRIM(',' FROM n)), 1))
FROM (VALUES
        (',Broncos, Raiders, Chargers, Chiefs ')
      , (',Texans, Jaguars, Colts, Titans, ' )
      , ('Steelers,Ravens,Browns,Bengals,')
      , (',Patriots,Dolphins,Bills,Jets')
     ) AS a (n);
GO
If we look at the results, we see that all leading commas were removed. We can also see that the ending comma has been trimmed from row 3, where the last character was a comma. None of the intermediate commas was taken out. This is as we expect. It's not a substitution, but a removal from the ends. In row 2, where a carriage return exists after the final comma, that comma isn't removed because it is no longer the last character in the string.
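One way to deal with the carriage return case is to list every unwanted character in the set to trim. The sketch below is an assumption about how I'd approach it, not part of the original test: @chars is a hypothetical variable name, and the sample rows are rebuilt with an explicit CHAR(13) + CHAR(10) so the line break is visible in the code.

```sql
-- Requires SQL Server 2017 or later.
-- @chars (hypothetical name) holds every character to strip:
-- comma, space, carriage return, and line feed.
DECLARE @chars varchar(10) = ', ' + CHAR(13) + CHAR(10);

SELECT Result = '[' + TRIM(@chars FROM n) + ']'
FROM (VALUES
        (',Texans, Jaguars, Colts, Titans,' + CHAR(13) + CHAR(10))
      , ('Steelers,Ravens,Browns,Bengals,')
     ) AS a (n);
```

With the line break included in the set, both rows lose their trailing comma: [Texans, Jaguars, Colts, Titans] and [Steelers,Ravens,Browns,Bengals].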
Now let's try something different. Let's start with these strings in my TVC:
VALUES
    (',Broncos, Raiders, Chargers, Chiefs ,')
  , (',Texans, Jaguars, Colts, Titans, ' )
  , ('Steelers,Ravens,Browns,Bengals,')
  , (',Patriots,Dolphins,Bills,Jets ')
As you can see, there are commas and spaces at the end. Let's try to remove everything with this query:
SELECT Result = '[' + TRIM(', ' FROM n) + ']'
     , FirstLetter = ASCII(SUBSTRING(TRIM(', ' FROM n), 1, 1))
     -- SUBSTRING() is 1-based, so LEN() is the position of the last character
     , LastLetter = ASCII(SUBSTRING(TRIM(', ' FROM n), LEN(TRIM(', ' FROM n)), 1))
FROM (VALUES
        (',Broncos, Raiders, Chargers, Chiefs ,')
      , (',Texans, Jaguars, Colts, Titans, ' )
      , ('Steelers,Ravens,Browns,Bengals,')
      , (',Patriots,Dolphins,Bills,Jets ')
     ) AS a (n);
GO
This gives me some interesting results. Let's look at these:
In row 1, I had a space and a comma at the end. Both are removed. In row 2, we had a comma then a space. Those were removed. The trailing comma in row 3 and the trailing space in row 4 were also removed. I now have an easy way to remove multiple characters from a string. In fact, I could do more. I'll add a few more characters to my TRIM expression.
SELECT Result = '[' + TRIM(', Patriots' FROM n) + ']'
     -- Use the same character set in every column so the values describe the Result column
     , FirstLetter = ASCII(SUBSTRING(TRIM(', Patriots' FROM n), 1, 1))
     -- SUBSTRING() is 1-based, so LEN() is the position of the last character
     , LastLetter = ASCII(SUBSTRING(TRIM(', Patriots' FROM n), LEN(TRIM(', Patriots' FROM n)), 1))
FROM (VALUES
        (',Broncos, Raiders, Chargers, Chiefs ,')
      , (',Texans, Jaguars, Colts, Titans, ' )
      , ('Steelers,Ravens,Browns,Bengals,')
      , (',Patriots,Dolphins,Bills,Jets ')
     ) AS a (n);
GO
Now I get these results:
I've completely removed the Patriots from row 4, which is what I wanted. My trailing commas and spaces were removed, but TRIM isn't really removing words. Instead, it's removing any of the characters in p-a-t-r-i-o-t-s, along with the comma and space, from each end until it reaches a character that isn't in the set. In this case, the T from Texans and the St from Steelers were removed. The ts from Jets were also removed as trailing characters, as was the final s from Chiefs, Titans, and Bengals.
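If the goal really is to drop a whole word from the front of a string, TRIM() is the wrong tool. As a rough sketch of one alternative (my own suggestion, using LIKE and STUFF() with a hard-coded prefix), compare the two approaches on a pair of sample rows of my own:

```sql
-- Requires SQL Server 2017 or later for the TRIM() column.
SELECT
    CharSetTrim = '[' + TRIM(', Patriots' FROM n) + ']',
    -- Remove the literal prefix only when the string actually starts with it
    PrefixOnly  = '[' + CASE
                            WHEN n LIKE ',Patriots,%'
                                THEN STUFF(n, 1, LEN(',Patriots,'), '')
                            ELSE n
                        END + ']'
FROM (VALUES
        (',Texans, Jaguars, Colts, Titans')
      , (',Patriots,Dolphins,Bills,Jets')
     ) AS a (n);
```

In a case-insensitive database, CharSetTrim returns [exans, Jaguars, Colts, Titan] and [Dolphins,Bills,Je], while PrefixOnly leaves the first row untouched and returns [Dolphins,Bills,Jets] for the second.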
Collation Matters
If I switch to a case sensitive database, in this case, an AdventureWorks version, what happens? First, let's see the database options:
This is a restored database from SQL 2016, and the compatibility is set there. However, this is a case sensitive database. If I now run the same query that I last ran, I get:
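If you want to check the same settings on your own system, a query along these lines against sys.databases shows the collation and compatibility level for the current database:

```sql
-- Collation and compatibility level of the database this session is connected to
SELECT name
     , collation_name        -- a name containing _CS_ indicates a case-sensitive collation
     , compatibility_level   -- 130 is SQL Server 2016, 140 is SQL Server 2017
FROM sys.databases
WHERE name = DB_NAME();
```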
Only the Patriots are removed.
Why? The 't' in the TRIM expression is lower case and the T in Texans is upper. Therefore, this isn't removed. The same thing goes for the S in Steelers. It's a different case than the s in the expression. Since the S in Steelers isn't removed, trimming stops there, so the t after it is never checked and we keep the first letters. The end trimming is the same.
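You don't need two databases to see the difference. The sketch below is my own setup, not the AdventureWorks test above: it uses a table variable (the names @teams, ci_name, and cs_name are hypothetical) with one case-insensitive and one case-sensitive column, on the assumption that TRIM() follows the collation of the column being trimmed.

```sql
-- Requires SQL Server 2017 or later.
DECLARE @teams TABLE
(
    ci_name varchar(50) COLLATE Latin1_General_CI_AS,  -- case-insensitive column
    cs_name varchar(50) COLLATE Latin1_General_CS_AS   -- case-sensitive column
);

INSERT INTO @teams (ci_name, cs_name)
VALUES ('Texans', 'Texans')
     , ('Steelers', 'Steelers');

-- Same character set, same data; only the column collation differs
SELECT CaseInsensitive = '[' + TRIM('ts' FROM ci_name) + ']'
     , CaseSensitive   = '[' + TRIM('ts' FROM cs_name) + ']'
FROM @teams;
```

I'd expect the case-sensitive column to keep its leading capital letters while still losing the trailing lower case s, in line with the results described above.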
Conclusion
TRIM() is a useful function that can help you remove leading and trailing sets of characters. This produces cleaner and easier to read code, though only on SQL Server 2017 and later. If you are on one of those versions, you might experiment with TRIM() and add it to your toolbox.
One other note: this seems to work in the 2017 engine, regardless of compat level. My case-sensitive database was at the 2016 level, but TRIM still worked, despite not being valid in SQL Server 2016 itself. Be aware of that if you have your own TRIM function from previous versions.
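If you want to check your own environment, a sketch like the following reports the engine version and looks for any user-defined function named TRIM in the current database. The object type filter is my own choice and only covers scalar and table-valued functions.

```sql
-- Engine version: a major version of 14 corresponds to SQL Server 2017
SELECT ProductVersion = SERVERPROPERTY('ProductVersion')
     , MajorVersion   = SERVERPROPERTY('ProductMajorVersion');

-- Any user-defined functions named TRIM in the current database
SELECT SchemaName = SCHEMA_NAME(schema_id)
     , name
     , type_desc
FROM sys.objects
WHERE name = 'TRIM'
  AND type IN ('FN', 'FS', 'IF', 'TF');  -- T-SQL scalar, CLR scalar, inline table-valued, multi-statement table-valued
```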