Coalesce and empty strings to replace NULL - need help

Question

Coalesce and empty strings to replace NULL - need help

movvap

Old Hand

Points: 300
More actions
February 1, 2016 at 12:08 pm

#304249

ello,
I have three different sample T-sqls that almost produce the same result except the sql that uses the 'coalesce'. Please see the below T-sqls:
SELECT custid ,
country ,
region ,
city ,
CASE WHEN region IS NULL THEN country + ' ' + city
WHEN region IS NOT NULL THEN country + ' ' + region + ' ' + city
END AS location
FROM sales.customers;
SELECT custid ,
country ,
region ,
city ,
country + ' ' + COALESCE(N'', region, N'') + city AS location
FROM sales.Customers;
SELECT custid ,
country ,
region ,
city ,
country + ' ' + ISNULL(region, '') + ' ' + city AS location
FROM sales.Customers;
Here the objective is that if the Region column is Null then just replace that with '' (empty string). Can I use any of the above or does it have any performance issues or which one would you use to accomplish the same result.
Not sure how 'coalesce' works but I do not see the region is being displayed when the value exists. It always puts an empty string. Can some one help me which is the best way to go about to replace the empty strings?
Thanks,
Dev

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

John Rowan SSC Guru Points: 56440 More actions · Answer 1

I prefer the middle example using COALESCE. The problem is that your COALESCE is mal-formed. You don't need the first argument. Change it to this and it should work...

country + ' ' + COALESCE(region, N'') + city AS location

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

Ed Wagner SSC Guru Points: 287026 More actions · Answer 2

You said you weren't sure how the COALESCE function worked. Quite simply, it returns the first non-NULL expression in your list.

For further reading, it's fully covered in Books Online at https://msdn.microsoft.com/en-us/library/ms190349%28v=sql.110%29.aspx, including a comparison of it with ISNULL.

Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions

movvap Old Hand Points: 300 More actions · Answer 3

Thanks John and Ed!

I just ran the below query with out the N in the country + ' ' + coalesce(region, N' ') + ' '+ city as location

select custid, country, region, city

, country + ' ' + coalesce(region, ' ') + ' '+ city as location

from sales.Customers;

and I still get the correct result. I am also trying to understand the the below two select statments

select DATALENGTH(N'abc');

select DATALENGTH('abc');

Can you please help me understand, when to use N and not to use N before the quotes?

Thanks,

Dev

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 4

movvap (2/1/2016)
Thanks John and Ed!
I just ran the below query with out the N in the country + ' ' + coalesce(region, N' ') + ' '+ city as location
select custid, country, region, city
, country + ' ' + coalesce(region, ' ') + ' '+ city as location
from sales.Customers;
and I still get the correct result. I am also trying to understand the the below two select statments
select DATALENGTH(N'abc');
select DATALENGTH('abc');
Can you please help me understand, when to use N and not to use N before the quotes?
Thanks,
Dev

Any string constant enclosed in 'single quotes' is considered to be a varchar value. This uses 1 byte of storage per character and supports a limited character set (128 characters defined in the ASCII standard, plus 128 more based on a codepage-dependent collation).

Any string constant enclused in N'single quotes preceded by capital N' is considered to be an nvarchar value. This uses 2 bytes of storage per character and supports all characters defined in the Unicode UCS-2 character set.

When mixing varchar and nvarchar, SQL Server will use implicit conversion to bump everything to nvarchar, so you will get no errors and correct(*) results. However, as a best practice it is recommended to avoid such implicit conversions where possible.

(*) Not sure if the implicit conversion from varchar to nvarchar will always be correct for all possible characters in all supported codepages

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

esteinbrenner 61443 Valued Member Points: 58 More actions · Answer 5

As was mentioned the problem with your coalesce is the first N'',. Coalesce chooses the first thing that is not null and uses that and the N'' will never be null so it will always be chosen.

As far as which one to choose they are mostly equivalent and any of the three can be used. I tend to prefer isnull because it is easier to type than coalesce and imo is a little easier to read and understand as a simple null check. Coalesce has the advantage of being portable and should run on any major DB while isnull will not (oracle uses NVL for its equivalent of ISNULL). If you did do the case you have the advantage of being able to avoid multiple spaces in a row when the region is null. Probably not a big issue but it is a minor difference between the queries the way you wrote them. You can still accomplish that with isnull or coalesce by including the + ' ' inside the isnull like this ISNULL(region + ' ', '')

movvap Old Hand Points: 300 More actions · Answer 6

Thanks Hugo!

So, if I understand using N in front of the single quotes like N' ' is recommended as oppose to just the single quotes with out N?

I ran this example with and with out N in the len(replace(lastname, 'e','') and the result is same.

select empid,lastname,

len(lastname) - len(replace(lastname, N'e',N'')) as numoccurance

from hr.Employees;

Like in the above example using N in front of the single quotes makes sense?

Thanks,

Dev

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 7

movvap (2/1/2016)
Thanks Hugo!
So, if I understand using N in front of the single quotes like N' ' is recommended as oppose to just the single quotes with out N?
I ran this example with and with out N in the len(replace(lastname, 'e','') and the result is same.
select empid,lastname,
len(lastname) - len(replace(lastname, N'e',N'')) as numoccurance
from hr.Employees;
Like in the above example using N in front of the single quotes makes sense?
Thanks,
Dev

It depends on the data type of the lastname column. If that column has a data type nvarchar or nchar, then use N'e' and N'' so that all data types are unicode strings and no conversion is needed. If you are lazy, then the constants 'e' and '' will be converted from Ascii to Unicode, which will result in a (very tiny) overhead while compiling the query.

On the other hand, if lastname is char or varchar, then using 'e' and '' makes the query run without conversion; using N'e' and N'' will instead force SQL Server to first convert varchar to Unicode, then do the replace, and then convert the result back to Ascii - for every row. That will produce more overhead than the previous case (though still not really something to worry about).

In these two cases the impact is only performance. However there are other cases where the implicit conversions can impact the usability of an index, or even change a very efficient remote query to a terrible one (speaking from personal experience). So my advice is to stick to the generic best practice to avoid all implicit conversions, and to minimize the explicit conversions to the bare minimum.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Ed Wagner SSC Guru Points: 287026 More actions · Answer 8

Hugo Kornelis (2/1/2016)
In these two cases the impact is only performance. However there are other cases where the implicit conversions can impact the usability of an index, or even change a very efficient remote query to a terrible one (speaking from personal experience). So my advice is to stick to the generic best practice to avoid all implicit conversions, and to minimize the explicit conversions to the bare minimum.

Yeah, I've had that personal experience as well. Once you run it out to 1M rows or more, that performance impact is felt more and more. Please listen to Hugo's advice here and just avoid implicit conversions everywhere.

Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions