December 22, 2015 at 12:10 pm
ScottPletcher (12/22/2015)
Edit: Not at all "moving the goalposts". The only reason for the rule is the issue it causes if you get the type wrong, as I stated many times above. You're trying to move the goalposts by saying changes almost never occur, so the re-work involved is not significant. Even that may not be true -- it's certainly possible that changing one heavily-used column could cause a lot of re-work. Let me guess, then your answer is "don't change it". Ok, but then all that disk space is still wasted.
We are changing many columns for nvar to var, as I noted above; at one time there seemed to be a trend to use nvarchar for many more columns.
But what does it cost me to use a varchar literal instead of an nvarchar literal? Nothing. But if i get it wrong, the cost is huge. Not worth the risk.
Not to mention, I also have to know for every column whether it is varchar or nvarchar so I can write the data-type-specific literals for it. That effort, too, for no gain whatsoever.
Struggle to understand why one would have such frequent schema changes as you are implying, are you doing code first Entity Framework development?
😎
I can count the instances when I've had to change data types within schemas which I've designed on the fingers of one hand, rarely happens and when it does there is a change and impact management processes in place.
As I stated before, I've never seen correct data typing and the absence of an implicit conversion cause a problem, why on earth should one then suggest something that might?
December 22, 2015 at 12:28 pm
Eirikur Eiriksson (12/22/2015)
ScottPletcher (12/22/2015)
Edit: Not at all "moving the goalposts". The only reason for the rule is the issue it causes if you get the type wrong, as I stated many times above. You're trying to move the goalposts by saying changes almost never occur, so the re-work involved is not significant. Even that may not be true -- it's certainly possible that changing one heavily-used column could cause a lot of re-work. Let me guess, then your answer is "don't change it". Ok, but then all that disk space is still wasted.
We are changing many columns for nvar to var, as I noted above; at one time there seemed to be a trend to use nvarchar for many more columns.
But what does it cost me to use a varchar literal instead of an nvarchar literal? Nothing. But if i get it wrong, the cost is huge. Not worth the risk.
Not to mention, I also have to know for every column whether it is varchar or nvarchar so I can write the data-type-specific literals for it. That effort, too, for no gain whatsoever.
Struggle to understand why one would have such frequent schema changes as you are implying, are you doing code first Entity Framework development?
😎
I can count the instances when I've had to change data types within schemas which I've designed on the fingers of one hand, rarely happens and when it does there is a change and impact management processes in place.
As I stated before, I've never seen correct data typing and the absence of an implicit conversion cause a problem, why on earth should one then suggest something that might?
You've never converted a column from nvarchar to varchar? That's very unusual. It usually comes up at least a few times, particularly in older systems. Not for the system tables, those should never change, but many columns in user tables can save space. I've even seen ids and codes that are unicode for who knows what reason.
It's not about "absence of an implicit conversion". It's specifically about coding N' on char literals. Why all the extreme vitriol that "it must be done"? Just to follow a general "rule" about "always" using the "matching" data type? I strongly suspect that's not done for all other data types anyway.
Even then, experts are supposed to know when general rules don't work best and adjust them as needed.
Seriously, what is the downside to leaving off the N'? I see none, whether the column is char or nchar. But I see a huge downside to explicitly specifying it when the table column is char only.
So would you reject code for production that was like this:
SELECT * FROM sys.tables WHERE name = 'abc'
In this case, I would not object to:
SELECT * FROM sys.tables WHERE name = N'abc'
because the system tables should truly always be nvarchar. But I also wouldn't object to the first style of coding. Would you force it to be "corrected" before going into production?
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 22, 2015 at 3:11 pm
ScottPletcher (12/22/2015)
Lynn Pettis (12/22/2015)
ScottPletcher (12/22/2015)
Lynn Pettis (12/22/2015)
ScottPletcher (12/22/2015)
Lynn Pettis (12/22/2015)
ScottPletcher (12/22/2015)
Of course when scanning the whole table the difference may be rather minor. It's the prevention of seeks that is always the major concern with implicit column conversions. I had to use a permanent tally table in the code below, but naturally you can use an in-line one if you prefer:
SET STATISTICS IO OFF
CREATE TABLE #tbl1 ( col1 varchar(7) PRIMARY KEY WITH ( FILLFACTOR = 100 ) )
GO
INSERT INTO #tbl1
SELECT RIGHT(REPLICATE('0', 7) + CAST(t.tally AS varchar(7)), 7)
FROM dbo.tally t
WHERE t.tally BETWEEN 1 AND 1000000
GO
SET STATISTICS IO ON
SELECT t1.* FROM #tbl1 t1 WHERE t1.col1 = '0987654' --3 logical I/Os
SELECT t1.* FROM #tbl1 t1 WHERE t1.col1 = N'0987654' --2484 logical I/Os
Of course there were more logical I/Os in the second query. You are using the wrong data type. You are comparing a literal NVARCHAR string to a value in a VARCHAR column which requires SQL Server to implicitly convert EACH AND EVERY varchar value to an nvarchar value before doing the compare.
If you change the data type in the table to nvarchar you get 3 logical reads in both queries. This is because SQL Server will do an implicit conversion of the literal varchar string to nvarchar ONCE, and then use the index to find the value.
The moral here is actually to use the correct data type in your queries.
I stated exactly the first part earlier.
As to the moral, I strongly disagree in the case of unicode. Do not explicitly specify it unless you have to. SQL implicitly converting a few literals is no issue at all, but forcing unicode when you shouldn't can be a major performance killer.
And that is where I will have to disagree with you. If the data type is declared as NVARCHAR or NCHAR then you should use N'astring' when specifying a Unicode literal string whether it contains actual Unicode values or not. If you don't want to that, that is your choice.
But then if you change the column back to varchar, as we and many others are doing to save space where possible, you have to go back and change all the code. The potential gain from using N' is effectively zero, the potential loss is huge, in performance and in re-work. That is indeed your choice, but at this company, we don't have time for useless re-work, we really don't.
You know what, I don't give a rats tails *** what you do. I will use the appropriate data types. Don't tell me I am wrong when I do.
Well, I guess it's not if you're a contractor, particularly a govt one. In that case, "standard practice" seems to be to build bugs into the code that you can correct later for an additional fee.
I do work for a company that is a government contractor, and my problem is correcting SQL code that is performance/scalability hindered because developers weren't using proper data types. Yes, there may be exceptions, but I'd rather deal with those than have developers who think they know what the exceptions are dealing with it. My biggest problem is that the developers don't use me as resource like they should and I am not directly involved in the development process and the developers do their own code reviews and I don't become aware of the issues until the code has gone through all the necessary government testing/approval processes and makes it into production.
People new to SQL Server and T-SQL should learn to do things correctly and then learn where the exceptions are as they get more experienced.
December 22, 2015 at 3:47 pm
ScottPletcher (12/22/2015)
Seriously, what is the downside to leaving off the N'? I see none, whether the column is char or nchar. But I see a huge downside to explicitly specifying it when the table column is char only.So would you reject code for production that was like this:
SELECT * FROM sys.tables WHERE name = 'abc'
In this case, I would not object to:
SELECT * FROM sys.tables WHERE name = N'abc'
because the system tables should truly always be nvarchar. But I also wouldn't object to the first style of coding. Would you force it to be "corrected" before going into production?
Seriously, the world is much bigger than the little scope you live in.
Can you imagine, sysname is a Unicode data type because SQL Server is sold in many countries where people use non-English identifiers for objects, columns, parameters, etc.
Try this:
IF OBJECT_ID(N'???????') IS NOT NULL DROP TABLE [???????]
CREATE TABLE [???????] (
[???????1] INT,
[???????2] NVARCHAR(50)
)
SELECT OBJECT_ID('???????')
SELECT OBJECT_ID(N'???????')
SELECT * FROM sys.tables WHERE name = '???????'
SELECT * FROM sys.tables WHERE name = N'???????'
Remove the "N" from the initial check and re-run the script.
How does it go?
Does it give you a hint?
_____________
Code for TallyGenerator
December 22, 2015 at 3:48 pm
What is the difference how fast your query returns a result if you cannot be sure if it's correct?
_____________
Code for TallyGenerator
December 22, 2015 at 3:50 pm
Lynn Pettis (12/22/2015)
ScottPletcher (12/22/2015)
Lynn Pettis (12/22/2015)
ScottPletcher (12/22/2015)
Lynn Pettis (12/22/2015)
ScottPletcher (12/22/2015)
Lynn Pettis (12/22/2015)
ScottPletcher (12/22/2015)
Of course when scanning the whole table the difference may be rather minor. It's the prevention of seeks that is always the major concern with implicit column conversions. I had to use a permanent tally table in the code below, but naturally you can use an in-line one if you prefer:
SET STATISTICS IO OFF
CREATE TABLE #tbl1 ( col1 varchar(7) PRIMARY KEY WITH ( FILLFACTOR = 100 ) )
GO
INSERT INTO #tbl1
SELECT RIGHT(REPLICATE('0', 7) + CAST(t.tally AS varchar(7)), 7)
FROM dbo.tally t
WHERE t.tally BETWEEN 1 AND 1000000
GO
SET STATISTICS IO ON
SELECT t1.* FROM #tbl1 t1 WHERE t1.col1 = '0987654' --3 logical I/Os
SELECT t1.* FROM #tbl1 t1 WHERE t1.col1 = N'0987654' --2484 logical I/Os
Of course there were more logical I/Os in the second query. You are using the wrong data type. You are comparing a literal NVARCHAR string to a value in a VARCHAR column which requires SQL Server to implicitly convert EACH AND EVERY varchar value to an nvarchar value before doing the compare.
If you change the data type in the table to nvarchar you get 3 logical reads in both queries. This is because SQL Server will do an implicit conversion of the literal varchar string to nvarchar ONCE, and then use the index to find the value.
The moral here is actually to use the correct data type in your queries.
I stated exactly the first part earlier.
As to the moral, I strongly disagree in the case of unicode. Do not explicitly specify it unless you have to. SQL implicitly converting a few literals is no issue at all, but forcing unicode when you shouldn't can be a major performance killer.
And that is where I will have to disagree with you. If the data type is declared as NVARCHAR or NCHAR then you should use N'astring' when specifying a Unicode literal string whether it contains actual Unicode values or not. If you don't want to that, that is your choice.
But then if you change the column back to varchar, as we and many others are doing to save space where possible, you have to go back and change all the code. The potential gain from using N' is effectively zero, the potential loss is huge, in performance and in re-work. That is indeed your choice, but at this company, we don't have time for useless re-work, we really don't.
You know what, I don't give a rats tails *** what you do. I will use the appropriate data types. Don't tell me I am wrong when I do.
Well, I guess it's not if you're a contractor, particularly a govt one. In that case, "standard practice" seems to be to build bugs into the code that you can correct later for an additional fee.
I do work for a company that is a government contractor, and my problem is correcting SQL code that is performance/scalability hindered because developers weren't using proper data types. Yes, there may be exceptions, but I'd rather deal with those than have developers who think they know what the exceptions are dealing with it. My biggest problem is that the developers don't use me as resource like they should and I am not directly involved in the development process and the developers do their own code reviews and I don't become aware of the issues until the code has gone through all the necessary government testing/approval processes and makes it into production.
People new to SQL Server and T-SQL should learn to do things correctly and then learn where the exceptions are as they get more experienced.
Interesting approach.
I believe not using unicode literals except when required is "do[ing] things correctly". In fact, I provide a list of rules (required) and guidelines (optional but preferred) for SQL developers, as I don't have time to meet with each one. Rules such as:
-- Don't use a unicode literal unless required to.
-- When providing literal dates, code them as 'YYYYMMDD', datetimes as 'YYYYMMDD hh:mm:ss[.ssssss]", where hh = 24 hour-based.
-- Never use functions in a WHERE or JOIN unless absolutely necessary. Never use ISNULL in a WHERE or JOIN, period, as it can be always be effectively coded around.
-- Whenever possible, specify a clustering key value or range, even if it seems unnecessary, as it will help the optimizer produce the best query plan.
And so on.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 22, 2015 at 3:56 pm
Sergiy (12/22/2015)
ScottPletcher (12/22/2015)
Seriously, what is the downside to leaving off the N'? I see none, whether the column is char or nchar. But I see a huge downside to explicitly specifying it when the table column is char only.So would you reject code for production that was like this:
SELECT * FROM sys.tables WHERE name = 'abc'
In this case, I would not object to:
SELECT * FROM sys.tables WHERE name = N'abc'
because the system tables should truly always be nvarchar. But I also wouldn't object to the first style of coding. Would you force it to be "corrected" before going into production?
Seriously, the world is much bigger than the little scope you live in.
Can you imagine, sysname is a Unicode data type because SQL Server is sold in many countries where people use non-English identifiers for objects, columns, parameters, etc.
Try this:
IF OBJECT_ID(N'???????') IS NOT NULL DROP TABLE [???????]
CREATE TABLE [???????] (
[???????1] INT,
[???????2] NVARCHAR(50)
)
SELECT OBJECT_ID('???????')
SELECT OBJECT_ID(N'???????')
SELECT * FROM sys.tables WHERE name = '???????'
SELECT * FROM sys.tables WHERE name = N'???????'
Remove the "N" from the initial check and re-run the script.
How does it go?
Does it give you a hint?
No, it's your usual non sequiturs. I very clearly stated that when N' is required, it should be used. When extended characters are present, it will obviously be required. If it's possible extended chars will actually be present, unicode should be used. But when it's a hard-coded literal without extended chars -- which is what the actual example was and the discussion was about -- then using N' can cause serious issues but doesn't gain anything.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 22, 2015 at 3:56 pm
Sergiy (12/22/2015)
ScottPletcher (12/22/2015)
Seriously, what is the downside to leaving off the N'? I see none, whether the column is char or nchar. But I see a huge downside to explicitly specifying it when the table column is char only.So would you reject code for production that was like this:
SELECT * FROM sys.tables WHERE name = 'abc'
In this case, I would not object to:
SELECT * FROM sys.tables WHERE name = N'abc'
because the system tables should truly always be nvarchar. But I also wouldn't object to the first style of coding. Would you force it to be "corrected" before going into production?
Seriously, the world is much bigger than the little scope you live in.
Can you imagine, sysname is a Unicode data type because SQL Server is sold in many countries where people use non-English identifiers for objects, columns, parameters, etc.
Try this:
IF OBJECT_ID(N'???????') IS NOT NULL DROP TABLE [???????]
CREATE TABLE [???????] (
[???????1] INT,
[???????2] NVARCHAR(50)
)
SELECT OBJECT_ID('???????')
SELECT OBJECT_ID(N'???????')
SELECT * FROM sys.tables WHERE name = '???????'
SELECT * FROM sys.tables WHERE name = N'???????'
Remove the "N" from the initial check and re-run the script.
How does it go?
Does it give you a hint?
No, it's your usual non sequiturs. I very clearly stated that when N' is required, it should be used. When extended characters are present, it will obviously be required. If it's possible extended chars will actually be present, unicode should be used. But when it's a hard-coded literal without extended chars -- which is what the actual example was and the discussion was about -- then using N' can cause serious issues but doesn't gain anything.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 22, 2015 at 4:17 pm
ScottPletcher (12/22/2015)
Lynn Pettis (12/22/2015)
ScottPletcher (12/22/2015)
Lynn Pettis (12/22/2015)
ScottPletcher (12/22/2015)
Lynn Pettis (12/22/2015)
ScottPletcher (12/22/2015)
Lynn Pettis (12/22/2015)
ScottPletcher (12/22/2015)
Of course when scanning the whole table the difference may be rather minor. It's the prevention of seeks that is always the major concern with implicit column conversions. I had to use a permanent tally table in the code below, but naturally you can use an in-line one if you prefer:
SET STATISTICS IO OFF
CREATE TABLE #tbl1 ( col1 varchar(7) PRIMARY KEY WITH ( FILLFACTOR = 100 ) )
GO
INSERT INTO #tbl1
SELECT RIGHT(REPLICATE('0', 7) + CAST(t.tally AS varchar(7)), 7)
FROM dbo.tally t
WHERE t.tally BETWEEN 1 AND 1000000
GO
SET STATISTICS IO ON
SELECT t1.* FROM #tbl1 t1 WHERE t1.col1 = '0987654' --3 logical I/Os
SELECT t1.* FROM #tbl1 t1 WHERE t1.col1 = N'0987654' --2484 logical I/Os
Of course there were more logical I/Os in the second query. You are using the wrong data type. You are comparing a literal NVARCHAR string to a value in a VARCHAR column which requires SQL Server to implicitly convert EACH AND EVERY varchar value to an nvarchar value before doing the compare.
If you change the data type in the table to nvarchar you get 3 logical reads in both queries. This is because SQL Server will do an implicit conversion of the literal varchar string to nvarchar ONCE, and then use the index to find the value.
The moral here is actually to use the correct data type in your queries.
I stated exactly the first part earlier.
As to the moral, I strongly disagree in the case of unicode. Do not explicitly specify it unless you have to. SQL implicitly converting a few literals is no issue at all, but forcing unicode when you shouldn't can be a major performance killer.
And that is where I will have to disagree with you. If the data type is declared as NVARCHAR or NCHAR then you should use N'astring' when specifying a Unicode literal string whether it contains actual Unicode values or not. If you don't want to that, that is your choice.
But then if you change the column back to varchar, as we and many others are doing to save space where possible, you have to go back and change all the code. The potential gain from using N' is effectively zero, the potential loss is huge, in performance and in re-work. That is indeed your choice, but at this company, we don't have time for useless re-work, we really don't.
You know what, I don't give a rats tails *** what you do. I will use the appropriate data types. Don't tell me I am wrong when I do.
Well, I guess it's not if you're a contractor, particularly a govt one. In that case, "standard practice" seems to be to build bugs into the code that you can correct later for an additional fee.
I do work for a company that is a government contractor, and my problem is correcting SQL code that is performance/scalability hindered because developers weren't using proper data types. Yes, there may be exceptions, but I'd rather deal with those than have developers who think they know what the exceptions are dealing with it. My biggest problem is that the developers don't use me as resource like they should and I am not directly involved in the development process and the developers do their own code reviews and I don't become aware of the issues until the code has gone through all the necessary government testing/approval processes and makes it into production.
People new to SQL Server and T-SQL should learn to do things correctly and then learn where the exceptions are as they get more experienced.
Interesting approach.
I believe not using unicode literals except when required is "do[ing] things correctly". In fact, I provide a list of rules (required) and guidelines (optional but preferred) for SQL developers, as I don't have time to meet with each one. Rules such as:
-- Don't use a unicode literal unless required to.
-- When providing literal dates, code them as 'YYYYMMDD', datetimes as 'YYYYMMDD hh:mm:ss[.ssssss]", where hh = 24 hour-based.
-- Never use functions in a WHERE or JOIN unless absolutely necessary. Never use ISNULL in a WHERE or JOIN, period, as it can be always be effectively coded around.
-- Whenever possible, specify a clustering key value or range, even if it seems unnecessary, as it will help the optimizer produce the best query plan.
And so on.
You do it your way, that's fine. If a string literal is being assigned to a Unicode variable or column or is being compared to a Unicode column, I will precede it with N' so that others seeing the code will know that the value needs to be Unicode. I don't care if the initial value itself isn't, the possibility that it might have to be in the future is enough to do it right from the start.
Unfortunately I have little say in the development process or I would provide such guidelines for the developers. If asked, I will help in the writing of SQL code. If not, not much I can do until it gets into production and I get called to help fix the problem(s) from poorly written code.
December 22, 2015 at 4:56 pm
ScottPletcher (12/22/2015)
But when it's a hard-coded literal
If it's in production code - the developer who did it must be sacked.
If it's in a dynamic SQL - you cannot know what kind of literal will be placed there tomorrow. Must use "N" prefix to prevent errors.
_____________
Code for TallyGenerator
December 22, 2015 at 4:58 pm
Sergiy (12/22/2015)
What is the difference how fast your query returns a result if you cannot be sure if it's correct?
😉
faster the better...........oops
😎
December 23, 2015 at 9:12 am
Sergiy (12/22/2015)
ScottPletcher (12/22/2015)
But when it's a hard-coded literalIf it's in production code - the developer who did it must be sacked.
If it's in a dynamic SQL - you cannot know what kind of literal will be placed there tomorrow. Must use "N" prefix to prevent errors.
I don't agree with your first statement. I use hardcoded literals in code at times. It may be to provide default values in optional parameters in a stored procedure, or it could be to restrict a result set to specific data where there may be multiple types or codes and the requirement is always for specific types or codes.
Viewing 12 posts - 31 through 41 (of 41 total)
You must be logged in to reply to this topic. Login to reply