January 2, 2014 at 9:32 am
There's a use case when STR() works better/is easier than other methods of converting numbers to strings - when the numbers are already typed as float or real. CAST and CONVERT have some limitations on converting float and real types to strings (scroll down to Remarks and see the second subheading, "float and real Styles").
I recently wrote about a real-world use case where STR() avoided a problem caused by these limitations of CAST and CONVERT with float and real values.[/url] While the problem could be solved by converting the float/real values to decimal values and then converting them to strings, the STR() function does the same thing without having to code the nested conversions.
Jason Wolfkill
January 2, 2014 at 9:47 am
Jeff Moden (1/9/2011)
Mike McIver (1/9/2011)
Really . . . another ridiculous article! I am responding to this reader's reply to illustrate the problem. Misinformation breeds misinformation . . .
Jeff Moden (1/9/2011)
BWAA-HAA!!!... having a bad day, Mike?Take the same advice you dish out so readily. Look at things from a different point of view. 😉
Wow....
Someone corrects an obvious mistake in your article that that you made, points out your misinformation, and your response is to mock/laugh at them? Did you really accuse this person of not being able to understand a higher purpose of your misinformation? Did you just Chastise them for not understanding your poor word choice or poor research or your own subject matter, as them not seeing things from a different point of view?
This helps me see you from a different point of view Jeff. Very different... :w00t:
January 2, 2014 at 10:41 am
I was burned by this 8 years ago.....never again. Caffiene-deprived, Xanax induced, secondhand THC inhalation.....whatever the state (j/k on the chemical reference--maybe). What is amazing to me is how an article, created to help the SQL community, could incite such scathing grandstanding.
You were quite clear in the beginning of your article regarding your topic; so, don't let the negative feedback get you down. We just cannot please all the people all of the time. There is NOTHING ridiculous about this article.
January 2, 2014 at 10:43 am
PHYData DBA (1/2/2014)
Jeff Moden (1/9/2011)
Mike McIver (1/9/2011)
Really . . . another ridiculous article! I am responding to this reader's reply to illustrate the problem. Misinformation breeds misinformation . . .Jeff Moden (1/9/2011)
BWAA-HAA!!!... having a bad day, Mike?Take the same advice you dish out so readily. Look at things from a different point of view. 😉
Wow....
Someone corrects an obvious mistake in your article that that you made, points out your misinformation, and your response is to mock/laugh at them? Did you really accuse this person of not being able to understand a higher purpose of your misinformation? Did you just Chastise them for not understanding your poor word choice or poor research or your own subject matter, as them not seeing things from a different point of view?
This helps me see you from a different point of view Jeff. Very different... :w00t:
Whoa, easy there, PHYData DBA. No need to heat up a three year old flame war. Jeff may have caused some confusion with the phrase "bad rounding", but he made a valid point about the counterintuitive behavior of the STR() function in some cases (a point that obviously enlightened many readers, as previous posts in this thread demonstrate) even if he misapprehended or misrepresented the cause of that behavior. That mistake warrants a correction based on facts, such as your previous post in this thread that pointed out that STR() doesn't round "badly", it just uses the simplest and least accurate method of rounding a float value. I learned something from your post. Mike McIver, on the other hand, dished out a huge helping of sarcasm with a clear insinuation that the author of the article must be stupid, but never even explained *why* the phrase "bad rounding" was wrong. That kind of post comes off as nothing more than the poster's attempt to assert his superiority and doesn't contribute anything of value to the community. I can't blame Jeff for being a little prickly in response to an attack like that, especially considering that he is one of the most scrupulous members of this forum about backing up his statements with real evidence and data. We would all do well to remember that the purpose of these forums is to share knowledge and that petty bickering defeats that purpose.
Jason Wolfkill
January 2, 2014 at 10:47 am
PHYData DBA (1/2/2014)
Jeff Moden (1/9/2011)
Mike McIver (1/9/2011)
Really . . . another ridiculous article! I am responding to this reader's reply to illustrate the problem. Misinformation breeds misinformation . . .Jeff Moden (1/9/2011)
BWAA-HAA!!!... having a bad day, Mike?Take the same advice you dish out so readily. Look at things from a different point of view. 😉
Wow....
Someone corrects an obvious mistake in your article that that you made, points out your misinformation, and your response is to mock/laugh at them? Did you really accuse this person of not being able to understand a higher purpose of your misinformation? Did you just Chastise them for not understanding your poor word choice or poor research or your own subject matter, as them not seeing things from a different point of view?
This helps me see you from a different point of view Jeff. Very different... :w00t:
I never take exception to constructive criticism. My laugh-off was to summarily dismiss words like "ridiculous" and the general tone of that particular post.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 2, 2014 at 10:55 am
I wish that the old SQL functions would be updated with the times to take into account larger data types.
I know that if you change the behaviour of a function then there are huge implications for legacy systems that depend on a functional quirk.
SQL Server isn't alone in this. SAS Analytics handles BIGINT values in one of two ways.
1. Convert them to a string
2. Discard the least significant digits just as STR()
January 2, 2014 at 9:17 pm
"We all know that formatting data in SQL Server is your basic Bozo-no-no".
No we don't. That's a Luddite position. T-SQL has a fairly rich set of string-manipulation functionality. It's there because it's needed. Use it when appropriate. Extend it with your own user-defined functions whenever you need to. You just need to be aware that it can be relatively slow, compared to other database operations, so if it's going into production you should test it for speed and scalability. If necessary, avoid using it in WHERE clauses, for that's where it would be executed most often, and where it also could well interfere with query optimization. Apply it to the the data flow at a point where it will be executed the least number of times. Use your head. Horses for courses. Don't use a hammer on a screw. Apply the right tool to the job. A blanket "never do it" is as blunt a rule as "use it all the time."
January 2, 2014 at 9:32 pm
Thanks yet again for an insightful article (probably one of the few of your I hadn't read prior). Indeed, a crack was filled.
I was curious what the SQL 2012 FORMAT function might do for these cases:
SELECT 9223372036854775807, STR(9223372036854775807,19), FORMAT(9223372036854775807,'G') UNION ALL
SELECT 922337203685477580, STR(922337203685477580,19), FORMAT(922337203685477580,'G') UNION ALL
SELECT 92233720368547758, STR(92233720368547758,19), FORMAT(92233720368547758,'G') UNION ALL
SELECT 9223372036854775, STR(9223372036854775,19), FORMAT(9223372036854775,'G') UNION ALL
SELECT 922337203685477, STR(922337203685477,19), FORMAT(922337203685477,'G');
SELECT .9223372036854775807, STR(.9223372036854775807,19,19), FORMAT(.9223372036854775807,'G') UNION ALL
SELECT .922337203685477580, STR(.922337203685477580,19,19), FORMAT(.922337203685477580,'G') UNION ALL
SELECT .92233720368547758, STR(.92233720368547758,19,19), FORMAT(.92233720368547758,'G') UNION ALL
SELECT .9223372036854775, STR(.9223372036854775,19,19), FORMAT(.9223372036854775,'G') UNION ALL
SELECT .922337203685477, STR(.922337203685477,19,19), FORMAT(.922337203685477,'G');
Those all seemed OK. I also ran the test using FORMAT(ROUND(SomeNumber,2),'G') in your cteConversions script and that seemed OK also.
Now for the surprise. Drum roll please!
--===== Conditionally drop and rebuild a test table in a nice safe place
-- that everyone has. This only takes several seconds.
IF OBJECT_ID('tempdb..#MyHead','U') IS NOT NULL
DROP TABLE #MyHead
;
SELECT TOP 1000000
SomeNumber = IDENTITY(INT,1,1)
INTO #MyHead
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
--===== Declare a variable which is used to take display time out of the picture
DECLARE @Bitbucket NVARCHAR(10)
;
--===== Identify the test and enable the timer for display
PRINT '========== Cast, Concatenate, and Size ==========';
SET STATISTICS TIME ON
;
--===== Slightly more complex method for right justification is more than
-- twice as fast as STR() formatting
SELECT @Bitbucket = RIGHT('0000000000' + CAST(SomeNumber AS VARCHAR(10)),10)
FROM #MyHead
;
SET STATISTICS TIME OFF
;
--===== Identify the test and enable the timer for display
PRINT '========== The STR() Method is SLOWER ==========';
SET STATISTICS TIME ON
;
SELECT @Bitbucket = STR(SomeNumber,10)
FROM #MyHead
;
--===== Turn the "clock" off
SET STATISTICS TIME OFF
;
--===== Identify the test and enable the timer for display
PRINT '========== The FORMAT() Method is WAY, WAY SLOWER ==========';
SET STATISTICS TIME ON
;
SELECT @Bitbucket = FORMAT(SomeNumber,'G')
FROM #MyHead
;
--===== Turn the "clock" off
SET STATISTICS TIME OFF
;
Got these timing results (on my somewhat more modern laptop):
(1000000 row(s) affected)
========== Cast, Concatenate, and Size ==========
SQL Server Execution Times:
CPU time = 328 ms, elapsed time = 327 ms.
========== The STR() Method is SLOWER ==========
SQL Server Execution Times:
CPU time = 842 ms, elapsed time = 846 ms.
========== The FORMAT() Method is WAY, WAY SLOWER ==========
SQL Server Execution Times:
CPU time = 35038 ms, elapsed time = 40613 ms.
Aaack! :w00t: Why am I not particularly surprised?
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
January 2, 2014 at 9:56 pm
PHYData DBA (1/2/2014)
Jeff Moden (1/9/2011)
Mike McIver (1/9/2011)
Really . . . another ridiculous article! I am responding to this reader's reply to illustrate the problem. Misinformation breeds misinformation . . .Jeff Moden (1/9/2011)
BWAA-HAA!!!... having a bad day, Mike?Take the same advice you dish out so readily. Look at things from a different point of view. 😉
Wow....
Someone corrects an obvious mistake in your article that that you made, points out your misinformation, and your response is to mock/laugh at them? Did you really accuse this person of not being able to understand a higher purpose of your misinformation? Did you just Chastise them for not understanding your poor word choice or poor research or your own subject matter, as them not seeing things from a different point of view?;-)
Ok… you deserve a better answer than the one I just gave.
This article wasn't directed towards the scientific community and it should be pretty obvious to everyone that it wasn't. It was directed towards people that might (like I once did a long time ago) be making the choice to use STR() to format decimal numbers in their everyday reporting requirements. As I was when I first started out, many of them aren't aware of any rounding methods other than "5 rounds up" never mind the five (that I know of) recognized methods of rounding contained in IEEE754, Bankers Rounding, and the proverbial "Except on Tuesdays" rounding. For them, there is no "obvious mistake" in the article. For them (and for me), the information in this article will help keep them (us) out of trouble.
The title of the article is "Hidden Formatting Troubles with STR()" and it explains precisely why it's a problem for the calculations that the targeted audience uses every day and it was demonstrated using repeatable code to show them the kinds of problems they might run into because of the mixing of decimal expectations with an underlying FLOAT datatype hidden in a function. The article does NOT denounce FLOAT as a datatype (although BCD avoids some of those problems while causing MUCH less efficient storage) and it certainly wasn't meant to nor does it contradict an "army of PhD Scientists and Mathematicians". It simply shows what can and will happen when you unknowingly use a function of one datatype to produce an answer in a different expected datatype and it does so in terms that everyone can understand.
When improperly used against decimal expectations, it does, in fact, produce rounded results that are incorrect according to decimal expectations of many that seek a formatting tool/function.
This helps me see you from a different point of view Jeff. Very different... :w00t:
Although technically correct, the "other's" first comment was ugly, unnecessarily brutal and, considering that the supposedly "higher point" was apparently missed in both the obvious title and content of the article by that person, I decided to blow it off rather than respond in a similar fashion nor waste my time arguing with someone that is convinced that their troll-like response was somehow appropriate.
Considering the fact that you have just retorted with choice morsels such as "poor word choice", "misinformation", "poor research", and "higher purpose of your misinformation", you might want to choose a different stone to throw. Besides, despite your sudden righteous indignation to it all, it couldn't be that much "misinformation" because you apparently got the gist of the article…
PHYData DBA (1/2/2014)
Jeff - Great article. I never liked STR() in any language. Hope that it just gets deprecated since there are many better ways to do what it does.
Had that other person also responded in the calm and professional manner that the rest of your post consisted of…
I would like to point out that contrary to the wording in your article STR is not actually rounding incorrectly. It is using the simplest and least accurate mode of rounding a Float that IEEE754 allows; round to zero. This mode gives us the common behavior experienced with float-to-integer conversions in mathematics. This is from the Wikipedia (not the best source on earth but one everybody can get to) article on the subject.
round toward zero (truncation; it is similar to the common behavior of float-to-integer conversions, which convert -3.9 to -3 and 3.9 to 3)
http://en.wikipedia.org/wiki/Floating_point#Rounding_modes
I am not sure why anyone would have a function use this rounding mode for float values without documenting it very well. Unless the of course the person that wrote this TSQL function had no control over its documentation
… my response would have been quite different. It would have been more like…
Excellent information and a great link that leads to many others. It shows just exactly the problem that I'm talking about… if you unwittingly mix datatypes, you can easily be subject to both truncation and rounding that doesn't meet your expectations. It also gives a great explanation for the "imprecision" of the FLOAT and REAL datatypes and why it is acceptable in many cases. In such cases, the STR() function can, indeed, be the appropriate formatting tool.
… but I didn't have the chance until now. 😉
And lest anyone misunderstand, I did do my BOL homework. There is no place in the article where I even hinted that STR is indeterminate. That comment was directed at one of the other respondents to this thread although I strongly believe that person was using the word quite differently than the way it was taken.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 2, 2014 at 10:19 pm
wolfkillj (1/2/2014)
There's a use case when STR() works better/is easier than other methods of converting numbers to strings - when the numbers are already typed as float or real. CAST and CONVERT have some limitations on converting float and real types to strings (scroll down to Remarks and see the second subheading, "float and real Styles").I recently wrote about a real-world use case where STR() avoided a problem caused by these limitations of CAST and CONVERT with float and real values.[/url] While the problem could be solved by converting the float/real values to decimal values and then converting them to strings, the STR() function does the same thing without having to code the nested conversions.
I'm currently working from my 2005 machine, so I couldn't test your good examples but that's one heck of a nice article you have there, Jason. I'll be reading the next article that you provided a link for, as well.
(From your article)
This wrong but plausible result may make it much harder to spot the issue.
THOSE are the words that I was looking for.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 2, 2014 at 11:25 pm
Scott Abrants (1/2/2014)
Jeff,Outstanding article, nicely done with great examples.
I am certain this misuse is rampant in the real world and your examples help clarify where to look for these pitfalls.
Thanks for the great feedback, Scott. Considering how many good folks on this thread said they didn't even know about STR(), I'm not sure that it's used widely as a formatting tool but, as you said, at least they'll know what kind of problems they can run into if they try to use it to display decimal numbers especially when it comes to what a lot of people consider to be correct decimal rounding.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2014 at 12:06 am
sneumersky (1/2/2014)
I was burned by this 8 years ago.....never again. Caffiene-deprived, Xanax induced, secondhand THC inhalation.....whatever the state (j/k on the chemical reference--maybe). What is amazing to me is how an article, created to help the SQL community, could incite such scathing grandstanding.You were quite clear in the beginning of your article regarding your topic; so, don't let the negative feedback get you down. We just cannot please all the people all of the time. There is NOTHING ridiculous about this article.
I have to admit that I'm sometimes surprised at the ad hominem nature and almost malevolent tone of some responses but my biggest concern is that I hope such negativity doesn't put others off from the message of the article.
Thank you very much for the encouragement and for the feedback on the article.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2014 at 8:55 am
jimgawn (1/2/2014)
"We all know that formatting data in SQL Server is your basic Bozo-no-no".No we don't. That's a Luddite position. T-SQL has a fairly rich set of string-manipulation functionality. It's there because it's needed. Use it when appropriate. Extend it with your own user-defined functions whenever you need to. You just need to be aware that it can be relatively slow, compared to other database operations, so if it's going into production you should test it for speed and scalability. If necessary, avoid using it in WHERE clauses, for that's where it would be executed most often, and where it also could well interfere with query optimization. Apply it to the the data flow at a point where it will be executed the least number of times. Use your head. Horses for courses. Don't use a hammer on a screw. Apply the right tool to the job. A blanket "never do it" is as blunt a rule as "use it all the time."
Ah... now there's a mistake that I truly made. The word "know" in the statement you quoted should be in quotes because I do formatting in T-SQL server a lot, especially for output to files that require a certain format. I also do HTML formatting using FOR XML PATH() to generate my morning reports (FOR XML PATH() is REAL handy for generating tr/td sets) as well as certain emails for some of the "C" level folks that like things "purdy". As a bit of a sidebar, the reports that are generated this way are very nearly instantaneous and I don't need to make a trip to SSRS to do it.
The reason I included that statement was to surpress the expected flurry of responses saying that "you shouldn't format in T-SQL". Good responses like yours are also why I included the part you left out in that quote.
...but we sometimes have to do some formatting to output to a file or for some other non-GUI related reason.
So I absolutely agree... saying that you should never format data in T-SQL is a bit of a Luddite position. As with all else in T-SQL, "It Depends".
Shifting gears a bit, permanent table storage of things like formatted dates is something that I avoid like the plague but even that depends on what needs to be accomplished and why.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2014 at 9:06 am
dwain.c (1/2/2014)
Thanks yet again for an insightful article (probably one of the few of your I hadn't read prior). Indeed, a crack was filled.I was curious what the SQL 2012 FORMAT function might do for these cases:
SELECT 9223372036854775807, STR(9223372036854775807,19), FORMAT(9223372036854775807,'G') UNION ALL
SELECT 922337203685477580, STR(922337203685477580,19), FORMAT(922337203685477580,'G') UNION ALL
SELECT 92233720368547758, STR(92233720368547758,19), FORMAT(92233720368547758,'G') UNION ALL
SELECT 9223372036854775, STR(9223372036854775,19), FORMAT(9223372036854775,'G') UNION ALL
SELECT 922337203685477, STR(922337203685477,19), FORMAT(922337203685477,'G');
SELECT .9223372036854775807, STR(.9223372036854775807,19,19), FORMAT(.9223372036854775807,'G') UNION ALL
SELECT .922337203685477580, STR(.922337203685477580,19,19), FORMAT(.922337203685477580,'G') UNION ALL
SELECT .92233720368547758, STR(.92233720368547758,19,19), FORMAT(.92233720368547758,'G') UNION ALL
SELECT .9223372036854775, STR(.9223372036854775,19,19), FORMAT(.9223372036854775,'G') UNION ALL
SELECT .922337203685477, STR(.922337203685477,19,19), FORMAT(.922337203685477,'G');
Those all seemed OK. I also ran the test using FORMAT(ROUND(SomeNumber,2),'G') in your cteConversions script and that seemed OK also.
Now for the surprise. Drum roll please!
--===== Conditionally drop and rebuild a test table in a nice safe place
-- that everyone has. This only takes several seconds.
IF OBJECT_ID('tempdb..#MyHead','U') IS NOT NULL
DROP TABLE #MyHead
;
SELECT TOP 1000000
SomeNumber = IDENTITY(INT,1,1)
INTO #MyHead
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
--===== Declare a variable which is used to take display time out of the picture
DECLARE @Bitbucket NVARCHAR(10)
;
--===== Identify the test and enable the timer for display
PRINT '========== Cast, Concatenate, and Size ==========';
SET STATISTICS TIME ON
;
--===== Slightly more complex method for right justification is more than
-- twice as fast as STR() formatting
SELECT @Bitbucket = RIGHT('0000000000' + CAST(SomeNumber AS VARCHAR(10)),10)
FROM #MyHead
;
SET STATISTICS TIME OFF
;
--===== Identify the test and enable the timer for display
PRINT '========== The STR() Method is SLOWER ==========';
SET STATISTICS TIME ON
;
SELECT @Bitbucket = STR(SomeNumber,10)
FROM #MyHead
;
--===== Turn the "clock" off
SET STATISTICS TIME OFF
;
--===== Identify the test and enable the timer for display
PRINT '========== The FORMAT() Method is WAY, WAY SLOWER ==========';
SET STATISTICS TIME ON
;
SELECT @Bitbucket = FORMAT(SomeNumber,'G')
FROM #MyHead
;
--===== Turn the "clock" off
SET STATISTICS TIME OFF
;
Got these timing results (on my somewhat more modern laptop):
(1000000 row(s) affected)
========== Cast, Concatenate, and Size ==========
SQL Server Execution Times:
CPU time = 328 ms, elapsed time = 327 ms.
========== The STR() Method is SLOWER ==========
SQL Server Execution Times:
CPU time = 842 ms, elapsed time = 846 ms.
========== The FORMAT() Method is WAY, WAY SLOWER ==========
SQL Server Execution Times:
CPU time = 35038 ms, elapsed time = 40613 ms.
Aaack! :w00t: Why am I not particularly surprised?
Ok... that tears it. :blush: I have to get off my dead hinny and install 2K12. Thanks to folks like you, Wayne Sheffield, and a couple of others, I not falling behind in knowledge of the version but I really need to start playing with it because of things like the test you did above.
Considering the comparative slowness of some of the "new" Windowing functions compared to things like the Quirky Update that you and Wayne exposed, I'm not surprised at your findings for FORMAT.
I do have a suggestion though. As I pointed out in the article at http://www.sqlservercentral.com/articles/T-SQL/91724/, SET STATISTICS can have some really undesirable impact on what gets reported for performance. Would you try the same test using SQL Profiler and see if you get similar results?
Thanks for that and thanks for the feedback.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2014 at 11:38 am
Thank you for your gracious response!
With respect to formatted Date tables, I think a lot of data warehousing folks would beg to differ. People like Ralph Kimball advocate using a date table to (a) simplify query-writing and (b) to reduce computational overhead when running queries and generating reports.
Viewing 15 posts - 46 through 60 (of 80 total)
You must be logged in to reply to this topic. Login to reply