February 24, 2010 at 11:29 am
Lynn Pettis (2/24/2010)
reidres (2/23/2010)
Are my modifications fair?
My problem with comparing the two solutions above is that you are actually comparing apples to oranges. Jason's routine is returning the results directly, whereas mine is writing them to disk (#TempTable).
Well we are looking for apples, not oranges. I did the usual modification, plus got rid of the uncalled-for throwing into a temp table. And I would gig you for including an unrequested N field in your code. Time comes in at 1:00 for 150,000,000.
February 24, 2010 at 11:36 am
reidres (2/24/2010)
Lynn Pettis (2/24/2010)
reidres (2/23/2010)
Are my modifications fair?
My problem with comparing the two solutions above is that you are actually comparing apples to oranges. Jason's routine is returning the results directly, whereas mine is writing them to disk (#TempTable).
Well we are looking for apples, not oranges. I did the usual modification, plus got rid of the uncalled-for throwing into a temp table. And I would gig you for including an unrequested N field in your code. Time comes in at 1:00 for 150,000,000.
The N being written to the table allows you to retrieve the data in the proper order by including the necessary ORDER BY clause on the query. Without that, you can't guarantee the data is in the proper order. If you run the code so it does a direct output, the N is not required, as the ORDER BY in the ROW_NUMBER() OVER handles that for you.
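A minimal sketch of the point, with illustrative names rather than the contest code:
CREATE TABLE #Numbers (N INT NOT NULL);
INSERT INTO #Numbers (N)
SELECT TOP (1000000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2;
-- The heap makes no ordering promise; you have to ask for the order on the way back out:
SELECT N FROM #Numbers ORDER BY N;
DROP TABLE #Numbers;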
February 24, 2010 at 11:38 am
tstaker (2/24/2010)
Lynn Pettis (2/24/2010)
I have to disagree here. This is a good way to see how someone thinks. Do they write a cursor or while loop to complete the task, or do they write something set-based? It is also an opportunity to question someone on why they chose their solution.
Before I learned about Tally tables, I would probably have written a cursor or while loop to solve the problem.
I've only used Tally tables when running through the articles on this site and haven't had an opportunity to use them in production. In an interview I would have used the while loop, then explained to the interviewer that T-SQL might not be the best approach. I might also explain how to do it with a CTE and also the tally table concept.
We actually do have a tally table being used in production here, within an SSIS package, to accomplish string parsing. Works wicked fast.
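For what it's worth, here is a quick sketch of the loop-versus-set-based contrast mentioned above (the counts and names are illustrative, not the contest entries):
declare @Top int, @i int;
set @Top = 10000;
set @i = 1;
-- While-loop version: one INSERT per iteration, row by row
declare @Loop table (N int not null);
while @i <= @Top
begin
    insert into @Loop (N) values (@i);
    set @i = @i + 1;
end;
-- Set-based version: derive N from an existing row source in a single statement
select top (@Top) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) as N
from sys.all_columns ac1
cross join sys.all_columns ac2;
The SSIS package itself isn't shown here, but the core tally-based string-split pattern looks something like this (it assumes a dbo.Tally table of sequential integers starting at 1):
declare @s varchar(8000);
set @s = 'red,green,blue';
select substring(@s, t.N, charindex(',', @s + ',', t.N) - t.N) as Element
from dbo.Tally t
where t.N <= len(@s)
  and substring(',' + @s, t.N, 1) = ','; -- true exactly where an element starts in @s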
February 24, 2010 at 11:44 am
String parsing like cleanup? Or extracting elements?
Just started playing with XQuery, just curious how it would compare (now I have to go find out). [rant] Who stores XML data without root elements, within a text field, and then expects people to report on it? [/rant]
---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table: What it is and how it replaces a loop
"stewsterl 80804 (10/16/2009): I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
February 24, 2010 at 11:46 am
jcrawf02 (2/24/2010)
String parsing like cleanup? Or extracting elements? Just started playing with XQuery, just curious how it would compare (now I have to go find out). [rant] Who stores XML data without root elements, within a text field, and then expects people to report on it? [/rant]
Do you really want an answer to that? We could give you a few ideas :Whistling:
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
February 24, 2010 at 11:50 am
CirquedeSQLeil (2/24/2010)
jcrawf02 (2/24/2010)
String parsing like cleanup? Or extracting elements? Just started playing with XQuery, just curious how it would compare (now I have to go find out). [rant] Who stores XML data without root elements, within a text field, and then expects people to report on it? [/rant]
Do you really want an answer to that? We could give you a few ideas :Whistling:
No, it's a sheep.
---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table: What it is and how it replaces a loop
"stewsterl 80804 (10/16/2009): I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
February 24, 2010 at 11:57 am
jcrawf02 (2/24/2010)
CirquedeSQLeil (2/24/2010)
jcrawf02 (2/24/2010)
String parsing like cleanup? Or extracting elements? Just started playing with XQuery, just curious how it would compare (now I have to go find out). [rant] Who stores XML data without root elements, within a text field, and then expects people to report on it? [/rant]
Do you really want an answer to that? We could give you a few ideas :Whistling:
No, it's a sheep.
Why is XML the basis of so many IT horror stories???
I think it is because it is so good for its intended purpose that many people try to apply it to all problems.
Like Jeff trying to put his pork chops onto a Tally Table!!!
Gaz
-- Stop your grinnin' and drop your linen...they're everywhere!!!
February 24, 2010 at 12:02 pm
Actually, it's being used to parse HTML, IIRC. I didn't write the code; I shared my knowledge of the Tally table with a co-worker, who implemented it.
February 24, 2010 at 12:08 pm
tstaker (2/24/2010)
I've only used Tally tables when running through the articles on this site and haven't had an opportunity to use them in production. In an interview I would have used the while loop, then explained to the interviewer that T-SQL might not be the best approach. I might also explain how to do it with a CTE and also the tally table concept.
I've used them in production systems where response time is important. In one case, getting the average response at the GUI down from 0.5 seconds to 0.3 seconds for one particular operation, while also bringing down the relative variance (= V/(M*M), i.e. variance divided by the squared mean) of that response time, was pretty useful. In others the results were less spectacular but improved scalability, so that for some tenders we could quote significantly less hardware cost than without tally tables.
Tom
February 24, 2010 at 12:12 pm
The N being written to the table allows you to retrieve the data in the proper order by including the necessary ORDER BY clause on the query. Without that, you can't guarantee the data is in the proper order. If you run the code so it does a direct output, the N is not required, as the ORDER BY in the ROW_NUMBER() OVER handles that for you.
Based on the specs as stated, there is every reason to run it as direct output unless you can do it faster some other way. If it doesn't make it faster or more scalable, any reason for doing something that takes away from the goal of speed and scalability should be vigorously set aside, and that thing not done.
February 24, 2010 at 12:22 pm
reidres (2/24/2010)
The N being written to the table allows you to retrieve the data in the proper order by including the necessary ORDER BY clause on the query. Without that, you can't guarantee the data is in the proper order. If you run the code so it does a direct output, the N is not required, as the ORDER BY in the ROW_NUMBER() OVER handles that for you.
Based on the specs as stated, there is every reason to run it as direct output unless you can do it faster some other way. If it doesn't make it faster or more scalable, any reason for doing something that takes away from the goal of speed and scalability should be vigorously set aside, and that thing not done.
Actually, the spec isn't specific in that regard; it just says to generate the output. I chose to put it into a temp table such that the "display" wasn't the great equalizer.
I have run the code provided so far with direct output, and that has been the case: displaying the output as it runs makes all the routines nominally the same. No one routine seems to win every time. Of course, YMMV.
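For anyone reproducing these tests, one way to take the display out of the equation without a temp table is to assign the output to a throwaway variable, so every row is produced but nothing is rendered. A minimal sketch, using a stand-in number source rather than any of the contest entries:
declare @Bitbucket bigint;
set statistics time on;
select @Bitbucket = x.N -- assignment SELECT: rows are generated but never sent to the client
from (select top (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) as N
      from sys.all_columns ac1 cross join sys.all_columns ac2) x;
set statistics time off;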
February 24, 2010 at 1:40 pm
reidres (2/24/2010)
Well we are looking for apples, not oranges.
I'm slightly concerned about the apples and oranges issue: is what you are measuring just generating a cte with the numbers in and then counting them?
If so, I'm a bit surprised by the results you have quoted: I modified Lynn's version to just count instead of copying to a temp table, and on my machine that does 150,000,000 in roughly a quarter of the time Jason's takes to do its maximum 100,000,000. I'm wondering just what it is that I've got horribly wrong in copying the code, or maybe I've copied the wrong version from the wrong place.
Anyway, just for fun, here's an entry from me (based on the assumption that we are just generating the CTE and counting it). On my machine it's about 7% faster than Lynn's - assuming I'm measuring the right thing (maybe only because my machine is on my side - it's a very faithful old beast :hehe:).
use play -- choose a database to work in
go
declare @Limit bigint;
set @Limit = 150000000;
set statistics time on;
with some_columns as -- just as many as we need to get to the limit (cube root)
-- 1+POWER because @Limit is an int so POWER will return an int and may round down :(
(SELECT top (1+power(@Limit,1.0/3.0)) 0 as Z from Master.sys.all_columns),
numbers (J) as
(SELECT top (@Limit) ROW_NUMBER() OVER (ORDER BY AC1.Z) AS J
from some_columns AC1
CROSS JOIN some_columns AC2
CROSS JOIN some_columns AC3
)
select COUNT(*) from numbers
set statistics time off
go
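For the curious, the cube-root arithmetic works out like this: POWER(150000000, 1.0/3.0) truncates to 531, so the TOP clause takes 1 + 531 = 532 rows, and 532 * 532 * 532 = 150,568,768, just over the limit; the inner TOP (@Limit) then trims the cross join back to exactly 150,000,000 rows.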
Tom
February 24, 2010 at 1:49 pm
Tom.Thomson (2/24/2010)
reidres (2/24/2010)
Well we are looking for apples, not oranges.
I'm slightly concerned about the apples and oranges issue: is what you are measuring just generating a cte with the numbers in and then counting them?
If so, I'm a bit surprised by the results you have quoted: I modified Lynn's version to just count instead of copying to a temp table, and on my machine that does 150,000,000 in roughly a quarter of the time Jason's takes to do its maximum 100,000,000. I'm wondering just what it is that I've got horribly wrong in copying the code, or maybe I've copied the wrong version from the wrong place.
Anyway, just for fun, here's an entry from me (based on the assumption that we are just generating the CTE and counting it). On my machine it's about 7% faster than Lynn's - assuming I'm measuring the right thing (maybe only because my machine is on my side - it's a very faithful old beast :hehe:).
use play -- choose a database to work in
go
declare @Limit bigint;
set @Limit = 150000000;
set statistics time on;
with some_columns as -- just as many as we need to get to the limit (cube root)
-- 1+POWER because @Limit is an int so POWER will return an int and may round down :(
(SELECT top (1+power(@Limit,1.0/3.0)) 0 as Z from Master.sys.all_columns),
numbers (J) as
(SELECT top (@Limit) ROW_NUMBER() OVER (ORDER BY AC1.Z) AS J
from some_columns AC1
CROSS JOIN some_columns AC2
CROSS JOIN some_columns AC3
)
select COUNT(*) from numbers
set statistics time off
go
Don't know, Tom, on the server I am using here at work yours actually runs about 4 times longer than my code, about 4 seconds.
Really comes down to "It Depends".
February 24, 2010 at 1:58 pm
There was an update to the code - I didn't post it to the contest, just in this thread.
That is interesting that it ran that much longer on your system than Lynn's. Lynn and I have been going back and forth and didn't see any results quite like that.
OIC. As just a count, that is quite possible. We hadn't been testing doing just a count; we were testing the total time to display the entire list.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
February 24, 2010 at 2:17 pm
Interesting, Tom. I will study this. On my machine it comes in at 0:07 (seven seconds!) compared to Lynn's 1:03 (1 minute) without a temp table and 2:47 with a temp table.
The idea of my tests is that, unless you are doing something magic with converting the relevant mod numbers, the generation of the numbers will be the limiting factor when it comes to speed. But that assumption is subject to revision if information reveals it to be faulty.