May 30, 2007 at 3:10 pm
1. "I did it without % because I couldn't find the mod function by name in Books Online" - because % is an operator, not a function
2. "IF (CHARINDEX('.',@DivResultBy3) = 0" - not so good because
a) in the general case, you cannot rely on "." as the default decimal point (you are lucky that SQL Server currently does not have locale settings for numeric format)
b) it is not scalable:
declare @a float
set @a = 3900003
select CHARINDEX('.',@a/3), @a/3
Result:
2 1300001
3. Using CHARINDEX 4 times (and you already used 2 variables!) looks strange
4. Unfortunately, it is just another confirmation of Sergiy's point about certification. I believe something should be done about it.
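For reference, the %-based divisibility test Igor is pointing at avoids string handling entirely. A minimal sketch (the variable name is mine, not from the thread; the thread's 'Bizz'/'Buzz' spellings are kept):

```sql
-- FizzBuzz-style divisibility using the % (modulo) operator:
-- no float division, no string conversion, no CHARINDEX.
DECLARE @i INT;
SET @i = 1;
WHILE @i <= 100
BEGIN
    PRINT CASE
        WHEN @i % 15 = 0 THEN 'BizzBuzz'
        WHEN @i % 5  = 0 THEN 'Buzz'
        WHEN @i % 3  = 0 THEN 'Bizz'
        ELSE CAST(@i AS VARCHAR(10))
    END;
    SET @i = @i + 1;
END
```

The WHILE loop is only for illustration of the operator itself; the set-based versions discussed later in the thread avoid the loop as well.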
May 30, 2007 at 8:03 pm
I did read that post... pretty well thought out and glad to be part of those thoughts. You covered a good number of the points that I was trying to make in a politically correct manner which, I've found, I'm pretty bad at.
So far as the VARCHAR(8/10) thing... try setting it back to 10 and run 4 or 5 times discounting the first run (cache not necessarily in effect on first run)... then set it to 8 and run 4 or 5 times also discounting the first run... on my box, the times are identical. Appreciate the feedback, though. Lemme know how it works out on yours.
The best point of all was your comment about talking with the customer...that's very important especially during the bidding process... if you let them know, up front, that you intend to turn over scalable, high performance, well documented code (as compared to what they may have gotten used to from your competition ), they'll eat right out of your hands and they'll always come back for more.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 30, 2007 at 9:01 pm
My laptop is a 1.8 GHz single AMD Turion 64-bit CPU (though with the 32-bit version of Vista) with 2 GB of RAM, running SQL Server Express 2005 SP2.
My code was exactly as you printed above ..... except I realized that my code was doing 100,000 rows, not 1,000,000.
There is a significant jump in performance when going from 1K to 1M records, but the client statistics shows that it is all in receiving the data from the server, not in processing the query.
Time Statistics
Client processing time 28498
Total execution time 28498
Wait time on server replies 0
May 30, 2007 at 11:05 pm
It runs in 9 sec on mine. I have an Intel dual-core 2.16 GHz, but 4 GB of RAM. Interestingly enough, the fastest set-based solution runs in 15 sec.
May 30, 2007 at 11:34 pm
Robert, to exclude any doubts, do SELECT ... INTO #SomeHashTable.
No time is spent passing data to the client in that case.
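Sergiy's suggestion can be sketched like this (the temp table name and the assumption of a dbo.Tally numbers table with column N are illustrative, borrowed from elsewhere in the thread):

```sql
-- Materialize the result on the server so the measurement reflects
-- query processing only; no rows travel to the client.
SELECT CASE WHEN N % 15 = 0 THEN 'BizzBuzz'
            WHEN N % 5  = 0 THEN 'Buzz'
            WHEN N % 3  = 0 THEN 'Bizz'
            ELSE CAST(N AS VARCHAR(10))
       END AS OutputText
INTO #FizzBuzzResult          -- local temp table, lives on the server
FROM dbo.Tally                -- assumed numbers table, as used in the thread
WHERE N <= 1000000;

DROP TABLE #FizzBuzzResult;
```

With SET STATISTICS TIME ON around the SELECT ... INTO, the "client processing time" component Robert observed drops out of the comparison.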
Igor, you did not mention what generation of technology your HDD uses.
I guess it's a low-RPM single HDD, almost the same as they used 10 years ago. Right?
And you definitely did not compare performance for 10 queries from 10 users running simultaneously.
_____________
Code for TallyGenerator
May 30, 2007 at 11:57 pm
Sergiy, single HDD, but 7200 RPM. And I've used INTO #tempTable for the set-based solution. Sure, I didn't. But, anyway, the point is that for some hardware configurations an in-memory, loop-based solution can work faster.
May 31, 2007 at 1:49 am
I know.
I mentioned in this thread that my split function using a loop was 10-15% faster than Jeff's option using a Numbers table.
But it's still - single user approach.
Remember: when you run loops in the application you put the load on the client's machine (or on one of the web server's machines).
When you run loops in an SP you put the load on the same server all the other clients use.
I know a company which moved their data-managing code from VB to SPs without revising the approaches used in that code. They just transferred VB loops to cursors or WHILE loops.
The result? The server just laid down on the floor and could not bring anything up, despite permanent 100% CPU load.
And, of course, they tested everything before moving to production. Everything worked perfectly. While there was a single test user.
_____________
Code for TallyGenerator
May 31, 2007 at 3:44 am
SELECT * FROM FizzBuzz
N 56°04'39.16"
E 12°55'05.25"
May 31, 2007 at 4:02 am
May 31, 2007 at 5:13 am
Heh... that's funny... but I get what Peter is suggesting with that simple query... pass the data to the client box and let the client box work on the solution. I think that may cause some I/O problems if a lot of people do it, but I'm not an I/O guy... would it be better to pass a million rows to the client in a multi-user environment, or to do the calcs on the server and pass just the result?
I think doing the calcs on the server and passing just the result would be better, overall, but I certainly could be wrong.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 31, 2007 at 5:29 am
As one guy said (don't remember who or where), "It depends"...
_____________
Code for TallyGenerator
May 31, 2007 at 7:53 am
"would it be a better way to pass a million rows to the client in a multi-user environment or do the calcs on the server and pass just the result?"
For this particular problem I don't think it would make a significant difference in terms of I/O, since in either case you're passing a million rows back to the client. If you were aggregating all the rows, or just passing a subset of the rows back (i.e., just the rows marked "Bizz" and "Buzz" for instance), there would be an I/O improvement by eliminating a bunch of rows on the server before passing them back.
As for the efficiency of processing the rows client-side, I'm not sure. It would be an interesting test though.
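The subset idea above can be sketched as follows (again assuming a dbo.Tally numbers table with column N, as referenced elsewhere in the thread):

```sql
-- Return only the 'Bizz'/'Buzz'/'BizzBuzz' rows; the purely numeric
-- rows (8 of every 15) are eliminated on the server and never hit
-- the wire.
SELECT N,
       CASE WHEN N % 15 = 0 THEN 'BizzBuzz'
            WHEN N % 5  = 0 THEN 'Buzz'
            ELSE 'Bizz'            -- WHERE guarantees divisibility by 3 or 5
       END AS OutputText
FROM dbo.Tally
WHERE N <= 1000000
  AND (N % 3 = 0 OR N % 5 = 0);
```

This is the I/O-reduction case described above: the aggregation or filtering happens server-side, so only the interesting subset is returned.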
May 31, 2007 at 8:02 am
Hello Igor,
Thanks for your reply. Here are my replies.
1. "I did it without % because I couldn't find the mod function by name in Books Online" - because % is an operator, not a function
1. OK, true, mod is an operator. That still doesn't explain why I couldn't find mod by searching for '%' or 'mod'. Those search terms don't assume that mod is a function, yet Books Online did not find it. By the way, even if you search for 'operator' in Books Online, the list that comes back has lots of other operators (- (Negative), !< (Not Less Than), etc.) but not % - nor does it list 'mod' or 'modulo'. I'm not using that as an excuse for not knowing that % is an operator and not a function, but I knew what to look for to accomplish the job - I knew that SQL must have something like mod to get the remainder of the division - and the search functionality failed me. Does that mean I should not be certified for SQL Server? Maybe, I don't know.
2. "IF (CHARINDEX('.',@DivResultBy3) = 0" - not so good because
a) in the general case, you cannot rely on "." as the default decimal point (you are lucky that SQL Server currently does not have locale settings for numeric format)
2. a) I knew that CHARINDEX was not so good, but because I could not find the mod operator syntax and did not want to peek ahead in the article or go to Google, I did the best I could. OK, I can't write a mod operator from scratch in under 10 minutes. Maybe that also means I should not be SQL certified - again, I can't say. I see certification as a first step towards learning more and learning the right way, not as a sign that I have Books Online memorized. That is why I like to come to SQL Server Central and learn stuff like what Jeff presented.
b) it is not scalable:
declare @a float
set @a = 3900003
select CHARINDEX('.',@a/3), @a/3
Result:
2 1300001
b) When I ran this in SQL Server 2000 I got
2 1300001.0
b) I'm not sure what you mean that it is not scalable. I knew when I had to resort to the CHARINDEX approach that it was extremely inefficient code (why I ended up using CHARINDEX is explained above). But I am not clear on what you mean by scalability in the above context. Could you elaborate? Do you mean that it sometimes fails - returns a match for '.' for numbers divisible by 3, 5 or 15? If that is the case, that is not scalability, that sounds like a bug! (And thanks for catching it....)
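One likely mechanism behind Igor's result (my reading, not stated in the thread): when a FLOAT is implicitly converted to a character string, SQL Server's default conversion keeps at most six significant digits and switches to scientific notation for larger values, so a '.' appears even when the quotient is a whole number:

```sql
DECLARE @a FLOAT;
SET @a = 3900003;
-- @a/3 = 1300001, a whole number, but the implicit float-to-string
-- conversion inside CHARINDEX produces something like '1.3e+006',
-- so CHARINDEX finds a '.' at position 2.
SELECT CAST(@a/3 AS VARCHAR(30)) AS AsText,
       CHARINDEX('.', @a/3)      AS DotPosition;
```

That would make the "not scalable" remark concrete: the CHARINDEX test works for small quotients but silently breaks once the quotient exceeds six significant digits.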
3. Using CHARINDEX 4 times (and you already used 2 variables!) looks strange
3. Again, see above for why I used CHARINDEX.
4. Unfortunately, it is just another confirmation of Sergiy's point about certification [Sad]. I believe something should be done about it.
4. Perhaps, but as I said, to me certification was a starting point. I can't say that they should not make certification harder, but the fact is that I earned mine and I am happy to have done so.
Thanks,
webrunner
---
webrunner
-------------------
A SQL query walks into a bar and sees two tables. He walks up to them and asks, "Can I join you?"
Ref.: http://tkyte.blogspot.com/2009/02/sql-joke.html
May 31, 2007 at 8:33 am
No need. I'm quite aware that the client processing time is due to processing 1,000,000 recordsets of 1 record instead of 1 recordset of 1,000,000 records.
Ironically, I tested it with SET NOCOUNT ON also, and it took longer to process. Someone explain that one.
Out of curiosity, because I had not seen any solution similar to it, I also tested the following expecting poor results, and I got them. Results were very poor because of the looping.
Declare @Counter int
Declare @Output Table (OutputID int identity(1, 1) not null primary key,
OutputText As Case When OutputID % 15 = 0 Then 'BizzBuzz'
When OutputID % 5 = 0 Then 'Buzz'
When OutputID % 3 = 0 Then 'Bizz'
Else Cast(OutputID as varchar(10)) End)
Set @Counter = 1
While @Counter <= 100
Begin
Insert Into @Output Default Values
Set @Counter = @Counter + 1
End
Select OutputText
From @Output
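For contrast, the set-based shape the loop above is being compared against can be sketched with a tally/numbers table (dbo.Tally with column N is assumed here, as referenced elsewhere in the thread). The computed-column CASE logic is the same; only the row generation changes:

```sql
-- Set-based FizzBuzz: one SELECT against a numbers table instead of
-- 100 looped single-row inserts.
SELECT CASE WHEN N % 15 = 0 THEN 'BizzBuzz'
            WHEN N % 5  = 0 THEN 'Buzz'
            WHEN N % 3  = 0 THEN 'Bizz'
            ELSE CAST(N AS VARCHAR(10))
       END AS OutputText
FROM dbo.Tally
WHERE N <= 100
ORDER BY N;
```

The computed column in the table-variable version above is a reasonable trick; the poor results come from the 100 separate INSERT statements, not from the CASE expression.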
May 31, 2007 at 8:36 am
Hello Jeff,
Thanks for your reply and thanks again for the excellent example you provided. I will run the tests you propose and let you know the outcome.
webrunner
---
webrunner
-------------------
A SQL query walks into a bar and sees two tables. He walks up to them and asks, "Can I join you?"
Ref.: http://tkyte.blogspot.com/2009/02/sql-joke.html