June 13, 2008 at 11:51 am
I was able to use the proc. This is GREAT. The only thing I had to change was to use COALESCE rather than ISNULL. Below is the syntax:
COALESCE((SELECT COUNT(*) FROM @item_table),0 )
June 13, 2008 at 12:01 pm
Bobbie Shrivastav (6/13/2008)
I was able to use the proc. This is GREAT. The only thing I had to change was to use COALESCE rather than ISNULL. Below is the syntax:
COALESCE((SELECT COUNT(*) FROM @item_table),0 )
When have you EVER known COUNT(*) to return a NULL??? The COALESCE/ISNULL is a total waste of clock cycles for this... Here's the proof...
DECLARE @Item_Table TABLE (RowNum INT, SomethingElse VARCHAR(20))
SELECT COUNT(*) FROM @Item_Table
-----------
0
(1 row(s) affected)
--Jeff Moden
Change is inevitable... Change for the better is not.
June 13, 2008 at 12:02 pm
Good point on the SQL 2000 vs SQL 2005. My first attempt was against 2005. I tried again against a SQL 2000 database and got very similar results. The cursor was much faster than the while loop.
Just to give you an idea of the timings (on SQL 2000 Enterprise):
100 records
Cursor: 16 ms
While: 16 ms
500 records
Cursor: 60 ms
While: 123 ms
1,000 records
Cursor: 93 ms
While: 390 ms
5,000 records
Cursor: 486 ms
While: 9,123 ms
10,000 records
Cursor: 986 ms
While: 39,046 ms
65,526 records (entire dataset)
Cursor: 9,406 ms
While: 1,200,000 ms and still counting
If you look at the tenfold increases from 100 -> 1,000 -> 10,000 records, the cursor's processing time scales roughly in line with the amount of data (6x for 100 -> 1,000 and 11x for 1,000 -> 10,000). Even going from 100 records to 65,526 is a 655x increase in rows but only about a 590x increase in processing time. It still looks pretty much proportional.
The while loop, on the other hand, shows more of a geometric pattern, where processing time grows substantially faster than the data (24x for 100 -> 1,000, 100x for 1,000 -> 10,000, 30x+ for 10,000 -> 65,526).
My guess is that this increase is due to the lack of an index on the primary key field of the in-memory table: every WHERE primary_key = @counter lookup has to scan the whole table variable, so the loop cost grows roughly with the square of the row count.
EDIT:
Just finished the while loop over the entire recordset: 1,494,640 ms, which is a 38x increase in processing time for 6.5 times the data.
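A minimal sketch of the kind of index being guessed at above, assuming an identity key column on the table variable (the actual declaration used in the follow-up tests appears later in this thread):
DECLARE @item_table TABLE
(
    -- a clustered PK turns each WHERE primary_key = @counter lookup
    -- into a seek instead of a full scan of the table variable
    primary_key INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    SF float
)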
June 13, 2008 at 12:08 pm
Chris.Strolia-Davis (6/13/2008)
while (select count(*) from @table) > 0
begin
    select top 1 'do something', @id=id from @table
    delete from @table where id=@id
end
Never use count(*) for this purpose;
every time it is run it scans the entire table!
Use the much, much more efficient EXISTS clause:
while EXISTS (select * from @table)
begin
    select top 1 'do something', @id=id from @table
    delete from @table where id=@id
end
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables
Persisting SQL Server Index-Usage Statistics with MERGE
Turbocharge Your Database Maintenance With Service Broker: Part 2
June 13, 2008 at 12:18 pm
OK, so just for fun I tried adding a clustered primary key to my in-memory table. The difference is substantial.
New times for the while loop:
100 records
Cursor: 16 ms
While: 16 ms
While with Index: 16 ms
500 records
Cursor: 60 ms
While: 123 ms
While with Index: 46 ms
1,000 records
Cursor: 93 ms
While: 390 ms
While with Index: 110 ms
5,000 records
Cursor: 486 ms
While: 9,123 ms
While with Index: 533 ms
10,000 records
Cursor: 986 ms
While: 39,046 ms
While with Index: 1,080 ms
65,526 records (entire dataset)
Cursor: 9,406 ms
While: 1,494,640 ms
While with Index: 7,106 ms
With an index on my in-memory table, I now see a substantial improvement in the while loop. It now edges out the cursor on the entire data set, and it was much closer between the two on the other groupings. At this point there is really no meaningful difference between one and the other.
June 13, 2008 at 12:55 pm
I was getting worried when I got through 5 or 6 pages of this thread and hadn't seen Jeff Moden jumping in to point out that different types of row-by-row processing are still row-by-row. I knew if I hung in there, he would say what I wanted to say better than I would have. And he did!
I gained an understanding of set-based processing working in an array-oriented language (APL), and am amazed that we programmers in this day and age still put up with scalar one-at-a-time processing. It is sad that when set-based solutions are available, folks still revert to old-fashioned loops!
(No disrespect for Kamran, who deserves credit for launching this great exchange.)
June 13, 2008 at 1:21 pm
Did anyone run that test code with other than the default cursor settings (which are not optimal)? Can someone try LOCAL STATIC FORWARD_ONLY READ_ONLY and see how the performance is?
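A minimal sketch of a cursor declared with those options (the table and column names here are placeholders, not from the test code):
DECLARE curTest CURSOR LOCAL FORWARD_ONLY STATIC READ_ONLY FOR
    SELECT SomeColumn FROM dbo.SomeTable  -- placeholder source
OPEN curTest
-- ... FETCH NEXT / @@FETCH_STATUS loop as usual ...
CLOSE curTest
DEALLOCATE curTest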
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
June 13, 2008 at 1:35 pm
Jim Russell (6/13/2008)
I was getting worried when I got through 5 or 6 pages of this thread and hadn't seen Jeff Moden jumping in to point out that different types of row-by-row processing are still row-by-row. I knew if I hung in there, he would say what I wanted to say better than I would have. And he did!
Heh... glad I didn't let you down, Jim! And, thanks for the compliment! 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
June 13, 2008 at 1:37 pm
Kevin Rathgeber (6/13/2008)
Good point on the SQL 2000 vs SQL 2005. My first attempt was against 2005. I tried again against a SQL 2000 database and got very similar results. The cursor was much faster than the while loop.
Just to give you an idea of the timings (on SQL 2000 Enterprise):
100 records
Cursor: 16 ms
While: 16 ms
500 records
Cursor: 60 ms
While: 123 ms
1,000 records
Cursor: 93 ms
While: 390 ms
5,000 records
Cursor: 486 ms
While: 9,123 ms
10,000 records
Cursor: 986 ms
While: 39,046 ms
65,526 records (entire dataset)
Cursor: 9,406 ms
While: 1,200,000 ms and still counting
If you look at the tenfold increases from 100 -> 1,000 -> 10,000 records, the cursor's processing time scales roughly in line with the amount of data (6x for 100 -> 1,000 and 11x for 1,000 -> 10,000). Even going from 100 records to 65,526 is a 655x increase in rows but only about a 590x increase in processing time. It still looks pretty much proportional.
The while loop, on the other hand, shows more of a geometric pattern, where processing time grows substantially faster than the data (24x for 100 -> 1,000, 100x for 1,000 -> 10,000, 30x+ for 10,000 -> 65,526).
My guess is that this increase is due to the lack of an index on the primary key field of the in-memory table: every WHERE primary_key = @counter lookup has to scan the whole table variable, so the loop cost grows roughly with the square of the row count.
EDIT:
Just finished the while loop over the entire recordset: 1,494,640 ms, which is a 38x increase in processing time for 6.5 times the data.
Kevin,
I lost the bubble a bit here... what is the code you're running and what is the test data? I'd like to do a comparison of my own and want to use apples vs apples, if you know what I mean...
Thanks...
--Jeff Moden
Change is inevitable... Change for the better is not.
June 13, 2008 at 1:38 pm
Kevin Rathgeber (6/13/2008)
OK, so just for fun I tried adding a clustered primary key to my in-memory table. The difference is substantial.
New times for the while loop:
100 records
Cursor: 16 ms
While: 16 ms
While with Index: 16 ms
500 records
Cursor: 60 ms
While: 123 ms
While with Index: 46 ms
1,000 records
Cursor: 93 ms
While: 390 ms
While with Index: 110 ms
5,000 records
Cursor: 486 ms
While: 9,123 ms
While with Index: 533 ms
10,000 records
Cursor: 986 ms
While: 39,046 ms
While with Index: 1,080 ms
65,526 records (entire dataset)
Cursor: 9,406 ms
While: 1,494,640 ms
While with Index: 7,106 ms
With an index on my in-memory table, I now see a substantial improvement in the while loop. It now edges out the cursor on the entire data set, and it was much closer between the two on the other groupings. At this point there is really no meaningful difference between one and the other.
Careful: when you start posting results like this, you might get asked for some specifics on what you were doing, just so we can compare them to set-based results. Seven seconds on 65K rows is pretty good by human standards, but seems supremely "beatable" with a good set-based option.
Not that I'm actually asking, but you might, which then leads to a test-off, etc.....:w00t:
Edit: I see Jeff has already thrown the gauntlet:)
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 13, 2008 at 1:47 pm
Matt Miller (6/13/2008)
Edit: I see Jeff has already thrown the gauntlet:)
Yep... getting ready to launch some pork chops! 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
June 13, 2008 at 1:48 pm
**ducking** I had chicken for lunch....
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 13, 2008 at 1:51 pm
Is a 'test-off' like a 'dance-off'? Should I go get some cardboard?
---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table: What it is and how it replaces a loop
"stewsterl 80804 (10/16/2009): I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
June 13, 2008 at 1:53 pm
jcrawf02 (6/13/2008)
Is a 'test-off' like a 'dance-off'? should I go get some cardboard?
Only if you want to save the big pieces... 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
June 13, 2008 at 2:09 pm
Matt, I totally understand what you are saying. What I am going to do next is average several runs against the data to get a better picture.
Jeff, here is what I am running for both (please excuse the variable names, etc. ... I am just testing and didn't bother coming up with proper names 🙂).
A couple of notes:
For my tests of 100, 1,000, etc., I was selecting TOP x. That probably isn't optimal, so I may look at pulling ranges of data through a WHERE clause instead (a rough sketch follows these notes). I will also remove the timers I have in there and use the client statistics.
As I said before, I know there are better ways of summing; I just chose this as a test scenario because the overhead of that computation is minimal (and I use it as a check that my cursor and while versions come back with the same result).
This is also being run on SQL Server 2000 Enterprise.
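A rough sketch of the range-based selection idea mentioned above, assuming Site_Id values are reasonably contiguous (@start and @end are hypothetical boundaries):
DECLARE @start INT, @end INT
SELECT @start = 1, @end = 1000  -- hypothetical range boundaries for one test batch

SELECT Site_id, Site_addr_st_num, Site_addr_st_pre_dir, Site_addr_st_name,
       Site_addr_st_post_dir, Site_addr_prov_code, site_frontage
FROM SITE
WHERE Site_Id BETWEEN @start AND @end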
Cursor Code
DECLARE @FrontageSum float
DECLARE @sid INT, @SASN varchar(10),
@SASPreD varchar(2), @SASName varchar(30),
@SASPD varchar(2), @SAPC varchar(2),
@SF float
DECLARE @starttime datetime, @endtime datetime
SET @starttime = getdate()
DECLARE curSite CURSOR FAST_FORWARD FOR
SELECT Site_id, Site_addr_st_num, Site_addr_st_pre_dir, Site_addr_st_name,
Site_addr_st_post_dir, Site_addr_prov_code, site_frontage
FROM SITE
OPEN curSite
FETCH NEXT FROM curSite INTO
@sid, @SASN, @SASPreD, @SASName, @SASPD, @SAPC, @SF
SET @FrontageSum = 0
WHILE @@FETCH_STATUS = 0
BEGIN
IF @SF IS NOT NULL
SET @FrontageSum = @FrontageSum + @SF
FETCH NEXT FROM curSite INTO
@sid, @SASN, @SASPreD, @SASName, @SASPD, @SAPC, @SF
END
CLOSE curSite
DEALLOCATE curSite
PRINT @FrontageSum
SET @endtime = getdate()
PRINT datediff(ms, @starttime, @endtime)
While Loop Code
DECLARE @FrontageSum float
DECLARE @sid INT, @SASN varchar(10),
@SASPreD varchar(2), @SASName varchar(30),
@SASPD varchar(2), @SAPC varchar(2),
@SF float
DECLARE @itemcount int
DECLARE @counter int
DECLARE @starttime datetime, @endtime datetime
SET @starttime = getdate()
DECLARE @item_table TABLE
(primary_key INT IDENTITY(1,1) not null PRIMARY KEY CLUSTERED,
SiteID int,
SASN varchar(10),
SASPreD varchar(2),
SASName varchar(30),
SASPD varchar(2),
SAPC varchar(2),
SF float)
INSERT INTO @item_table(SiteID, SASN, SASPreD, SASName, SASPD, SAPC, SF)
SELECT Site_id, Site_addr_st_num, Site_addr_st_pre_dir, Site_addr_st_name,
Site_addr_st_post_dir, Site_addr_prov_code, site_frontage
FROM SITE
set @itemcount = @@rowcount
SET @counter = 1
set @FrontageSum = 0
IF @itemcount > 0
BEGIN
WHILE @counter <= @itemcount
BEGIN
SELECT @sid = SiteID, @SASN = SASN, @SASPreD = SASPreD, @SASName = SASName,
@SASPD = SASPD, @SAPC = SAPC, @SF = SF
FROM @item_table
WHERE primary_key = @counter
IF @SF IS NOT NULL
SET @FrontageSum = @FrontageSum + @SF
SET @counter = @counter + 1
END
END
PRINT @FrontageSum
SET @endtime = getdate()
PRINT datediff(ms, @starttime, @endtime)
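For comparison, the set-based query that both of these loops are ultimately re-implementing is a single aggregate. A minimal sketch against the same SITE table, using the same timing wrapper as above:
DECLARE @FrontageSum float
DECLARE @starttime datetime, @endtime datetime
SET @starttime = getdate()

-- SUM ignores NULL Site_Frontage values; COALESCE only matters if every row is NULL
SELECT @FrontageSum = COALESCE(SUM(Site_Frontage), 0)
FROM SITE

PRINT @FrontageSum
SET @endtime = getdate()
PRINT datediff(ms, @starttime, @endtime)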
I will send you the table layout (hopefully that is enough to get you going)
CREATE TABLE [dbo].[SITE](
[Timestamp] [binary](8) NOT NULL,
[Site_Id] [int] NOT NULL,
[Active_Status] [varchar](8) NOT NULL,
[Site_Addr_St_Num] [varchar](10) NULL,
[Site_Addr_St_Pre_Dir] [varchar](2) NULL,
[Site_Addr_St_Name] [varchar](30) NULL,
[Site_Addr_St_Suff] [varchar](6) NULL,
[Site_Addr_St_Post_Dir] [varchar](2) NULL,
[Site_Addr_City] [varchar](30) NULL,
[Site_Addr_Prov_Code] [varchar](2) NULL,
[Site_Addr_Cntry_Code] [varchar](8) NULL,
[Site_Addr_Postal_Code] [varchar](11) NULL,
[Site_Addr_Subdiv] [varchar](20) NULL,
[Site_Addr_Qtr_Sec] [varchar](2) NULL,
[Site_Addr_Section] [varchar](2) NULL,
[Site_Addr_Township] [varchar](3) NULL,
[Site_Addr_Range] [varchar](2) NULL,
[Site_Addr_Meridian] [varchar](10) NULL,
[Site_Addr_Portion] [varchar](20) NULL,
[Site_Status] [varchar](8) NULL,
[Meter_Route_Id] [int] NULL,
[Site_Area] [float] NOT NULL CONSTRAINT [df_SITE_Site_Area] DEFAULT (0),
[Hospital_Ward_Id] [int] NULL,
[Municipal_Ward_Id] [int] NULL,
[Nghd_Id] [char](3) NULL,
[Nghd_Study_Dsg] [char](4) NULL,
[Dev_Plan_Dsg] [varchar](4) NULL,
[Zoning_Dsg] [varchar](8) NULL,
[Proposed_Dev_Plan_Dsg] [varchar](4) NULL,
[Proposed_Zoning_Dsg] [varchar](9) NULL,
[Plan_Number] [varchar](67) NULL,
[Block_Number] [varchar](19) NULL,
[Lot_Number] [varchar](67) NULL,
[Site_Frontage] [float] NULL,
[Electric_Franchise_Id] [int] NOT NULL,
[Water_Franchise_Id] [int] NOT NULL,
[Electric_Work_Area_Id] [int] NULL,
[Water_Work_Area_Id] [int] NULL,
[Site_Group_Type] [varchar](8) NULL,
[Site_Width_Setback] [float] NULL,
[CSMA_LineWork] [char](1) NULL,
[CSMA_DateTime] [datetime] NULL,
[COA_Reference] [varchar](50) NULL,
[COA_Reference_Number] [int] NULL,
[Municipality_Name] [varchar](30) NULL,
[Site_Addr_Online_Addr] [varchar](255) NULL,
[Site_Addr_Rgstr_Desc] [varchar](255) NULL,
[Create_User_Id] [varchar](255) NOT NULL,
[Create_User_Domain_Name] [varchar](255) NULL,
[Create_Datetime] [datetime] NOT NULL,
[Update_User_Id] [varchar](255) NULL,
[Update_User_Domain_Name] [varchar](255) NULL,
[Update_Datetime] [datetime] NULL
)