April 28, 2009 at 4:31 am
min.li (4/28/2009)
Hello,I recently had to use curosr to do a function in my work. Although I knew I should try to avoid cursor, but could not find a way. I wonder if you can figure out doing it without cursor.
I have a table like this:
Name RowNumber
RL 1
RL 2
..
RL 10
SL 11
SL
.. 15
RL 16
..
RL 30
SL 31
.. 41
The RowNumber is granteed to be in sequence and there are actually more than two values for the 'Name' column, only two listed here for simplicity.
The result of the function I need is a return table varaible like
Name startRowNumber endRowNumber
RL 1 10
SL 11 15
RL 16 30
SL 31 41
I am really curious if there is a way to achieve it without using cursor.
Thanks.
Your case is hard to folllow, can you at least make a test script with some data in SQL form so we use that and don't each have to make our own? This will greatly increase the response to your particular issue!
April 28, 2009 at 5:14 am
Hi,
Firstly, test table can be prepared as:
IF OBJECT_ID('dbo.TestData') IS NOT NULL
DROP TABLE dbo.TestData
CREATE TABLE dbo.TestData
(Name varchar(4) not null,
RowNumber int not null
)
DECLARE @Counter int
SET @Counter = 1
WHILE @Counter <= 100
BEGIN
IF ((@Counter/10)%2=0)
INSERT INTO dbo.TestData(Name, RowNumber) VALUES ('RL', @Counter)
ELSE
INSERT INTO dbo.TestData(Name, RowNumber) VALUES ('SL', @Counter)
SET @Counter = @Counter + 1
END
Then, the cursor function is:
CREATE FUNCTION [dbo].[TestNumberRange]()
RETURNS @DataDistribute TABLE (
Name varchar(4)not null,
StartNum intnot null,
EndNum intnot null) AS
BEGIN
declare @name varchar(4),
@preName varchar(4),
@rowNumber int,
@startNumber int,
@endNumber int
declare data_cursor cursor for
select Name, RowNumber from TestData order by RowNumber
set @preName= 'None'
set @endNumber = 0
open data_cursor
fetch next from data_cursor into @name, @rowNumber
while (@@fetch_status = 0)
begin
if (@name != @preName)
begin
if @endNumber != 0
insert into @DataDistribute
values (@preName, @startNumber, @endNumber)
set @startNumber = @rowNumber
set @preName = @name
end
set @endNumber =@rowNumber
fetch next from data_cursorinto @name, @rowNumber
end
insert into @DataDistribute values (@name, @startNumber, @endNumber)
close data_cursor
deallocate data_cursor
RETURN
END
Lastly, function can be tested as:
select * from dbo.TestNumberRange()
Can it be done without cursor?
April 28, 2009 at 5:35 am
min.li (4/28/2009)
Hello,I recently had to use curosr to do a function in my work. Although I knew I should try to avoid cursor, but could not find a way. I wonder if you can figure out doing it without cursor.
I have a table like this:
Name RowNumber
RL 1
RL 2
..
RL 10
SL 11
SL
.. 15
RL 16
..
RL 30
SL 31
.. 41
The RowNumber is granteed to be in sequence and there are actually more than two values for the 'Name' column, only two listed here for simplicity.
The result of the function I need is a return table varaible like
Name startRowNumber endRowNumber
RL 1 10
SL 11 15
RL 16 30
SL 31 41
I am really curious if there is a way to achieve it without using cursor.
Thanks.
try this - i created a simple table to test it on
create table temp1 (col1 varchar(2), rownum int)
insert into temp1 values ('RL', 1)
insert into temp1 values ('RL', 2)
insert into temp1 values ('RL', 3)
insert into temp1 values ('RL', 4)
insert into temp1 values ('SL', 5)
insert into temp1 values ('SL', 6)
insert into temp1 values ('SL', 7)
insert into temp1 values ('SL', 8)
insert into temp1 values ('RL', 9)
insert into temp1 values ('RL', 10)
insert into temp1 values ('RL', 11)
insert into temp1 values ('RL', 12)
insert into temp1 values ('SL', 13)
insert into temp1 values ('SL', 14)
insert into temp1 values ('SL', 15)
insert into temp1 values ('SL', 16)
select D1.col1, d2.start, d1.[end]
from (
select t1.col1, t1.rownum as [end], row_number () over (order by t1.col1, t1.rownum) as rn
from temp1 t1
left outer join temp1 t2
on t1.rownum + 1 = t2.rownum
or t2.rownum is null
where t1.col1 t2.col1 or t2.col1 is null) D1
inner join
(select t1.col1, t1.rownum as [start], row_number () over (order by t1.col1, t1.rownum) as rn
from temp1 t1
left outer join temp1 t2
on t1.rownum - 1 = t2.rownum
where t1.col1 t2.col1 or t2.col1 is null) D2
on D1.col1 = D2.col1
and d1.rn = d2.rn
order by d2.start
-- EDITED - just tested it with a 2000 row sample and noticed an error. I fixed the SQL in this posted
April 28, 2009 at 5:36 am
Andrew L. Smith (4/27/2009)
Barry:I agree with this for the most part, but have an exception that I would like to see if you have an answer for. For example, I need to receive a number of records from a Message Queue and perform (or not perform) a stored procedure, passing information from the record read to the stored procedure. I am not updating anything and I do not need the combined results of the cursor. Can you suggest a way to accomplish this without a cursor? (I am actually passing the data to a stored procedure that e-mails me an alert that certain events have occured on the database server.)
Regards,
Andrew Smith, SQL Server DBA
Heh. Message Broker is actually a very special and most difficult case because of the incompleteness of the final implementation it is both very set-orianted but also very antagonistic towrds sat-based usage. As such I will propably not be getting to it until that later installements (this also affects one of the most asked questions in the seires, but not so obviously).
However, the answer to this question is easy, so I will address it now: 1) service your queue only with an activation stored procedure that only receives 1 row at a time. 2) for greater performance, set your activation count higher.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
April 28, 2009 at 5:54 am
Thank you min.li, I ran the test code and that works and is a good starting point. I will take a deeper look at the actual code later today, problably in the evening as I got to work atm. I feel confident your cursor can be fully eliminated.
I will be working on SQL Server 2005, If you use a older verson let usknow that as well.
April 28, 2009 at 5:56 am
peter (4/28/2009)
I will be working on SQL Server 2005, If you use a older verson let usknow that as well.
Good point... the solution I posted is 2005/2008 specific. Probably could be altered to suit SQL Server 2000 though
April 28, 2009 at 5:58 am
Hi,
Thanks for all reply. I am working on SQL server 2005. However, Samuel's solution does not work on my SQL server, which returns following error message:
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "d1.rn" could not be bound.
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "d2.rn" could not be bound.
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "d2.start" could not be bound.
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "d1.end" could not be bound.
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "d2.start" could not be bound.
April 28, 2009 at 6:05 am
min.li (4/28/2009)
Hi,Thanks for all reply. I am working on SQL server 2005. However, Samuel's solution does not work on my SQL server, which returns following error message:
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "d1.rn" could not be bound.
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "d2.rn" could not be bound.
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "d2.start" could not be bound.
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "d1.end" could not be bound.
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "d2.start" could not be bound.
weird... I just copied the post directly from here into a new SQL Server Query window and it ran without any errors
Can you just make sure you copied it properly and didn't change anything
UPDATE
Just realised what the problem is --- you're using a case sensitive server and I (in my lazyness) wrote case INsensitive code
DROP TABLE TEMP1
CREATE TABLE TEMP1 (COL1 VARCHAR(2), ROWNUM INT)
INSERT INTO TEMP1 VALUES ('RL', 1)
INSERT INTO TEMP1 VALUES ('RL', 2)
INSERT INTO TEMP1 VALUES ('RL', 3)
INSERT INTO TEMP1 VALUES ('RL', 4)
INSERT INTO TEMP1 VALUES ('SL', 5)
INSERT INTO TEMP1 VALUES ('SL', 6)
INSERT INTO TEMP1 VALUES ('SL', 7)
INSERT INTO TEMP1 VALUES ('SL', 8)
INSERT INTO TEMP1 VALUES ('RL', 9)
INSERT INTO TEMP1 VALUES ('RL', 10)
INSERT INTO TEMP1 VALUES ('RL', 11)
INSERT INTO TEMP1 VALUES ('RL', 12)
INSERT INTO TEMP1 VALUES ('SL', 13)
INSERT INTO TEMP1 VALUES ('SL', 14)
INSERT INTO TEMP1 VALUES ('SL', 15)
INSERT INTO TEMP1 VALUES ('SL', 16)
SELECT D1.COL1, D2.START, D1.[END]
FROM (
SELECT T1.COL1, T1.ROWNUM AS [END], ROW_NUMBER () OVER (ORDER BY T1.COL1, T1.ROWNUM) AS RN
FROM TEMP1 T1
LEFT OUTER JOIN TEMP1 T2
ON T1.ROWNUM + 1 = T2.ROWNUM
OR T2.ROWNUM IS NULL
WHERE T1.COL1 T2.COL1 OR T2.COL1 IS NULL) D1
INNER JOIN
(SELECT T1.COL1, T1.ROWNUM AS [START], ROW_NUMBER () OVER (ORDER BY T1.COL1, T1.ROWNUM) AS RN
FROM TEMP1 T1
LEFT OUTER JOIN TEMP1 T2
ON T1.ROWNUM - 1 = T2.ROWNUM
WHERE T1.COL1 T2.COL1 OR T2.COL1 IS NULL) D2
ON D1.COL1 = D2.COL1
AND D1.RN = D2.RN
ORDER BY D2.START
April 28, 2009 at 6:56 am
Thanks Samuel, It works now. Silly me, I should figure out myself.
April 28, 2009 at 7:05 am
Back min.li ( I couldn't resist the challenge)
First I modified your test case in one important way!
alter table dbo.TestData add constraint pk_TestData primary key clustered ( RowNumber );
Adding a clustered index on the rownumber seemed fair as you said the order is enforced and implied it is not broken up by anything. The effect of the index kicks in as the source table increases in the number of records it contains.
Here is what I came up with....I would not necessarily consider this full set based code, but I did eliminate the cursor and it scales far better then the original solution that used a cursor.
create function dbo.TestNumberRange_nocursor1() returns table
as
return
(
with
lowerEdgesQ( SegmentID, Name, RowNumber ) as
(
select
row_number() over ( order by d.RowNumber asc )
, d.Name
, d.RowNumber
from
dbo.TestData as d
where
-- detects lower edge of each name
not exists( select 1 from dbo.TestData as i where i.Name = d.Name and i.RowNumber = d.RowNumber - 1 )
)
, upperEdgesQ( SegmentID, Name, RowNumber ) as
(
select
row_number() over ( order by d.RowNumber asc )
, d.Name
, d.RowNumber
from
dbo.TestData as d
where
-- detects upper edge of each name
not exists( select 1 from dbo.TestData as i where i.Name = d.Name and i.RowNumber = d.RowNumber + 1 )
)
select
lowerEdgesQ.Name
, lowerEdgesQ.RowNumber as StartNum
, upperEdgesQ.RowNumber as EndNum
from
lowerEdgesQ
inner join upperEdgesQ on upperEdgesQ.SegmentID = lowerEdgesQ.SegmentID
)
;
For testing I used the timing code provided in the article and I included the solution of Samuel Vella as well.
First I ran against the 100 records you put initially into the source table, results are:
/* cursor original by min.li */
CpuMs LogRds Elapsed
----------- -------------------- -----------
0 425 3
/* cursor free by peter */
CpuMs LogRds Elapsed
----------- -------------------- -----------
0 404 1
/* cursor free by Samuel Vella */
CpuMs LogRds Elapsed
----------- -------------------- -----------
0 404 2
Then I ran another test with 10000 records to showcase scaling, results are:
/* cursor original by min.li */
CpuMs LogRds Elapsed
----------- -------------------- -----------
250 41083 249
/* cursor free by peter */
CpuMs LogRds Elapsed
----------- -------------------- -----------
31 104 41
/* cursor free by Samuel Vella */
CpuMs LogRds Elapsed
----------- -------------------- -----------
31 292 128
As you can see, both the cursor free versions scale much better. The one I supplied while not pure set based yet, is not hard to understand and the function is inline. The later meaning that the optimizer can make it part of the main query and cut out any overhead it detects before execution!
April 28, 2009 at 7:23 am
Thanks peter, this works perfectly.
April 28, 2009 at 7:39 am
I did a quick optization of your own code which is interesting to know in case the time to alter existing code is limited!
I changed the declaration of the cursor as:
declare data_cursor cursor local fast_forward read_only for
select Name, RowNumber from TestData order by RowNumber
This restricts the cursor a little but suits your use just fine and is less resource hungry. This directly shows up in the following timings:
For 100 records in the source table:
/* cursor original by min.li */
CpuMs LogRds Elapsed
----------- -------------------- -----------
0 425 3
0 122 1 /* cursor local fast_forward read_only for */
/* cursor free by peter */
CpuMs LogRds Elapsed
----------- -------------------- -----------
0 404 1
/* cursor free by Samuel Vella */
CpuMs LogRds Elapsed
----------- -------------------- -----------
0 404 2
For 10000 records in the source table:
/* cursor original by min.li */
CpuMs LogRds Elapsed
----------- -------------------- -----------
250 41083 249
125 11080 136 /* cursor local fast_forward read_only for */
/* cursor free by peter */
CpuMs LogRds Elapsed
----------- -------------------- -----------
31 104 41
/* cursor free by Samuel Vella */
CpuMs LogRds Elapsed
----------- -------------------- -----------
31 292 128
You see that your original with a modified cursor declaration is best performing on small sets in this case. But don't get ahead of yourself and put cursors everywhere 😉 as there is a BIG catch. They don't scale and they cannot be used in inline table valued functions and thus cannot be part of the optimization process when part of a larger query. This means that the optimized cursor version still ends up loosing in cases where we do not directly display the function results.
April 28, 2009 at 7:42 am
peter (4/28/2009)
Here is what I came up with....I would not neccecarily consider this full set based code, but I did eliminate the cursor and it scales far better then the original solution that used a cursor.
Interesting results Peter 🙂
What intrigued me was that both our solutions are effectively the same however yours was significantly more efficient than my own
There was one key difference between our solutions - we took different approaches to the problem of when to detect the change from one row name to the next.
I had done mine through a self join and subtracting (or adding) 1 to the row number to define the join and when there was no join then that was the row to pick.
you had used a subquery instead of a self join. I always work without subqueries whenever possible on the general assumption that they are less efficient
However, converting your function into a query and then renaming the table and column names to use the test table I built (no indexes on it) and then comparing execution times on my own unindexed table confirmed that in this case the sub query is quicker.
Completely counter to how I would normally tackle the problem.
I finally learnt something today 😀
April 28, 2009 at 8:01 am
shashank (4/28/2009)
I've recently used CURSOR in one of my application (unwishingly though). We were working on a survey application and we collected the data of our web forms in our custom entity objects and finally xml serialized them on a button's click and passed the entire xml to a stored procedure which traversed and read the xml and inserted the data into the actual database. Now the structure of the xml was something like this<Questions>
<Question type="1">
<text>how are you</text>
<cretionDate>2008-12-3</cretionDate>
<QuestionDetails>
some more nodes........
</QuestionDetails>
</Question>
<Question type="2">
<textvFeeling Good</text>
<cretionDate>2008-12-3</cretionDate>
<QuestionDetails>
some more nodes........
</QuestionDetails>
</Question>
</Questions>
How could i prevent the use of cursor???
Obviously, nobody can give a complete answer without more specific requirements, as well as sample data.
But, if you use the XML data type you can query the XML using the nodes(), and value() functions.
April 28, 2009 at 8:05 am
Thanks Samuel and Peter.
The real environment where this function to be used, the 'TestData' table is not really large. Or to be exact, the TestData table is large, but this function only work on a subset of the TestData table each time in sequence, the size of the subset is defined by a pass in parameter 'PageSize', which will not be over 1000 maximum. In this context, I think the cursor solution is still suitable, as scaling is not such important factor to consider. And, I have to say that the cursor solution is a lot easier to read and understand for me as the solutions that you two have offered are quite complex to me at the moment, I will not be able to work it out by myself. So I still think cursor is appropriate in such circumstances.
Viewing 15 posts - 76 through 90 (of 316 total)
You must be logged in to reply to this topic. Login to reply