Cursor

Question

Cursor

Viewing 15 posts - 16 through 30 (of 49 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004686 More actions · Answer 1

rbarryyoung (8/28/2008)
Jeff Moden (8/28/2008)
It's also part of the reason I wrote that forum ettiquette article on how to provide actual test data so that good folks like yourself don't have to spend so much time creating data and can get right to trying to solve the problem, instead.
Man, I think that I linked to that article like 20 times today. Now I know why you and Jack put it into your signature!

Heh... yeah... I got tired of looking it up. Sure wish Steve could find a way to make it a "sticky" in some of the forums.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

steve dassin SSCrazy Points: 2396 More actions · Answer 2

GSquared (8/28/2008)
The first query breaks them up into sets, based on the SetStart and SetEnd columns and the Date column.
The second query then takes each set and gives it a sequence number within the set.
Is that what you're looking for? Is it clear enough?

Unfortunately this type of problem is not so clear to most users even in its recognition let alone a solution. Sql server doesn't help matters by offering no construct(s) to directly solve the problem. And what is the clearest, simplest way to characterize a problem like this so it becomes easy to communicate about it? This problem, along with relational division, appears to be sqls version of porn. You can't nail down the definition but you know it when you see it:)

These are two pretty good threads about approaching this type of problem. They do contain spoons but forks as well. FWIW I'm dassin, steve.

'searching for the longest subsequence of ones'

http://tinyurl.com/5g2dgq

'Roll Up IP-Location Database'

http://tinyurl.com/6evlbr

http://www.beyondsql.blogspot.com

GSquared SSC Guru Points: 260824 More actions · Answer 3

I finally had time to test this, and I found something I hadn't noticed before. Your test data (first post) has duplicate IDs and all the dates and times are the same. Is that an accident, or does your real data also have duplicate ID values? Also, the test table has no primary key. Is that true of the real table or just an oversight on the test?

In the hope that your data doesn't actually have those problems, I created the following test and code:

CREATE TABLE #mytable

( ID INT IDENTITY(1,1) ,

DateValue DATETIME, Prd nvarchar(250), Statusid INT )

INSERT INTO #mytable

(DateValue,Prd, Statusid)

SELECT 'Oct 17 2007 12:00AM','A','1' UNION ALL

SELECT 'Oct 17 2007 12:00AM','B','1' UNION ALL

SELECT 'Oct 17 2007 12:00AM','B','2' UNION ALL

SELECT 'Oct 17 2007 12:00AM','B','1' UNION ALL

SELECT 'Oct 17 2007 12:00AM','B','2' UNION ALL

SELECT 'Oct 17 2007 12:00AM','A','1' UNION ALL

SELECT 'Oct 17 2007 12:00AM','A','2'

update #MyTable

set DateValue = dateadd(minute, ID, DateValue)

alter table #MyTable

add SetFirst bit not null default(0), SetLast bit not null default(0)

;with CTE (Row, Date, Prd, StatusID, SetFirst, SetLast) as

(select row_number() over (partition by Prd order by DateValue),

DateValue, Prd, StatusID, SetFirst, SetLast

from #MyTable)

update C1

set

SetFirst =

case

when C2.row is not null then 1

when not exists

(select *

from #MyTable mt

where mt.prd = c1.prd

and mt.datevalue < c1.date) then 1

else 0

end,

SetLast =

case

when C3.row is not null then 1

when not exists

(select *

from #MyTable mt

where mt.prd = c1.prd

and mt.datevalue > c1.date) then 1

else 0

end

from CTE C1

left outer join CTE C2 -- The prior row is a 2

on C1.Prd = C2.Prd

and C1.Row = C2.Row+1

and C1.StatusID = 1

and C2.StatusID = 2

left outer join CTE C3 -- Status 2 and next row is a 1

on C1.Prd = C3.Prd

and C1.Row = C3.Row-1

and C1.StatusID = 2

and C3.StatusID = 1

select *

from #MyTable

Because of the Not Exists tests, it has at least that much row-by-row. There's almost certainly a better way to do that particular test, but I don't have time to mess around with that right now. This way works, at least on this data.

Another thing I noticed about your test data is that you had two StatusID 2 in a row for Prd "B". That seems to indicate a close without an open. I got rid of that for my test, but if it really exists in your database, the code will have to correct for that.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

simon phoenix-479217 SSCommitted Points: 1573 More actions · Answer 4

Hi

GSquared

After runnig your both the query, for set and sequece, this is what i got

ID DateValue Prd statusid setfirst setlast setid setsequence

12007-10-17 00:01:00.000A11011

22007-10-17 00:02:00.000A10021

32007-10-17 00:03:00.000A10001

42007-10-17 00:04:00.000A10002

52007-10-17 00:05:00.000B11003

62007-10-17 00:06:00.000B10004

72007-10-17 00:07:00.000B10005

82007-10-17 00:08:00.000B20106

92007-10-17 00:09:00.000B11007

102007-10-17 00:10:00.000B20108

112007-10-17 00:11:00.000A10009

122007-10-17 00:12:00.000A100010

132007-10-17 00:13:00.000A100011

142007-10-17 00:14:00.000A100012

152007-10-17 00:15:00.000A201013

but i was looking for something like this! and setfirst was good escept for the last Prd A

ID DateValue Prd statusid setfirst setlast setid setsequence

12007-10-17 00:01:00.000A11011

22007-10-17 00:02:00.000A10021

32007-10-17 00:03:00.000A10001

42007-10-17 00:04:00.000A10001

52007-10-17 00:05:00.000B11001

62007-10-17 00:06:00.000B10001

72007-10-17 00:07:00.000B10001

82007-10-17 00:08:00.000B20101

92007-10-17 00:09:00.000B11002

102007-10-17 00:10:00.000B20102

112007-10-17 00:11:00.000A11003

122007-10-17 00:12:00.000A10003

132007-10-17 00:13:00.000A10003

142007-10-17 00:14:00.000A10003

152007-10-17 00:15:00.000A20103

So sequence 1 is one block , Sequence two is another block and Sequence 3 is one more block

and after acheiving this i can trap the first and last record of each sequence which is my ultimate goal

I guess i am not messing up your labor day weekend!

I am really excited and thrilled from your unconditional help for new guy in SQL

Thanks

Simon

simon phoenix-479217 SSCommitted Points: 1573 More actions · Answer 5

Hi

Gsquared

i copied the previous data from sql pane and they are not aligining to each other, i think you can

estimate its order. Please let me know if its not readable

Thanks

Simon

steve dassin SSCrazy Points: 2396 More actions · Answer 6

The common problem of dividing data into groups boils down to creating a dense rank. Here we want a dense rank on Statusid but ordered by ID (or DateValue). This problem should be easily solved using the Dense_Rank() function in S2005. Unfortunately all the ranking functions assume the target of the rank is also its order. The order cannot be independent of the target. This dependency was intended to be solved by the full implementation of the OVER statement in sql-99. This is known as the sql window. With the full OVER the target of a rank and its order can be independently specified. The Dense_Rank() function is actually a short hand for a generalized dense rank (using OVER). This rank is based on a 'running sum' over the target column (or more generally an expression involving the column) and can be ordered by another column.The running sum is

equivalent to a dense rank.

To get this type of dense rank you simply have to state the rule that increments the run. This is the same as deciding what rule bounds each rank (group). In this problem each rank is bounded by a value of 1 followed by a 2. The run is incremented whenever there is a Statusid change from a some value other than 1 (2,null) to 1. Here all you have to do is check the prior row value of Statusid. If the prior row (based in the order of ID) is a 2 the current row starts a new group (the running sum is incremented by 1).

Sample data.

CREATE TABLE #mytable

(ID INT PRIMARY KEY,

DateValue DATETIME, Prd VARCHAR(10), Statusid INT )

INSERT INTO #mytable

(ID, DateValue, Prd, Statusid)

SELECT 1, '2007-10-17 00:01:00.000','A',1 UNION ALL

SELECT 2, '2007-10-17 00:02:00.000','A',1 UNION ALL

SELECT 3, '2007-10-17 00:03:00.000','A',1 UNION ALL

SELECT 4, '2007-10-17 00:04:00.000','A',1 UNION ALL

SELECT 5, '2007-10-17 00:05:00.000','B',1 UNION ALL

SELECT 6, '2007-10-17 00:06:00.000','B',1 UNION ALL

SELECT 7, '2007-10-17 00:07:00.000','B',1 UNION ALL

SELECT 8, '2007-10-17 00:08:00.000','B',2 UNION ALL

SELECT 9, '2007-10-17 00:09:00.000','B',1 UNION ALL

SELECT 10,'2007-10-17 00:10:00.000','B',2 UNION ALL

SELECT 11,'2007-10-17 00:11:00.000','A',1 UNION ALL

SELECT 12,'2007-10-17 00:12:00.000','A',1 UNION ALL

SELECT 13,'2007-10-17 00:13:00.000','A',1 UNION ALL

SELECT 14,'2007-10-17 00:14:00.000','A',1 UNION ALL

SELECT 15,'2007-10-17 00:15:00.000','A',2 UNION ALL

An outer join would make it easy to check the prior Statusid.

SELECT A.ID,A.DateValue,A.Prd,A.Statusid,B.ID,B.Statusid

FROM #mytable AS A LEFT JOIN #mytable AS B

ON A.ID=B.ID+1

ORDER BY A.ID

Using the full implemention of OVER (sql-99 window) and the Sum aggregate

you could form the groups, get a dense rank, with one pass thru the join:

SELECT A.ID,A.DateValue,A.Prd,A.Statusid,

Sum(CASE WHEN B.Statusid=2 THEN 1 ELSE 0 END)

OVER(ORDER BY A.ID

ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

AS Setsequence

FROM #mytable AS A LEFT JOIN #mytable AS B

ON A.ID=B.ID+1

Even the join could be eliminated by using a LAG function of OVER (gets the

previous ID value)

SELECT ID,DateValue,Prd,Statusid,

Sum(CASE WHEN LAG(ID,1)=2 THEN 1 ELSE 0 END)

OVER(ORDER BY ID

ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

AS Setsequence

FROM #mytable

Well all this is pie in the sky for sql server because MS for some other worldy

reason(s) decided the full implementation of OVER could wait. The are various ways to simulate OVER such as combinations of ranking functions/counts, updates, subqueries etc. None are as simple and straightforward as OVER. Of course for sql server I prefer to use Rac:)

Exec Rac

@TRANSFORM='_dummy_', -- Plays no part in computations.

@ROWS='ID & DateValue(d) & Prd & Statusid',

@PVTCOL='report mode',

@FROM='#mytable',

@DEFAULTS1='y',@RACHECK='y',@SHELL='n',

-- @ROWRUNS is a running sum that does not involve a @TRANSFORM, hense Dumy:)

@ROWRUNS='Dumy^CASE WHEN PRIOR.Statusid=2 THEN 1 ELSE 0 END^(Dumy)',

@ROWRUNSLABEL='DenseRank',

-- Add 1 to the rank to start at 1 (instead of 0).

@SELECT='SELECT _ROWS_,(1*DenseRank)+1 AS Setsequence

FROM rac

ORDER BY rd'

ID DateValue Prd Statusid Setsequence

------ ------------------- ---- -------- -----------

1 Oct 17 2007 12:01AM A 1 1

2 Oct 17 2007 12:02AM A 1 1

3 Oct 17 2007 12:03AM A 1 1

4 Oct 17 2007 12:04AM A 1 1

5 Oct 17 2007 12:05AM B 1 1

6 Oct 17 2007 12:06AM B 1 1

7 Oct 17 2007 12:07AM B 1 1

8 Oct 17 2007 12:08AM B 2 1

9 Oct 17 2007 12:09AM B 1 2

10 Oct 17 2007 12:10AM B 2 2

11 Oct 17 2007 12:11AM A 1 3

12 Oct 17 2007 12:12AM A 1 3

13 Oct 17 2007 12:13AM A 1 3

14 Oct 17 2007 12:14AM A 1 3

15 Oct 17 2007 12:15AM A 2 3

You could flag 1st and last rows of each group or whatever. But I think I'll stop here:)

http://www.rac4sql.net

http://www.beyondsql.blogspot.com

simon phoenix-479217 SSCommitted Points: 1573 More actions · Answer 7

Hi

Rog

Thanks for your input on this!

Is there any other way to get this result without using RAC? thats the exact result that i was looking

for . I am using sql2k5 , that is also a big problem to use RAC

Thanks

simon

Jeff Moden SSC Guru Points: 1004686 More actions · Answer 8

Ummm... I thought Rog gave a solution that didn't include RAC... take a look at his post again. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

simon phoenix-479217 SSCommitted Points: 1573 More actions · Answer 9

Hi

Rog, Jeff

I have problem running this statement

SELECT A.ID,A.DateValue,A.Prd,A.Statusid,

Sum(CASE WHEN B.Statusid=2 THEN 1 ELSE 0 END)

OVER(ORDER BY A.ID

ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)-- this line throws error

AS Setsequence

FROM #mytable AS A LEFT JOIN #mytable AS B

ON A.ID=B.ID+1

Msg 102, Level 15, State 1, Line 4

Incorrect syntax near 'ROWS'.

Thanks

Simon

steve dassin SSCrazy Points: 2396 More actions · Answer 10

simon phoenix (8/30/2008)
Hi
Rog, Jeff
I have problem running this statement
SELECT A.ID,A.DateValue,A.Prd,A.Statusid,
Sum(CASE WHEN B.Statusid=2 THEN 1 ELSE 0 END)
OVER(ORDER BY A.ID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)-- this line throws error
AS Setsequence
FROM #mytable AS A LEFT JOIN #mytable AS B
ON A.ID=B.ID+1
Msg 102, Level 15, State 1, Line 4
Incorrect syntax near 'ROWS'.
Thanks
Simon

Sorry for any confusion. The above form of OVER is NOT supported in S2k5. But RAC does run in S2k5 🙂 Anyway here is a t-sql solution that will simulate the running sum of OVER. Actually it's not as bad as I perhaps made it out to be but its no substitute for the real thing. I'm using the same sample data used in RAC.

WITH C AS

(

SELECT A.ID,A.DateValue,A.Prd,A.Statusid,B.Statusid AS PriorStatusid

FROM #mytable AS A LEFT JOIN #mytable AS B

ON A.ID=B.ID+1

)

select D.ID,D.DateValue,D.Prd,D.Statusid,

Sum(CASE WHEN E.PriorStatusid=2 THEN 1 ELSE 0 END)+1 AS Setsequence

FROM C AS D JOIN C AS E

ON E.ID BETWEEN 0 AND D.ID

GROUP BY D.ID,D.DateValue,D.Prd,D.Statusid

ORDER BY D.ID

Note there are 2 joins, one to access the prior row Statusid and one to form the rows for grouping. It's these joins that OVER (and RAC) eliminate. For large data sets performance could be a problem here. But that's another issue 🙂

http://www.rac4sql.net

http://www.beyondsql.blogspot.com

Jeff Moden SSC Guru Points: 1004686 More actions · Answer 11

Sorry... My bad... Roger's too. That syntax, starting at ROWS, isn't part of T-SQL in SQL Server 2005.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

simon phoenix-479217 SSCommitted Points: 1573 More actions · Answer 12

Hi

Rog

You are the SQl Guru, Thanks a lot , i really appreciat it, it was just bugging my head since last week. Now only job is to trap the begining and end of those distinct sequence and i think i can do it.

If i ran into trouble , i will post it again, please look into this thread for few more days, i am really close in getting my task done with the help of great guys like you, Gsquared and jeff. i can't explain

how greatful i am to find good people like you guys exist

God bless you Rog, Jeff and Gsquared

Thanks

Simon

----------Happy Labor Day----------

Jeff Moden SSC Guru Points: 1004686 More actions · Answer 13

Heh... ok... my turn. 🙂 And sorry it took me so long to get to this problem...

First, in order to be fast and short, this fun problem does actually require some procedureal code, but we won't be using a Cursor or a While loop. And, we don't need extra columns nor do we need to buy a special product to get it done. AND, we don't need to self join to the same table 4 times to do it either. All we need is what appears to be the correct logical clustered index which you probably already have anyway.

First, the test data... do read the comments please... this little trick requires a clustered index on the ID column...

drop table #yourtable

go

--===== Setup the demonstration table...

-- I'm assuming that the ID column is the CLUSTERED PRIMARY KEY

-- and that it's named PK_YourTable_ID. Please adjust as necessary.

-- I'm also assuming that it has a "SetSequence" column.

CREATE TABLE #YourTable

(

ID INT IDENTITY(1,1),

DateValue DATETIME,

Prd CHAR(1),

StatusID TINYINT,

SetSequence INT,

CONSTRAINT PK_YourTable_ID PRIMARY KEY CLUSTERED (ID)

)

INSERT INTO #YourTable

(DateValue,Prd,StatusID)

SELECT '2007-10-17 00:01:00.000','A','1' UNION ALL

SELECT '2007-10-17 00:02:00.000','A','1' UNION ALL

SELECT '2007-10-17 00:03:00.000','A','1' UNION ALL

SELECT '2007-10-17 00:04:00.000','A','1' UNION ALL

SELECT '2007-10-17 00:05:00.000','B','1' UNION ALL

SELECT '2007-10-17 00:06:00.000','B','1' UNION ALL

SELECT '2007-10-17 00:07:00.000','B','1' UNION ALL

SELECT '2007-10-17 00:08:00.000','B','2' UNION ALL

SELECT '2007-10-17 00:09:00.000','B','1' UNION ALL

SELECT '2007-10-17 00:10:00.000','B','2' UNION ALL

SELECT '2007-10-17 00:11:00.000','A','1' UNION ALL

SELECT '2007-10-17 00:12:00.000','A','1' UNION ALL

SELECT '2007-10-17 00:13:00.000','A','1' UNION ALL

SELECT '2007-10-17 00:14:00.000','A','1' UNION ALL

SELECT '2007-10-17 00:15:00.000','A','2'

GO

--===== Show the content of the test table

SELECT * FROM #YourTable

The key to this problem is very much like "trailing edge digital logic". We don't actually care where the 1's and 2's are... we only care when the StatusID drops in value. To translate into pseudo code, when the StatusID of the previous row is LESS THAN OR EQUAL TO the current row, the SetSequence will remain the same, otherwise, we need to add 1 to the SetSequence.

Now, the clustered index makes it easy using the very same technique in the following article...

http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

... here's the code that does it... at a rate of a million rows every 7 seconds... 😛

--===== Declare some necessary variables

DECLARE @Sequence INT, --Remembers the current sequence

@PrevStatusID TINYINT --Remembers the StatusID from the previous row

SELECT @Sequence = 1,

@PrevStatusID = 1

--===== Do a procedural update using SQL Server's proprietary update

UPDATE #YourTable

SET @Sequence = SetSequence = CASE WHEN @PrevStatusID <= StatusID THEN @Sequence ELSE @Sequence+1 END,

@PrevStatusID = StatusID

FROM #YourTable WITH(INDEX(PK_YourTable_ID))

--===== Show the results

SELECT * FROM #YourTable

Questions? 🙂

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004686 More actions · Answer 14

simon phoenix (8/30/2008)
Hi
Rog
You are the SQl Guru, Thanks a lot , i really appreciat it, it was just bugging my head since last week. Now only job is to trap the begining and end of those distinct sequence and i think i can do it.
If i ran into trouble , i will post it again, please look into this thread for few more days, i am really close in getting my task done with the help of great guys like you, Gsquared and jeff. i can't explain
how greatful i am to find good people like you guys exist
God bless you Rog, Jeff and Gsquared
Thanks
Simon
----------Happy Labor Day----------

Simon, "you da man!" Thanks for the awesome thought. Happy Labor Day!

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004686 More actions · Answer 15

simon phoenix (8/30/2008)
Now only job is to trap the begining and end of those distinct sequence and i think i can do it.

Missed that... sorry. Easy to do now that the SetSequence is done...

--===== Show first and last date of each SetSequence

SELECT SetSequence, MIN(DateValue) AS StartDateTime, MAX(DateValue) AS EndDateTime

FROM #YourTable

GROUP BY SetSequence

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)