Select And Count Consecutive Numbers

Question

Select And Count Consecutive Numbers

Viewing 15 posts - 31 through 45 (of 45 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004405 More actions · Answer 1

Probably, Serqiy.... just I think they did it at a machine language level which would be a lot faster the at the "T-SQL Level".

Heh... I agree about the OO hints... kinda like that one fellow says as part of his sig... something like "DTS and Cursors are for those who don't understand T-SQL" Still, I haven't been able to beat a properly written CTE, yet... was hoping it was because my home machine is a bit slower than servers which is why I keep asking folks to give us timing on a million row example.

I finally got a copy of the Developer's Edition of 2K5... will install sometime this weekend... right after I backup my machine

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004405 More actions · Answer 2

Jeff, I got to understand your code, but found a bug in it.

Using your short test table, if you change the following:

Thanks Antonio... was pretty sure that was covered even without the extra variable but I'll double check...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004405 More actions · Answer 3

Antonio,

Heh... had to do a double take... but no... it's working correctly because of the DISTINCT transfer and the GROUP BY... here's the results with the original data and your data modification...

Results BEFORE Data Change

Results AFTER Data Change

EmployeeID

Freight

Items

EmployeeID

Freight

Items

1

1.35

2

1

1.35

2

1

4.98

2

1

4.98

2

1

8

2

1

8

2

1

9.01

4

1

9.01

4

2

4.98

3

2

4.98

3

2

8

3

2

8

3

2

9.01

4

2

9.01

4

3

9.07

2

Because the 3,9.07 was changed to 3,9.08 and there already was a 3,9.08, then according to the rules the OP posted, the two 3,9.08's are disqualified and the listing for EmployeeID #3 simply goes away...

Way to keep me on my toes, though

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Antonio Macedo SSC Enthusiast Points: 117 More actions · Answer 4

Jeff, thanks for your patience. I am learning a LOT on this thread.

Here is a variation on your code that will run faster than the original:

--===== If the table that holds the test data already exists drop it

IF OBJECT_ID('TempDB..#MyHead') IS NOT NULL

DROP TABLE #MyHead

--===== Recreate the test table and populate it with test data

CREATE TABLE #MyHead (EmployeeID INT,Freight DECIMAL(6,2))

INSERT INTO #MyHead (EmployeeID,Freight)

SELECT 1,0.21 UNION ALL

SELECT 1,0.45 UNION ALL

SELECT 1,0.93 UNION ALL

SELECT 1,1.27 UNION ALL

SELECT 1,1.35 UNION ALL --should be a "hit" with count of "2"

SELECT 1,1.36 UNION ALL

SELECT 1,1.51 UNION ALL

SELECT 1,1.66 UNION ALL

SELECT 1,2.50 UNION ALL

SELECT 1,3.94 UNION ALL

SELECT 1,4.03 UNION ALL

SELECT 1,4.27 UNION ALL

SELECT 1,4.41 UNION ALL

SELECT 1,4.98 UNION ALL --should be a "hit" with count of "2"

SELECT 1,4.99 UNION ALL

SELECT 1,4.99 UNION ALL --dupe (not included according to David's specs)

SELECT 1,7.46 UNION ALL

SELECT 1,8.00 UNION ALL --should be a "hit" with count of "2"

SELECT 1,8.01 UNION ALL

SELECT 1,9.01 UNION ALL --should be a "hit" with count of "4"

SELECT 1,9.02 UNION ALL

SELECT 1,9.03 UNION ALL

SELECT 1,9.04 UNION ALL

SELECT 2,4.27 UNION ALL --different employee id

SELECT 2,4.41 UNION ALL

SELECT 2,4.98 UNION ALL --should be a "hit" with count of "3"

SELECT 2,4.99 UNION ALL

SELECT 2,4.99 UNION ALL --dupe (not included according to David's specs)

SELECT 2,5.00 UNION ALL

SELECT 2,7.46 UNION ALL

SELECT 2,8.00 UNION ALL --should be a "hit" with count of "3"

SELECT 2,8.01 UNION ALL

SELECT 2,8.02 UNION ALL

SELECT 2,9.01 UNION ALL --should be a "hit" with count of "4"

SELECT 2,9.02 UNION ALL

SELECT 2,9.02 UNION ALL --dupe (not included according to David's specs)

SELECT 2,9.03 UNION ALL

SELECT 2,9.04 UNION ALL

SELECT 2,9.07 UNION ALL

SELECT 3,9.07 UNION ALL --just to show one EmployeeID won't bleed into the next

SELECT 3,9.08

--===== If the table that holds the isolated data already exists drop it

IF OBJECT_ID('TempDB..#Work') IS NOT NULL

DROP TABLE #Work

--===== Create and populate the work table with a clustered primary key and a special

CREATE TABLE #Work

(

EmployeeID

INT NOT NULL,

Freight

DECIMAL(6,2) NOT NULL,

MyGroup

INT, --This is the "special" column

PRIMARY KEY CLUSTERED (EmployeeID,Freight)

)

INSERT INTO #Work (EmployeeID,Freight)

SELECT DISTINCT

t1

.EmployeeID,t1.Freight

FROM #MyHead t1,

#MyHead t2

WHERE (t1.Freight+.01 = t2.Freight)

AND t1.EmployeeID = t2.EmployeeID

--***************************************************

--*** Here I removed the second part of the union ***

--***************************************************

--=======================================================================================

-- This bit of "trick" code relies on the order of the data which is

-- forced by the Clustered Primary Key

--===== Declare a couple of local variables to support the "trick" code

DECLARE

@MyGroup INT

SET @MyGroup = 0

DECLARE

@LastFreight DECIMAL(6,2)

SET @LastFreight = -1

--===== Here's the "trick"... this "groups" rows that are .01 apart in Freight using the

-- proprietary UPDATE clause of SQL Server... magic... and blows cursor speed away!!!

UPDATE w

SET @MyGroup = MyGroup = CASE

WHEN Freight = @LastFreight + .01

THEN @MyGroup

ELSE @MyGroup+1

END,

@LastFreight

= Freight

FROM #Work w --Order of processing is by the Clustered Primary key

--=======================================================================================

-- So, with that having been done, it's now easy to get the results you want.

--=======================================================================================

--*********************************************************************************

--*** Here I add 1 to the count to compensate for the "missing" trailing record ***

--*********************************************************************************

SELECT

EmployeeID, MIN(Freight) AS Freight,COUNT(*) + 1 AS Items

FROM #Work

GROUP BY EmployeeID,MyGroup

Antonio Macedo

Jeff Moden SSC Guru Points: 1004405 More actions · Answer 5

Very cool... nice job, Antonio... takes only 28 seconds on the million row test...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 6

Here is a solution that only takes about 12 seconds to do the calculation and present the data...

-- Create sample data

SELECT

TOP 1000000

CAST(RAND(CAST(NEWID() AS VARBINARY)) * 1000 + 1 AS INT) AS EmployeeID,

CAST(RAND(CAST(NEWID() AS VARBINARY)) * 10 AS DECIMAL(6,2)) AS Freight

INTO

#MyHead

FROM

Master.dbo.SysColumns as sc1 WITH (NOLOCK),

Master

.dbo.SysColumns as sc2 WITH (NOLOCK)

-- Create a clustered index over Freight

CREATE

CLUSTERED INDEX IX_Freight ON #MyHead (EmployeeID, Freight)

-- Add an auxiliary column

ALTER

TABLE #MyHead

ADD

Grp INT

-- Do the grouping thingy

DECLARE

@Freight DECIMAL(6, 2),

@Grp

INT,

@Now

DATETIME

-- Start the clock

SELECT

@Freight = MIN(Freight) - 1,

@Grp

= 0,

@Now

= CURRENT_TIMESTAMP

FROM

#MyHead

UPDATE

#MyHead

SET

@Grp = Grp = CASE WHEN Freight > @Freight + 0.01 THEN @Grp + 1 ELSE @Grp END,

@Freight

= Freight

-- Show the result

SELECT

EmployeeID,

MIN(Freight) AS Freight,

COUNT(DISTINCT Freight) AS [Count]

FROM

#MyHead

GROUP

BY Grp,

EmployeeID

HAVING

COUNT(DISTINCT Freight) > 1

ORDER

BY EmployeeID,

MIN(Freight)

SELECT

DATEDIFF(MILLISECOND, @Now, CURRENT_TIMESTAMP) AS Milliseconds

-- Clean up

drop

table #myhead

N 56°04'39.16"
E 12°55'05.25"

Jeff Moden SSC Guru Points: 1004405 More actions · Answer 7

You know what they say... "Tuning makes perfect" Keeps getting better and better...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

This solution is really cute, and timely too - there are several current threads looking to a sequential cumulative component. But this statement type doesn't unfortunately work in a SELECT ("A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations."). Does it work with 2k5?

Cheers

ChrisM

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 9

Here is a set-based method that will work on both SQL 2000 and SQL 2005.

-- Prepare sample data

DECLARE

@Sample TABLE (EmployeeID INT, Freight SMALLMONEY)

INSERT

@Sample

SELECT

1, 4.99 UNION ALL

SELECT

1, 0.21 UNION ALL

SELECT

1, 4.03 UNION ALL

SELECT

1, 1.36 UNION ALL

SELECT

1, 0.93 UNION ALL

SELECT

1, 4.98 UNION ALL

SELECT

1, 1.35 UNION ALL

SELECT

1, 0.45 UNION ALL

SELECT

1, 1.51 UNION ALL

SELECT

1, 4.99 UNION ALL

SELECT

1, 1.66 UNION ALL

SELECT

1, 2.50 UNION ALL

SELECT

1, 4.27 UNION ALL

SELECT

1, 3.94 UNION ALL

SELECT

1, 4.41 UNION ALL

SELECT

1, 1.27 UNION ALL

SELECT

1, 7.46

-- Show the expected output

SELECT

d1.EmployeeID,

d1

.Freight,

CAST(1 + 100 * MIN(d2.Freight) - 100 * d1.Freight AS INT) AS [Count]

FROM

(

SELECT EmployeeID,

Freight

FROM @Sample AS t1

WHERE NOT EXISTS (

SELECT *

FROM @Sample AS t2

WHERE t1.Freight - t2.Freight = 0.01

AND t1.EmployeeID = t2.EmployeeID

)

) AS d1

INNER

JOIN (

SELECT EmployeeID,

Freight

FROM @Sample AS t1

WHERE NOT EXISTS (

SELECT *

FROM @Sample AS t2

WHERE t2.Freight - t1.Freight = 0.01

AND t2.EmployeeID = t1.EmployeeID

)

) AS d2 ON d1.Freight <= d2.Freight AND d1.EmployeeID = d2.EmployeeID

GROUP

BY d1.EmployeeID,

d1

.Freight

HAVING

MIN(d2.Freight) > d1.Freight

N 56°04'39.16"
E 12°55'05.25"

Jeff Moden SSC Guru Points: 1004405 More actions · Answer 10

Sorry, Chris... lost track of this thread...

Specifically, what statement gave you that error?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 11

Yeah me too!

I saw a use for ...

UPDATE #MyHead

SET

@Grp = Grp = CASE WHEN Freight > @Freight + 0.01 THEN @Grp + 1 ELSE @Grp END,

@Freight

= Freight

... in the work I'm currently involved with and tried this syntax in a SELECT, but it only works with an UPDATE.

Cheers

ChrisM

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Jeff Moden SSC Guru Points: 1004405 More actions · Answer 12

Yeah... that's the bugger about it... @variable=column=formula ONLY works for Update and only with certain other "hooks" present....

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

noeld SSC Guru Points: 96590 More actions · Answer 13

and

@variable=column=formula

is equivalent to:

SET @variable=formula, column = formula

That is why it works for "Update" statements only

Cheers,

* Noel

harsha.majety SSCommitted Points: 1537 More actions · Answer 14

My friend gave me an interesting problem yesterday. He works for a hospital AND needs TO COUNT patient visits, IF a patient visits the doctor ON consecutive DAYS, it needs TO be counted AS 1 visit.

Here is how i solved it. This is just one OF many ways TO do it.

DECLARE @tempTable TABLE(row_id INT IDENTITY(1,1),

patient_id INT,

er_date DATETIME)

-- Test Data

INSERT INTO @tempTable(patient_id,er_date)

SELECT 1,'1/1/2012'

UNION ALL

SELECT 1,'1/2/2012'

UNION ALL

SELECT 1,'1/3/2012'

UNION ALL

SELECT 2,'1/5/2012'

UNION ALL

SELECT 2,'1/6/2012'

UNION ALL

SELECT 3,'1/8/2012'

UNION ALL

SELECT 3,'1/9/2012'

UNION ALL

SELECT 1,'2/1/2012'

UNION ALL

SELECT 1,'2/2/2012'

UNION ALL

SELECT 2,'2/3/2012'

UNION ALL

SELECT 2,'3/3/3012'

UNION ALL

SELECT 3,'1/10/2012'

UNION ALL

SELECT 3,'1/13/2012'

UNION ALL

SELECT 5,'1/5/2012'

--If datediff is only 1 , it is treated as same visit

SELECT

tbl1.patient_id,

SUM( CASE WHEN tbl1.er_date IS NOT NULL AND tbl2.er_date IS NULL THEN 1 ELSE 0 END ) [number_of_visits]

FROM

@tempTable tbl1

LEFT JOIN @tempTable tbl2

ON tbl1.patient_id = tbl2.patient_id

AND tbl1.er_date = DATEADD(dd,1,tbl2.er_date)

GROUP BY

tbl1.patient_id;

dbursey SSC Eights! Points: 831 More actions · Answer 15

Some people are entertained by fast cars...others by fast code...GO GEEKS, I say. 🙂