Select record from group

Question

Select record from group

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply

Sean Lange SSC Guru Points: 286590 More actions · Answer 1

ChrisM@home (9/12/2013)
paulcauchon (9/12/2013)
I see the hitch up here...you need to see the min date and max date from each of the consecutive color/ID group pairings.
Originally, we had been treating the grouping value (business key) as the pairing of color and ID. The key is actually the pairing of color and ID over sequential, uninterrupted AUTOID values.
In this case, the simple solution that the group has previously been advocating will not work. At first glance, this is probably going to require either really complicated logic or row by row processing. I'll keep considering the "complicated logic" approach (as to not anger Lord Moden), and report back if I come up with anything.
Not that complicated;
SELECT ID, DATEFROM = MIN(DATEFROM), DATETO = MAX(DATETO), COLOR
FROM (
SELECT autoid, ID, DATEFROM, DATETO, COLOR,
NewGroup = ROW_NUMBER() OVER(PARTITION BY ID ORDER BY COLOR) - AUTOID
FROM #mytable
) d
GROUP BY ID, COLOR, NewGroup
ORDER BY ID, MIN(DATEFROM)

Thanks Chris. I have used that method before for sets but didn't even think of it here. Sweet!!!

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

adonetok SSCrazy Eights Points: 8395 More actions · Answer 2

Chris, thank you again. This is a magic.

I need to learn

"ROW_NUMBER() OVER(PARTITION BY ID ORDER BY COLOR) - AUTOID"

Would you mind to explain more about this statement?

paulcauchon SSC Veteran Points: 245 More actions · Answer 3

ChrisM@home (9/12/2013)
Not that complicated;

Indeed, not that complicated. But terribly clever!

If I'm understanding this correctly, we're lining up all of the ID and Color groups in order, and then ascribing row numbers to them. From these rownumbers, we subtract the AUTOID. It's easiest to understand if we line up side by side the AUTOID column, the output of row_number(), and the NewGroup column:

SELECTautoid

,RowNumber = ROW_NUMBER() OVER(PARTITION BY ID ORDER BY COLOR)

,NewGroup = row_number() over(partition by ID order by color) - autoID

,ID, DATEFROM, DATETO, COLOR -- record contents

FROM #mytable

AUTOID rowNumber NewGroup ID DATEFROM DATETO COLOR

----------- ----------- ----------- ---- ----------------------- ----------------------- ----------

1 1 0 111 2012-01-10 00:00:00.000 2012-01-11 00:00:00.000 BLUE

2 2 0 111 2012-02-02 00:00:00.000 2012-02-02 00:00:00.000 BLUE

3 3 0 111 2012-03-15 00:00:00.000 2012-03-16 00:00:00.000 BLUE

7 4 -3 111 2012-07-17 00:00:00.000 2012-07-17 00:00:00.000 BLUE

8 5 -3 111 2012-09-08 00:00:00.000 2012-10-10 00:00:00.000 BLUE

17 6 -11 111 2012-01-10 00:00:00.000 2012-03-16 00:00:00.000 BLUE

19 7 -12 111 2012-07-17 00:00:00.000 2012-10-10 00:00:00.000 BLUE

18 8 -10 111 2012-04-05 00:00:00.000 2012-06-16 00:00:00.000 GREEN

4 9 5 111 2012-04-05 00:00:00.000 2012-04-05 00:00:00.000 GREEN

5 10 5 111 2012-05-25 00:00:00.000 2012-05-26 00:00:00.000 GREEN

6 11 5 111 2012-06-06 00:00:00.000 2012-06-16 00:00:00.000 GREEN

15 1 -14 222 2013-07-17 00:00:00.000 2013-07-17 00:00:00.000 BLUE

16 2 -14 222 2013-09-08 00:00:00.000 2013-10-10 00:00:00.000 BLUE

22 3 -19 222 2013-07-17 00:00:00.000 2013-10-10 00:00:00.000 BLUE

21 4 -17 222 2013-04-05 00:00:00.000 2013-06-16 00:00:00.000 GREEN

12 5 -7 222 2013-04-05 00:00:00.000 2013-04-05 00:00:00.000 GREEN

13 6 -7 222 2013-05-25 00:00:00.000 2013-05-26 00:00:00.000 GREEN

14 7 -7 222 2013-06-06 00:00:00.000 2013-06-16 00:00:00.000 GREEN

20 8 -12 222 2013-01-10 00:00:00.000 2013-03-16 00:00:00.000 RED

9 9 0 222 2013-01-10 00:00:00.000 2013-01-11 00:00:00.000 RED

10 10 0 222 2013-02-02 00:00:00.000 2013-02-02 00:00:00.000 RED

11 11 0 222 2013-03-15 00:00:00.000 2013-03-16 00:00:00.000 RED

(22 row(s) affected)

As the rowNumber increments, the difference between the AUTOID and rowNumber remains consistent across groups. We can then use this in our outer query as the third member of the business key (Color, ID, uninterrupted consecutive AUTOIDs).

What I would love to understand is why we shouldn't include in the ORDER BY statement of the OVER clause something that will ensure that the records within a "NewGroup" group are processed in order, a la:

FROM (

SELECT autoid, ID, DATEFROM, DATETO, COLOR,

NewGroup = ROW_NUMBER() OVER(PARTITION BY ID ORDER BY COLOR, AUTOID) - AUTOID

FROM #mytable

) d

Am I correct in assuming that if we did not have the clustered index on the table we would need to have the AUTOID in the ORDER BY statement?

Dwain Camps SSC Guru Points: 86978 More actions · Answer 4

ChrisM@home (9/12/2013)
paulcauchon (9/12/2013)
I see the hitch up here...you need to see the min date and max date from each of the consecutive color/ID group pairings.
Originally, we had been treating the grouping value (business key) as the pairing of color and ID. The key is actually the pairing of color and ID over sequential, uninterrupted AUTOID values.
In this case, the simple solution that the group has previously been advocating will not work. At first glance, this is probably going to require either really complicated logic or row by row processing. I'll keep considering the "complicated logic" approach (as to not anger Lord Moden), and report back if I come up with anything.
Not that complicated;
SELECT ID, DATEFROM = MIN(DATEFROM), DATETO = MAX(DATETO), COLOR
FROM (
SELECT autoid, ID, DATEFROM, DATETO, COLOR,
NewGroup = ROW_NUMBER() OVER(PARTITION BY ID ORDER BY COLOR) - AUTOID
FROM #mytable
) d
GROUP BY ID, COLOR, NewGroup
ORDER BY ID, MIN(DATEFROM)

+1

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 5

paulcauchon (9/12/2013)
ChrisM@home (9/12/2013)
Not that complicated;
...
If I'm understanding this correctly, we're lining up all of the ID and Color groups in order, and then ascribing row numbers to them. From these rownumbers, we subtract the AUTOID. It's easiest to understand if we line up side by side the AUTOID column, the output of row_number(), and the NewGroup column:
SELECTautoid
,RowNumber = ROW_NUMBER() OVER(PARTITION BY ID ORDER BY COLOR)
,NewGroup = row_number() over(partition by ID order by color) - autoID
,ID, DATEFROM, DATETO, COLOR -- record contents
FROM #mytable

Paul - if you order the output by AutoID, it all becomes clear. There's a description of how the method works here[/url]. The inventor, as far as I know, was Itzik Ben-Gan - but I don't have a reference, sorry.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

paulcauchon SSC Veteran Points: 245 More actions · Answer 6

ChrisM@Work (9/13/2013)
...
Paul - if you order the output by AutoID, it all becomes clear. There's a description of how the method works here[/url]. The inventor, as far as I know, was Itzik Ben-Gan - but I don't have a reference, sorry.

Whoa...excellent article there. Very in depth and thorough. Feeling kinda like Neo!

This tool (and the rest of the methods mentioned in that article) are excellent to have in my back pocket.

Thanks for the explanation!

SQL Severe DBA Hall of Fame Points: 3579 More actions · Answer 7

Pretty good explaination. Worth spending time on the read.

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 8

paulcauchon (9/13/2013)
ChrisM@Work (9/13/2013)
...
Paul - if you order the output by AutoID, it all becomes clear. There's a description of how the method works here[/url]. The inventor, as far as I know, was Itzik Ben-Gan - but I don't have a reference, sorry.
Whoa...excellent article there. Very in depth and thorough. Feeling kinda like Neo!
This tool (and the rest of the methods mentioned in that article) are excellent to have in my back pocket.
Thanks for the explanation!

Bwaaahaaaa!!! That's too funny Paul!

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]