December 14, 2011 at 5:02 am
Hello All
I need some pointers on how to get started with a query. The problem is how to group some data in this scenario:
Say I have a table like so:
value  date
5      01/01/2011 00:00:00
5      02/01/2011 00:00:00
5      03/01/2011 00:00:00
10     04/01/2011 00:00:00
10     05/01/2011 00:00:00
10     06/01/2011 00:00:00
5      07/01/2011 00:00:00
5      08/01/2011 00:00:00
5      09/01/2011 00:00:00
I would like to group on 'value' but take into account the 'clustering', so the result would be for example:
value  count
5      3
10     3
5      3
I'm guessing I have to create some dummy value for each 'cluster' but have no idea where to start.
Cheers all!
Philippe
December 14, 2011 at 5:10 am
How do you define a cluster of rows in your table? If it's the order of insertion, then you are on the wrong track: the order of rows in a table is not guaranteed. Please identify a good rule for clustering/grouping your data and add an identifier to your table (it can even be calculated at runtime).
December 14, 2011 at 5:19 am
Hi Dev
The criterion for a cluster is that the records are identical (time field excepted) and contiguous when ordered by time. Records that are identical but not part of the same date sequence are grouped separately.
Hope that makes sense
Philippe
December 14, 2011 at 5:47 am
Maybe this one will do. I added the max. date to the group for sorting.
DECLARE @T TABLE ([Value] INT, [Date] DATETIME)
INSERT INTO @T([Value], [Date]) VALUES (5, '01/01/2011 00:00:00')
INSERT INTO @T([Value], [Date]) VALUES (5, '02/01/2011 00:00:00')
INSERT INTO @T([Value], [Date]) VALUES (5, '03/01/2011 00:00:00')
INSERT INTO @T([Value], [Date]) VALUES (10, '04/01/2011 00:00:00')
INSERT INTO @T([Value], [Date]) VALUES (10, '05/01/2011 00:00:00')
INSERT INTO @T([Value], [Date]) VALUES (10, '06/01/2011 00:00:00')
INSERT INTO @T([Value], [Date]) VALUES (5, '07/01/2011 00:00:00')
INSERT INTO @T([Value], [Date]) VALUES (5, '08/01/2011 00:00:00')
INSERT INTO @T([Value], [Date]) VALUES (5, '09/01/2011 00:00:00')
INSERT INTO @T([Value], [Date]) VALUES (10, '10/01/2011 00:00:00')
INSERT INTO @T([Value], [Date]) VALUES (10, '10/01/2011 00:00:00')
SELECT
[Value], MAX([Date]), COUNT(*) [Count]
FROM
(
SELECT
ROW_NUMBER() OVER (ORDER BY [Date]) - ROW_NUMBER() OVER (PARTITION BY [Value] ORDER BY [Date]) GrpID,
*
FROM
@T
) Q
GROUP BY
[Value], GrpID
ORDER BY
2, 1
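To see the "dummy value" this relies on, you can run just the inner derived table on its own (in the same batch as the @T declaration above, since it's a table variable). The difference between the two ROW_NUMBER()s stays constant within each contiguous run of the same value and jumps whenever the value changes, which is exactly what the GROUP BY needs. The column aliases below are just illustrative:
SELECT
[Value], [Date],
ROW_NUMBER() OVER (ORDER BY [Date]) AS rnAll,
ROW_NUMBER() OVER (PARTITION BY [Value] ORDER BY [Date]) AS rnPerValue,
ROW_NUMBER() OVER (ORDER BY [Date])
- ROW_NUMBER() OVER (PARTITION BY [Value] ORDER BY [Date]) AS GrpID
FROM
@T
ORDER BY
[Date]
-- With the sample data, GrpID is 0 for the first run of 5s, 3 for the run of 10s
-- and for the second run of 5s, and 6 for the last pair of 10s, so grouping on
-- ([Value], GrpID) counts each contiguous run separately.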
December 14, 2011 at 5:56 am
IF OBJECT_ID('tempdb..#Sample') IS NOT NULL DROP TABLE #Sample
CREATE TABLE #Sample (avalue INT, adate DATETIME)
INSERT INTO #Sample (avalue, adate)
SELECT 5, '01/01/2011 00:00:00' UNION ALL
SELECT 5, '02/01/2011 00:00:00' UNION ALL
SELECT 5, '03/01/2011 00:00:00' UNION ALL
SELECT 10, '04/01/2011 00:00:00' UNION ALL
SELECT 10, '05/01/2011 00:00:00' UNION ALL
SELECT 10, '06/01/2011 00:00:00' UNION ALL
SELECT 5, '07/01/2011 00:00:00' UNION ALL
SELECT 5, '08/01/2011 00:00:00' UNION ALL
SELECT 5, '09/01/2011 00:00:00'
;WITH OrderedData AS (
SELECT *,
rn1 = ROW_NUMBER() OVER (ORDER BY aDate),
rn2 = ROW_NUMBER() OVER (PARTITION BY avalue ORDER BY aDate)
FROM #Sample
)
SELECT
avalue,
aCount = COUNT(*)
FROM OrderedData
GROUP BY rn1-rn2, avalue
ORDER BY MAX(aDate)
-- it doesn't make much sense without an ordering column, e.g. MAX(aDate)
Quick today, Peter. I see you've also commented on a sort column 🙂
December 14, 2011 at 6:25 am
ChrisM@home (12/14/2011)
Quick today, Peter. I see you've also commented on a sort column 🙂
Funny, almost the same query and the same execution plan. Note that the solutions probably do not scale very well due to the four sort operations (one for each ROW_NUMBER() expression, one for the GROUP BY, and one for the ORDER BY). A proper index on the date column (including the value column) might help a little.
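For a permanent table, that index might look something like the sketch below (the sample here uses a table variable / temp table, so dbo.Sample and the index name are just assumed for illustration). At best it spares the sort feeding the ROW_NUMBER() that orders by date alone; the partitioned ROW_NUMBER() would still need its own sort.
-- Sketch only: assumes a permanent dbo.Sample table with the same columns as #Sample above.
CREATE INDEX IX_Sample_adate_avalue
ON dbo.Sample (adate)
INCLUDE (avalue)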
December 14, 2011 at 8:01 am
Blinding! Well done, guys, and many thanks.
It's the
ROW_NUMBER() OVER (ORDER BY [Date]) - ROW_NUMBER() OVER (PARTITION BY [Value] ORDER BY [Date])
that I'd failed to see before as my dummy grouping.
Thanks again
Philippe