September 15, 2011 at 1:27 pm
I have a table as follows:
date (uk format) result
01/01/2010 -1
02/01/2010 -1
03/01/2010 1
04/01/2010 1
05/01/2010 -1
06/01/2010 -1
07/01/2010 -1
08/01/2010 -1
09/01/2010 1
etc
date is sequential and result is only 1 or -1
I want to find the longest sequence of 1's. Same for -1's
Using above date, for 1's it's 2 (03-04/01/2010) and for -1's it's 4 (05-08/01/2010).
There are thousands of rows and the 1's & -1's are totally random
I've tried using various ranking/partition functions but to no avail :crazy:
Seems so simple but it's driving me nuts
Any suggestions?
Thanks
September 15, 2011 at 1:47 pm
looking at it but my first try didn't make sense;
here's sample data for other posters to fiddle with:
;with MySampleData (TheDate,TheValue)
AS
(
SELECT CONVERT(datetime,'20100101'),'-1' UNION ALL
SELECT '20100201','-1' UNION ALL
SELECT '20100301','1' UNION ALL
SELECT '20100401','1' UNION ALL
SELECT '20100501','-1' UNION ALL
SELECT '20100601','-1' UNION ALL
SELECT '20100701','-1' UNION ALL
SELECT '20100801','-1' UNION ALL
SELECT '20100901','1'
)
SELECT
--ROW_NUMBER() OVER (PARTITION BY TheValue ORDER BY TheDate) AS RW,
MySampleData.*
FROM MySampleData
Lowell
September 15, 2011 at 1:47 pm
A suggestion .. read this article by Jeff Moden, it just may be what you are looking for:
September 15, 2011 at 2:15 pm
Oh,
and just to make life even more interesting I forgot to say the dates are only ever Monday through Friday so a sequence of 1's may have to account for a hole in the dates for weekends. ie a 1 on a friday is in the same range as a 1 on the following monday.
Ouch!!!
September 15, 2011 at 2:27 pm
bitbucket-25253 (9/15/2011)
A suggestion .. read this article by Jeff Moden, it just may be what you are looking for:
+1 for this as you have a somewhat classic gaps & islands problem. Adding the weekend gap does complicate it some.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
September 15, 2011 at 4:53 pm
Here is one way to tackle the problem - not yet including the weekend aspect....
IF OBJECT_ID('tempdb..#tagged') IS NOT NULL
DROP TABLE #tagged
IF OBJECT_ID('tempdb..#tagged') IS NOT NULL
DROP TABLE #counted
;WITH MySampleData (TheDate,TheValue)
AS
(
SELECT CONVERT(DATETIME,'20100101'),'-1' UNION ALL
SELECT '20100102','-1' UNION ALL
SELECT '20100103','1' UNION ALL
SELECT '20100104','1' UNION ALL
SELECT '20100105','-1' UNION ALL
SELECT '20100106','-1' UNION ALL
SELECT '20100107','-1' UNION ALL
SELECT '20100108','-1' UNION ALL
SELECT '20100109','1'
)
SELECT
MySampleData.*,
ROW_NUMBER() OVER(ORDER BY TheDate) -
ROW_NUMBER() OVER(PARTITION BY TheValue ORDER BY TheDate) AS group_tag
INTO #tagged
FROM MySampleData
SELECT TheValue
, group_tag
, COUNT(*) group_count
INTO
#counted
FROM
#tagged
GROUP BY
TheValue
, group_tag
SELECT TheValue
, MAX(group_count)
FROM
#counted
GROUP BY
TheValue
For the weekend problem, will you have dates in the table with maybe a zero in TheValue or will weekend dates not be there at all?
Also, ignoring weekends for the moment, will you have a row for every date in the range or will there be gaps that break the chains? So if a date is missing it is equivalent to a break in a sequence, even if the values either side are the same?
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
September 16, 2011 at 5:21 am
Thanks to those who reminded me that this is ultimately just a gaps & island problem.
In reality I was only interested in Fridays so all the dates are 7 days apart.
The trick is to generate a reference table of all possible dates together with a sequential row_number and then join that to my actual table.
Please see attachment.
Any performance improvements appreciated 🙂
Lotusnotes
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply