August 19, 2015 at 2:03 pm
Hi, I work for a large transit agency and would like to know if this query is possible (in order to make a certain report for managers).
If I start out with this table:
CREATE TABLE Test1
(
interval int NULL,
interval_prime int NULL,
cnt int NULL
)
with this data in it:
insert into test1 (interval, interval_prime, cnt)
values (1, 1, 1)
insert into test1 (interval, interval_prime, cnt)
values (2, null, 1)
insert into test1 (interval, interval_prime, cnt)
values (3, null, 1)
insert into test1 (interval, interval_prime, cnt)
values (4, 2, 1)
insert into test1 (interval, interval_prime, cnt)
values (5, null, 1)
insert into test1 (interval, interval_prime, cnt)
values (6, 5, 1)
insert into test1 (interval, interval_prime, cnt)
values (7, null, 1)
insert into test1 (interval, interval_prime, cnt)
values (8, null, 1)
insert into test1 (interval, interval_prime, cnt)
values (9, 7, 1)
insert into test1 (interval, interval_prime, cnt)
values (10, 10, 1)
insert into test1 (interval, interval_prime, cnt)
values (11, 11, 1)
insert into test1 (interval, interval_prime, cnt)
values (12, null, 1)
insert into test1 (interval, interval_prime, cnt)
values (13, null, 1)
insert into test1 (interval, interval_prime, cnt)
values (14, null, 1)
insert into test1 (interval, interval_prime, cnt)
values (15, null, 1)
insert into test1 (interval, interval_prime, cnt)
values (16, null, 1)
insert into test1 (interval, interval_prime, cnt)
values (17, 12, 1)
insert into test1 (interval, interval_prime, cnt)
values (18, 18, 1)
and I'd like to write a query that returns this table:
CREATE TABLE Test2
(
interval int NULL,
cnt int NULL
)
with this data in it:
insert into test2 (interval, cnt)
values (1, 1)
insert into test2 (interval, cnt)
values (2, null)
insert into test2 (interval, cnt)
values (3, null)
insert into test2 (interval, cnt)
values (4, 3)
insert into test2 (interval, cnt)
values (5, null)
insert into test2 (interval, cnt)
values (6, 2)
insert into test2 (interval, cnt)
values (7, null)
insert into test2 (interval, cnt)
values (8, null)
insert into test2 (interval, cnt)
values (9, 3)
insert into test2 (interval, cnt)
values (10, 1)
insert into test2 (interval, cnt)
values (11, 1)
insert into test2 (interval, cnt)
values (12, null)
insert into test2 (interval, cnt)
values (13, null)
insert into test2 (interval, cnt)
values (14, null)
insert into test2 (interval, cnt)
values (15, null)
insert into test2 (interval, cnt)
values (16, null)
insert into test2 (interval, cnt)
values (17, 6)
insert into test2 (interval, cnt)
values (18, 1)
The rule is that the interval list is preserved, intervals which have a null interval_prime stay as Zero or Null, but for intervals that have a value in interval_prime, the data is grouped (from interval_prime to interval).
I hope this is not written in a confusing manner!!
Thanks for any help you can give,
Martin
August 19, 2015 at 2:54 pm
The following works in SQL 2012 & 2014, but I'm not certain about 2008 (or R2). Unfortunately, I don't have 2008 to test with.
(So... Sorry in advance if it doesn't work in 2008.)
IF OBJECT_ID('tempdb..#Test1') IS NOT NULL
DROP TABLE #Test1;
CREATE TABLE #Test1
(
interval int NULL,
interval_prime int NULL,
cnt int NULL
);
IF OBJECT_ID('tempdb..#Test2') IS NOT NULL
DROP TABLE #Test2;
CREATE TABLE #Test2 (
interval int NULL,
cnt int NULL
);
INSERT #Test1 (interval,interval_prime,cnt) VALUES
(1, 1, 1), (2, null, 1), (3, null, 1), (4, 1, 1), (5, null, 1), (6, 4, 1),
(7, null, 1), (8, null, 1), (9, 6, 1), (10, 10, 1), (11, 11, 1), (12, null, 1),
(13, null, 1), (14, null, 1), (15, null, 1), (16, null, 1), (17, 12, 1), (18, 18, 1);
insert into #Test2 (interval, cnt) VALUES
(1, 1), (2, null), (3, null), (4, 3), (5, null), (6, 2), (7, null), (8, null), (9, 3),
(10, 1), (11, 1), (12, null), (13, null), (14, null), (15, null), (16, null), (17, 6), (18, 1);
--SELECT * FROM #Test1 t1;
--SELECT * FROM #Test2 t2;
WITH IntervalGroup AS (
SELECT
CAST(SUBSTRING(MIN(CAST(t1.interval AS BINARY(4)) + CAST(t1.interval_prime AS BINARY(4))) OVER (ORDER BY t1.interval DESC), 1, 4) AS INT) AS IntervalGroup,
t1.cnt
FROM
#Test1 t1
), IntervalRollup AS (
SELECT
ig.IntervalGroup,
SUM(ig.cnt) AS InvervalGroupCount
FROM
IntervalGroup ig
GROUP BY
ig.IntervalGroup
)
SELECT
t1.interval,
t1.interval_prime,
t1.cnt,
ir.InvervalGroupCount
FROM
#Test1 t1
LEFT JOIN IntervalRollup ir
ON t1.interval = ir.IntervalGroup;
Here are the results... (tested using 2014)
interval interval_prime cnt InvervalGroupCount
----------- -------------- ----------- ------------------
1 1 1 1
2 NULL 1 NULL
3 NULL 1 NULL
4 1 1 3
5 NULL 1 NULL
6 4 1 2
7 NULL 1 NULL
8 NULL 1 NULL
9 6 1 3
10 10 1 1
11 11 1 1
12 NULL 1 NULL
13 NULL 1 NULL
14 NULL 1 NULL
15 NULL 1 NULL
16 NULL 1 NULL
17 12 1 6
18 18 1 1
HTH,
Jason
August 19, 2015 at 3:02 pm
For a 2008-friendly version (as Jason indicated, his won't work prior to 2012, because ORDER BY was not supported for aggregate window functions yet; it only worked for ranking window functions):
SELECT OuterReference.Interval,
cnt=SUM(InnerReference.cnt)
FROM test1 OuterReference LEFT JOIN test1 InnerReference
ON InnerReference.interval BETWEEN OuterReference.interval_prime AND OuterReference.interval
GROUP BY OuterReference.interval
Cheers!
EDIT: I added the parenthetical explaining why Jason correctly suspected his required 2012+.
August 19, 2015 at 3:29 pm
Jacob Wilkins (8/19/2015)
For a 2008-friendly version (as Jason indicated, his won't work prior to 2012, because ORDER BY was not supported for aggregate window functions yet; it only worked for ranking window functions):
SELECT OuterReference.Interval,
cnt=SUM(InnerReference.cnt)
FROM test1 OuterReference LEFT JOIN test1 InnerReference
ON InnerReference.interval BETWEEN OuterReference.interval_prime AND OuterReference.interval
GROUP BY OuterReference.interval
Cheers!
EDIT: I added the parenthetical explaining why Jason correctly suspected his required 2012+.
Very nice solution! I didn't even notice the relationship between interval & interval_prime.
I did notice 2 things...
1) Now that I see the pattern, the interval_prime value on row 17 appears to be set incorrectly. If it's following the pattern for the rest of the data, it should be 11 not 12.
2) The use of the BETWEEN is causing an over count on anything greater than 1. The following should be a simple fix...
CASE WHEN SUM(InnerReference.cnt) = 1 THEN 1 ELSE SUM(InnerReference.cnt) -1 END
August 19, 2015 at 3:46 pm
Hmmm...I'm not sure I follow. The results of my original query exactly match the contents of test2, his desired results, with his sample data.
It looks like your sample data changed some of the interval_prime values from his without correctly changing the corresponding rows in the desired output table (specifically for intervals 4,6, and 9).
I may be missing something, but I'm not sure what yet 🙂
Cheers!
August 19, 2015 at 4:14 pm
Jacob Wilkins (8/19/2015)
Hmmm...I'm not sure I follow. The results of my original query exactly match the contents of test2, his desired results, with his sample data.It looks like your sample data changed some of the interval_prime values from his without correctly changing the corresponding rows in the desired output table (specifically for intervals 4,6, and 9).
I may be missing something, but I'm not sure what yet 🙂
Cheers!
I think we have a shiesty OP who edited the data after I grabbed my copy (and before you grabbed yours).
August 19, 2015 at 4:16 pm
Jason A. Long (8/19/2015)
Jacob Wilkins (8/19/2015)
Hmmm...I'm not sure I follow. The results of my original query exactly match the contents of test2, his desired results, with his sample data.It looks like your sample data changed some of the interval_prime values from his without correctly changing the corresponding rows in the desired output table (specifically for intervals 4,6, and 9).
I may be missing something, but I'm not sure what yet 🙂
Cheers!
I think we have a shiesty OP who edited the data after I grabbed my copy (and before you grabbed yours).
Ah, excellent point! I just noticed that the original post was edited. I did think it was weird that you would change the values for the sample data. This explanation makes a lot more sense 🙂
August 19, 2015 at 4:29 pm
I apologize for the sheistiness... I did indeed edit the OP because I had a couple of the values incorrect. The [interval_prime field] should indicate the first interval of the group, while [interval] field should indicate the last interval of the group. So the way I have it now is the way the data sits.
Also, I wasn't able to run Jason's answer successfully. My sample data is in SQL server, but the actual query is being written in Oracle. I'm going to try Jacob's answer, and if it works I'll adapt it for oracle.
Thanks for the help so far!
(I guess it would help if I had my example correct from the beginning...)
August 20, 2015 at 10:47 am
No worries. 🙂
In any case Jacob's solution is what you're looking for. It should work in pretty much any RDBMS.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply