May 20, 2010 at 3:51 pm
Hello
I'm trying to figure out the magic SQL to produce a tricky aggregation. Here's what the source table looks like...
CREATE TABLE [dbo].[Items](
[ID] [int] NOT NULL, -- FK to another table
[int] NOT NULL,
[BEGIN] [varchar](5) NOT NULL,
[END] [int] NOT NULL,
[MULT1] [int] NOT NULL,
[MULT2] [int] NOT NULL,
[MULT3] [int] NOT NULL,
[MULT4] [int] NOT NULL
)
Here's some sample data that I'm working with...
ID CODE BEGIN END MULT1 MULT2 MULT3 MULT4
1112340 19900501207812314500450045004500
111239824820050701200609301000100010001000
111239824820061001200612310000
111239824820070101200703311000100010001000
111239824820070401200709300000
111239824820071001200803311000100010001000
111239824820080401200809300000
111239824820081001200903312000020000
111239824820090401207812312000200020002000
The BEGIN and END are the effective dates for when a CODE is valid. All of the MULTs for CODEs within an ID needed to be added up for the dates that overlap.
So using the data above, here's what the results should look like...
11123 19900501 20050700 4500 4500 4500 4500 (only CODE 40 is used)
11123 20050701 20060930 5500 5500 5500 5500 (CODE 40 and CODE 98248 added for each MULT)
11123 20061001 20061231 4500 4500 4500 4500 (CODE 40 and CODE 98248 added for each MULT)
11123 20070101 20070331 5500 5500 5500 5500 (CODE 40 and CODE 98248 added for each MULT)
11123 20070401 20070930 4500 4500 4500 4500 (CODE 40 and CODE 98248 added for each MULT)
11123 20071001 20080331 5500 5500 5500 5500 (CODE 40 and CODE 98248 added for each MULT)
11123 20080401 20080930 4500 4500 4500 4500 (CODE 40 and CODE 98248 added for each MULT)
11123 20081001 20090331 6500 4500 6500 4500 (CODE 40 and CODE 98248 added for each MULT)
11123 20090401 20781231 6500 6500 6500 6500 (CODE 40 and CODE 98248 added for each MULT)
Any ideas on the SQL to produce this result?
Thanks!
May 20, 2010 at 3:57 pm
Please read and follow the instructions given in the article referenced in the first link in my signature.
This will allow us to work on a solution instead of trying to set up some sample data to work with.
Also, please include what you've tried so far and where you got stuck.
May 20, 2010 at 4:16 pm
Something to start with.
if object_id('tempdb..#Items')is not null drop table #Items
CREATE TABLE #Items(
[ID] [int] NOT NULL, -- FK to another table
[int] NOT NULL,
[BEGIN] [int] NOT NULL, -- I think this was not supposed to be varchar(5)
[END] [int] NOT NULL,
[MULT1] [int] NOT NULL,
[MULT2] [int] NOT NULL,
[MULT3] [int] NOT NULL,
[MULT4] [int] NOT NULL
)
insert #items
select '11123','40','19900501','20781231','4500','4500','4500','4500' union all
select '11123','98248','20050701','20060930','1000','1000','1000','1000' union all
select '11123','98248','20061001','20061231','0','0','0','0' union all
select '11123','98248','20070101','20070331','1000','1000','1000','1000' union all
select '11123','98248','20070401','20070930','0','0','0','0' union all
select '11123','98248','20071001','20080331','1000','1000','1000','1000' union all
select '11123','98248','20080401','20080930','0','0','0','0' union all
select '11123','98248','20081001','20090331','2000','0','2000','0' union all
select '11123','98248','20090401','20781231','2000','2000','2000','2000'
select * from #items
May 20, 2010 at 4:19 pm
OK, trying again...
Here's some SQL to get everything setup
declare @table table(ID int,
CODE int,
[BEGIN] int,
[END] int,
MULT1 int,
MULT2 int,
MULT3 int,
MULT4 int)
insert @table
select 11123,40,19900501,20781231,4500,4500,4500,4500 union all
select 11123,98248,20050701,20060930,1000,1000,1000,1000 union all
select 11123,98248,20061001,20061231,0,0,0,0 union all
select 11123,98248,20070101,20070331,1000,1000,1000,1000 union all
select 11123,98248,20070401,20070930,0,0,0,0 union all
select 11123,98248,20071001,20080331,1000,1000,1000,1000 union all
select 11123,98248,20080401,20080930,0,0,0,0 union all
select 11123,98248,20081001,20090331,2000,0,2000,0 union all
select 11123,98248,20090401,20781231,2000,2000,2000,2000
So far I've played with trying a kind of self-join using the IDs, CODEs, BEGINs and ENDs. But I'm getting too many rows (i.e. 16 instead of 9).
Thanks!
May 20, 2010 at 5:31 pm
Found some stuff online regarding overlapping or intersecting dates. May want to check that out. I'm sure you've been looking around. Below is what I came up with trying to do as much as I could without changing the table structure.
if object_id('tempdb..#Items')is not null drop table #Items
CREATE TABLE #Items(
[ID] [int] NOT NULL, -- FK to another table
[int] NOT NULL,
[BEGIN] [int] NOT NULL, -- I think this was not supposed to be varchar(5)
[END] [int] NOT NULL,
[MULT1] [int] NOT NULL,
[MULT2] [int] NOT NULL,
[MULT3] [int] NOT NULL,
[MULT4] [int] NOT NULL
)
insert #items
select '11123','40','19900501','20781231','4500','4500','4500','4500' union all
select '11123','98248','20050701','20060930','1000','1000','1000','1000' union all
select '11123','98248','20061001','20061231','0','0','0','0' union all
select '11123','98248','20070101','20070331','1000','1000','1000','1000' union all
select '11123','98248','20070401','20070930','0','0','0','0' union all
select '11123','98248','20071001','20080331','1000','1000','1000','1000' union all
select '11123','98248','20080401','20080930','0','0','0','0' union all
select '11123','98248','20081001','20090331','2000','0','2000','0' union all
select '11123','98248','20090401','20781231','2000','2000','2000','2000'
select (cast(a.id as varchar) + cast(a.code as varchar) + cast(a.[begin] as varchar) + cast(a.[end] as varchar)),*,(cast(b.id as varchar) + cast(b.code as varchar) + cast(b.[begin] as varchar) + cast(b.[end] as varchar))
--select a.id,a.code,a.[begin],a.[end],a.mult1 + b.mult1,a.mult2 + b.mult2,a.mult3 + b.mult3,a.mult4 + b.mult4
from #items a, #items b
WHERE b.[end] >= a.[begin] and b.[begin] <= a.[end]
and (cast(a.id as varchar) + cast(a.code as varchar) + cast(a.[begin] as varchar) + cast(a.[end] as varchar)) <>
(cast(b.id as varchar) + cast(b.code as varchar) + cast(b.[begin] as varchar) + cast(b.[end] as varchar))
order by a.[begin]
Would you be able to add an identity column to your original table, if it helps? I have no idea if it will, just trying things out.
May 21, 2010 at 7:45 am
I've tried something similar without much luck. Thanks for trying!
I'm now playing around with CTEs to see if that will work. Anybody else have any suggestions?
Thanks!
May 21, 2010 at 8:00 am
cgreathouse (5/20/2010)
OK, trying again...Here's some SQL to get everything setup
declare @table table(ID int,
CODE int,
[BEGIN] int,
[END] int,
MULT1 int,
MULT2 int,
MULT3 int,
MULT4 int)
insert @table
select 11123,40,19900501,20781231,4500,4500,4500,4500 union all
select 11123,98248,20050701,20060930,1000,1000,1000,1000 union all
select 11123,98248,20061001,20061231,0,0,0,0 union all
select 11123,98248,20070101,20070331,1000,1000,1000,1000 union all
select 11123,98248,20070401,20070930,0,0,0,0 union all
select 11123,98248,20071001,20080331,1000,1000,1000,1000 union all
select 11123,98248,20080401,20080930,0,0,0,0 union all
select 11123,98248,20081001,20090331,2000,0,2000,0 union all
select 11123,98248,20090401,20781231,2000,2000,2000,2000
So far I've played with trying a kind of self-join using the IDs, CODEs, BEGINs and ENDs. But I'm getting too many rows (i.e. 16 instead of 9).
Thanks!
Try this
SELECT a.ID,a.[BEGIN],a.[END],
SUM(b.MULT1) AS MULT1,
SUM(b.MULT2) AS MULT2,
SUM(b.MULT3) AS MULT3,
SUM(b.MULT4) AS MULT4
FROM @table a
INNER JOIN @table b ON b.ID=a.ID
AND b.[BEGIN]<=a.[BEGIN] AND b.[END]>=a.[END]
GROUP BY a.ID,a.[BEGIN],a.[END]
ORDER BY a.ID,a.[BEGIN],a.[END]
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537May 21, 2010 at 8:12 am
cgreathouse (5/21/2010)
I've tried something similar without much luck. Thanks for trying!I'm now playing around with CTEs to see if that will work. Anybody else have any suggestions?
Thanks!
Yep, if you can change your date columns to a DATE type.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 21, 2010 at 8:14 am
That's excellent!!! Thank you!
May 21, 2010 at 10:51 am
Oops, guess I spoke too soon. It's close but there's one more thing to take care of.
The previous solution will give these results
IDBEGINENDMULT1MULT2MULT3MULT4
1112319900501207812314500450045004500 -- (Code 40)
1112320050701200609305500550055005500 -- (Code 40 & 98248)
1112320061001200612314500450045004500 -- (Code 40 & 98248)
1112320070101200703315500550055005500 -- (Code 40 & 98248)
1112320070401200709304500450045004500 -- (Code 40 & 98248)
1112320071001200803315500550055005500 -- (Code 40 & 98248)
1112320080401200809304500450045004500 -- (Code 40 & 98248)
1112320081001200903316500450065004500 -- (Code 40 & 98248)
1112320090401207812316500650065006500 -- (Code 40 & 98248)
The first row needs to look like this (different END value)...
IDBEGINENDMULT1MULT2MULT3MULT4
1112319900501200506304500450045004500 -- (Code 40)
The reason is the second row's BEGIN is 20050701. The BEGIN and END intervals for a given ID can't overlap.
Any thoughts?
Thanks!
May 21, 2010 at 11:33 am
I would suggest you first copy your data into another temp table with sorting out overlapping date ranges ad then proceed with the query.
But if you are lazzy you can try this:
;with nextMinBegin
as
(select i.ID, MIN(i.[BEGIN]) as MinBegin
from #items i
join (select m.ID, MIN(m.[BEGIN]) rmBegin
from #items m
group by m.ID
) rm on rm.ID = i.ID
where i.[BEGIN] > rm.rmBegin
group by i.ID
)
SELECT a.ID,a.[BEGIN]
,CASE WHEN a.[BEGIN] < nmb.MinBegin AND a.[END] > nmb.MinBegin
THEN cast(convert(varchar,cast(cast(nmb.MinBegin as varchar(30)) as datetime)- 1,112) as int)
ELSE a.[END] END as [END],
SUM(b.MULT1) AS MULT1,
SUM(b.MULT2) AS MULT2,
SUM(b.MULT3) AS MULT3,
SUM(b.MULT4) AS MULT4
FROM #items a
INNER JOIN #items b ON b.ID=a.ID
AND b.[BEGIN]<=a.[BEGIN] AND b.[END]>=a.[END]
LEFT JOIN nextMinBegin nmb ON nmb.ID = a.ID
GROUP BY a.ID,a.[BEGIN],CASE WHEN a.[BEGIN] < nmb.MinBegin AND a.[END] > nmb.MinBegin
THEN cast(convert(varchar,cast(cast(nmb.MinBegin as varchar(30)) as datetime)- 1,112) as int)
ELSE a.[END] END
ORDER BY a.ID,a.[BEGIN],CASE WHEN a.[BEGIN] < nmb.MinBegin AND a.[END] > nmb.MinBegin
THEN cast(convert(varchar,cast(cast(nmb.MinBegin as varchar(30)) as datetime)- 1,112) as int)
ELSE a.[END] END
I hate the above code...:Whistling:
May 21, 2010 at 11:37 am
I'm glad someone else came up with that. I was going in an entirely different direction. I ended up hooking this into a date table and was trying to pull the values after tying each one to a specific day. Although I was close, I think I'm going to wait to see if that meets the specs.
May 21, 2010 at 12:13 pm
This looks pretty good! I'll test it out on the real dataset and see how it goes. I did simplify the code a little. Here's what I'm trying out...
;with nextMinBegin
as
(select i.ID, MIN(i.[BEGIN]) as MinBegin
from @table i
join ( select m.ID, MIN(m.[BEGIN]) rmBegin
from @table m
group by m.ID
) rm on rm.ID = i.ID
where i.[BEGIN] > rm.rmBegin
group by i.ID
)
SELECT a.ID,a.[BEGIN]
,CASE WHEN a.[BEGIN] < nmb.MinBegin AND a.[END] > nmb.MinBegin
THEN cast(convert(varchar,cast(cast(nmb.MinBegin as varchar(30)) as datetime)- 1,112) as int)
ELSE a.[END] END as [END],
SUM(b.MULT1) AS MULT1,
SUM(b.MULT2) AS MULT2,
SUM(b.MULT3) AS MULT3,
SUM(b.MULT4) AS MULT4
FROM @table a
INNER JOIN @table b ON b.ID=a.ID
AND b.[BEGIN]<=a.[BEGIN] AND b.[END]>=a.[END]
LEFT JOIN nextMinBegin nmb ON nmb.ID = a.ID
GROUP BY a.ID,a.[BEGIN], nmb.MinBegin, a.[end]
I don't need the ORDER BY and I made the GROUP BY a little simpler. I'll report back...
Thanks!
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply