March 13, 2014 at 9:15 am
Afternoon, looking to pick some brains here to what I think is a grouping problem. I hope my explanation isn't as muddy as i thought it was when trying to type it.
To give you some context we have a new amendments application (nothing fancy, excel based with SQL Server back end) that allows users to submit amendments to product data (Product Info, PO Prices, Dates etc.). There is also an admin tool that a team uses to action these amendments in the various systems.
The old version of this tool, users submitted amendments by style and could if need be submit multiple amendments against one product at the same time. The new tool, I believe for audit reasons, users submit by amendment type, so for example I would submit a cost price change for a given style.
The issue now is that on the occasions where a user has multiple amendments, they now come through separately. So cost price would be Amendment 1 and a date change would be amendment 2 even though they could be the same product. This could potentially mean that the admin team would be duplicating work if the paperwork is updated and sent after each amendment, whereas before they would make both changes and only send the paperwork once.
Having not built either of these tools, I've been tasked with trying to fix this, my two thoughts being either to amend the user form to somehow capture/allow users to submit amendments together or try to use the existing data and doing the grouping dynamically in the back end. At the moment I'm trying to look at the latter, as preferably I don’t want to have to do too much on the tool itself.
In my mind I'm looking at putting in a time delay of a couple of hours, so amendments do not appear in the admin tool right away.
Use that lag to look at grouping any submitted amendments that occur within 30mins of the first occurrence of that style
This grouping would then be given a joint time so when the 'time lag' period passes the amendments will be visible together.
I've tried a few things and a few head on desk moments trying to get a set based approach but haven't been able to get where i want, its either an issue where amendments span an hour, such as 9:59 and then 10:03 or grouping together amendments that happen after the 30mins of the first one.
Here is some sample data
USE FF_Winning_Together;
IF OBJECT_ID(N'tempdb..#AmendTest',N'U') IS NOT NULL
DROP TABLE #AmendTest;
CREATE TABLE #AmendTest
(
AmendmentIDINT IDENTITY(1,1)NOT NULL,
StyleCHAR(1)NOT NULL,
AmendmentStatusVARCHAR(10)NOT NULL,
DTDATETIMENOT NULL
)
INSERT INTO #AmendTest (Style,AmendmentStatus,DT)
VALUES('C','Submitted','2014-03-13 08:53:37.110'),
('F','Submitted','2014-03-13 08:54:00.000'),
('A','Submitted','2014-03-13 08:55:37.110'),
('C','Submitted','2014-03-13 08:59:27.323'),
('B','Submitted','2014-03-13 09:03:37.110'),
('B','Submitted','2014-03-13 09:04:46.013'),
('C','Submitted','2014-03-13 09:04:50.230'),
('A','Submitted','2014-03-13 09:05:46.013'),
('B','Submitted','2014-03-13 09:05:50.230'),
('D','Submitted','2014-03-13 09:23:37.110'),
('D','Submitted','2014-03-13 09:24:46.013'),
('D','Submitted','2014-03-13 09:29:27.323'),
('D','Submitted','2014-03-13 09:34:50.230'),
('A','Submitted','2014-03-13 09:56:00.323'),
('A','Submitted','2014-03-13 10:06:50.230'),
('B','Submitted','2014-03-13 11:09:27.323');
select * from #AmendTest
order by Style, dt;
-- one of my attempts
WITH DeriveTimeGroup (Style,StyleCnt,MinDt,MaxDt)
AS
(
SELECTP.Style,
COUNT(P.Style),
MIN(P.DT),
DATEADD(MINUTE,30,MIN(P.DT))
--,DATEADD(MINUTE,(DATEDIFF(MINUTE,0,AT.DT)/30)*30,0))
--DATEADD(MINUTE,(DATEDIFF(MINUTE,0,AT.DT)/30)*30+30,0))
FROM#AmendTest AS P
GROUP
BYP.Style
)
SELECTP.AmendmentID,
P.Style,
P.DT,
D.StyleCnt,
D.MinDt,
D.MaxDt,
CASE
WHEN D.StyleCnt = 1 THEN 0
WHEN P.DT >= D.MinDt AND P.DT <= D.MaxDt THEN 1
ELSE 0
END AS Grp,
CASE
WHEN P.DT > D.MaxDt THEN DATEADD(MINUTE,120,P.DT)
ELSE DATEADD(MINUTE,120,D.MaxDt)
END AS ViewTime
FROM#AmendTest AS P
INNER
JOINDeriveTimeGroup AS D
ONP.Style = D.Style
WHEREP.DT <= DATEADD(MINUTE,-120,GETDATE())
ORDER
BYP.Style,
P.DT;
so using this data, they would be grouped as follows:
A 3,8
A 13,14
B 5,6,9
B 16
C 1,4,7
D 10,11,12,13
F 2
Any help welcome and if any more detail or clarification happy to add.
March 13, 2014 at 10:53 am
Think I'm getting closer:
WITH GroupTime (Style,GrpDT)
AS
(
SELECTAT.Style,
DATEADD(HOUR,DATEDIFF(HOUR,0,DATEADD(MINUTE,30,AT.DT)),0)
FROM#AmendTest AS AT
GROUP
BYAT.Style,
DATEADD(HOUR,DATEDIFF(HOUR,0,DATEADD(MINUTE,30,AT.DT)),0)
),
DeriveNew (AmendmentID,Style,GrpDT,GrpNum)
AS
(
SELECTAT.AmendmentID,
AT.Style,
CA1.GrpDT,
DENSE_RANK() OVER (PARTITION BY AT.Style ORDER BY CA1.GrpDT ASC)
FROM#AmendTest AS AT
CROSS APPLY (SELECT DATEADD(HOUR,DATEDIFF(HOUR,0,DATEADD(MINUTE,30,AT.DT)),0)) AS CA1(GrpDT)
INNER
JOINGroupTime GT
ONAT.Style = GT.Style
AND CA1.GrpDT = GT.GrpDT
),
StyleCnt (Style,GrpNum,GrpCnt)
AS
(
SELECTDN.Style,
DN.GrpNum,
COUNT(DN.GrpNum)
FROMDeriveNew AS DN
GROUP
BYDN.Style,
DN.GrpNum
)
SELECTDN.AmendmentID,
DN.Style,
DN.GrpDT,
DN.GrpNum,
SC.GrpCnt,
--If only one Style group as 0
CASE
WHEN SC.GrpCnt = 1 THEN 0
ELSE DN.GrpNum
END AS Chuff
FROMDeriveNew AS DN
INNER
JOINStyleCnt AS SC
ONDN.Style = SC.Style
AND DN.GrpNum = SC.GrpNum
March 13, 2014 at 11:22 am
This is certainly ugly, I might come back to improve this query.
SELECT at.Style,
STUFF((SELECT ',' + CAST( x.AmendmentID AS varchar(10))
FROM #AmendTest x
WHERE at.Style = x.Style
AND x.DT BETWEEN at.DT AND DATEADD(MI, 30, at.DT)
ORDER BY x.AmendmentID
FOR XML PATH('')), 1,1, '')
FROM #AmendTest at
JOIN #AmendTest at2 ON at.Style = at2.Style AND at2.DT BETWEEN at.DT AND DATEADD(MI, 30, at.DT)
WHERE at.AmendmentID NOT IN(
SELECT b.AmendmentID
FROM #AmendTest a
JOIN #AmendTest b ON a.Style = b.Style AND b.DT > a.DT AND b.DT <= DATEADD(MI, 30, a.DT))
GROUP BY at.AmendmentID, at.Style, at.DT
ORDER BY at.Style, at.DT
Should you group within 30 minutes from the first AmendmentID? or the previous one?
March 13, 2014 at 6:47 pm
Perhaps this is not quite so ugly:
SELECT Style, GroupedAmendmentIDs
FROM
(
SELECT Style
,GroupedAmendmentIDs=STUFF((
SELECT ',' + CAST(AmendmentID AS VARCHAR(5))
FROM #AmendTest b
WHERE a.Style = b.Style AND 30 >= ABS(DATEDIFF(minute, a.DT, b.DT))
FOR XML PATH('')
), 1, 1, '')
FROM #AmendTest a
) a
GROUP BY Style, GroupedAmendmentIDs
ORDER BY Style, GroupedAmendmentIDs;
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
March 14, 2014 at 2:21 am
Thanks guys I'll take a look at these this morning.
The 'requirements' as such aren't set in concrete, 30mins was just a finger in the air estimate on my behalf of what a suitable time would be to allow someone to submit multiple amendments, similar with the length of the time lag to stop them showing before they've had a change to be grouped.
I actually wasn't specifically looking to concatenate the IDs together, more just highlighting how they would be grouped together, although the concatenation part has got me looking at this in a different way!
March 14, 2014 at 2:26 am
And to answer luis's question I was originally basing it on 30mins from the first amendment, although again had loosely thought about resetting it and taking it as 30mins from the previous one
March 20, 2014 at 6:20 am
Right, been able to spend a bit more time on this and revisit and the good news is seems I have something workable uses Luis's code
IF OBJECT_ID(N'dbo.VW_AmendmentsGroupTest1',N'V') IS NOT NULL
DROP VIEW dbo.VW_AmendmentsGroupTest1;
GO
CREATE VIEW dbo.VW_AmendmentsGroupTest1 WITH SCHEMABINDING
AS
WITH BaseGroup (Product_Ref,Submitted_Timestamp,ViewTime,GroupedIDs)
AS
(
SELECTAT.Product_Ref,
AT.Submitted_Timestamp,
--Add hour to help calculate a lag/delay so data cannot be viewed before its grouped
DATEADD(HOUR,1,AT.Submitted_Timestamp),
STUFF((
SELECT ',' + CAST( x.Amendment_ID AS varchar(10))
FROM Amendments.Amendments_Header x
WHEREAT.Product_Ref = x.Product_Ref
AND x.Submitted_Timestamp BETWEEN at.Submitted_Timestamp AND DATEADD(MI, 30, at.Submitted_Timestamp)
ORDER BY x.Amendment_ID
FOR XML PATH('')
), 1,1, '')
FROMAmendments.Amendments_Header AS AT
INNER
JOINAmendments.Amendments_Header AS AT2
ONAT.Product_Ref = AT2.Product_Ref
AND AT2.Submitted_Timestamp BETWEEN at.Submitted_Timestamp AND DATEADD(MI, 30, at.Submitted_Timestamp)
WHEREAT.Amendment_ID NOT IN (
SELECTB.Amendment_ID
FROMAmendments.Amendments_Header AS A
INNER
JOINAmendments.Amendments_Header AS B
ONA.Product_Ref = B.Product_Ref
AND B.Submitted_Timestamp > A.Submitted_Timestamp
AND B.Submitted_Timestamp <= DATEADD(MI, 30, A.Submitted_Timestamp)
)
GROUP
BYAT.Amendment_ID,
AT.Product_Ref,
AT.Submitted_Timestamp
)
--re-split out the delimited IDs and determine by count if it is a multiple or not
SELECTBG.Product_Ref,
--round the time
DATEADD(HOUR,DATEDIFF(HOUR,0,DATEADD(MI,30,BG.ViewTime)),0) AS ViewTime,
CA1.Item AS AmdendmentID,
CASE
WHEN COUNT(*) OVER (PARTITION BY BG.Product_Ref, BG.ViewTime) > 1 THEN 'Y'
ELSE 'N'
END AS Multiple
FROMBaseGroup AS BG
CROSS
APPLYdbo.DelimitedSplit8K(BG.GroupedIDs,',') AS CA1;
GO
With this view I can take an amendment ID and see if it is a multiple, if it is return the product and grouped time to then return the rest of the grouped IDs. Something along the lines of:
DECLARE@ProductRef VARCHAR(10),
@dt DATETIME,
@id int = 2544;
SELECT@ProductRef = A.Product_Ref,
@dt = A.ViewTime
FROMdbo.VW_AmendmentsGroupTest1 AS A
WHEREA.AmdendmentID = @id;
SELECTB.AmdendmentID
FROMdbo.VW_AmendmentsGroupTest1 AS B
WHEREB.Product_Ref = @ProductRef
AND B.ViewTime = @dt;
One small data issue I'm still having is if the submitted amendments span over a time greater than 1/2 hour its not returning all the IDs.
Here's an example:
IF OBJECT_ID(N'tempdb..#GroupTest',N'U') IS NOT NULL
DROP TABLE #GroupTest;
CREATE TABLE #GroupTest
(
Product VARCHAR(10) NOT NULL,
ID INT NOT NULL,
DT DATETIME NOT NULL
)
INSERT INTO #GroupTest
VALUES('CT42203599',2382,'2014-03-10 15:06:00.980'),
('CT42203599',2384,'2014-03-10 15:19:09.590'),
('CT42203599',2385,'2014-03-10 15:19:57.450'),
('CT42203599',2386,'2014-03-10 15:21:26.013'),
('CT42203599',2387,'2014-03-10 15:29:56.370'),
('CT42203599',2388,'2014-03-10 15:35:18.197'),
('CT42203599',2389,'2014-03-10 15:36:57.200'),
('CT42203599',2390,'2014-03-10 15:37:57.900'),
('CT42203599',2391,'2014-03-10 15:39:41.087'),
('CT42203599',2392,'2014-03-10 15:56:20.067'),
('CT42203599',2393,'2014-03-10 15:59:51.007'),
('CT42203599',2394,'2014-03-10 16:01:00.573'),
('CT42203599',2395,'2014-03-10 16:01:49.550'),
('CT42203599',2396,'2014-03-10 16:03:23.453')
--Should be 2 groups 2382--2388 and 2389-2396; however, 2389+ are being ignored completely
--as 2389 is within 30mins of some of the values in the 2382-2388 range
SELECTAT.Product,
AT.DT,
--AT2.ID,
DATEADD(HOUR,1,AT.DT)
,STUFF((
SELECT ',' + CAST( x.Amendment_ID AS varchar(10))
FROM Amendments.Amendments_Header x
WHEREAT.Product = x.Product_Ref
AND x.Submitted_Timestamp BETWEEN at.DT AND DATEADD(MI, 30, at.DT)
ORDER BY x.Amendment_ID
FOR XML PATH('')
), 1,1, '')
FROM#GroupTest AS AT
INNER
JOIN#GroupTest AS AT2
ONAT.Product = AT2.Product
AND AT2.DT BETWEEN at.DT AND DATEADD(MI, 30, at.DT)
WHEREAT.ID NOT IN (
SELECTB.ID--A.*,B.ID, B.*
FROM#GroupTest AS A
INNER
JOIN#GroupTest AS B
ONA.Product = B.Product
AND B.DT > A.DT
AND B.DT <= DATEADD(MI, 30, A.DT)
)
GROUP
BYAT.ID,
AT.Product,
AT.DT
Can't seem to get the second lot to show correctly, so any ideas...
March 21, 2014 at 9:17 am
Think I finally solved this, went with a different tact started by finding the Min and Max dates by product by day and using the numbers function to produce 30min intervals from the min to max dates. Seems to do the business for this purpose, still needs some more thorough checks
Anyone that is interested here is the code
WITH GroupBaseData (OptionID,DTDay,MinDT,MaxDT)
AS
(
SELECT--GT.ID,
GT.OptionID,
ca1.DTDay,
--CA1.DTDay,
--Find min and max dates by day to get range values
MIN(GT.DT),
MAX(CA2.DT1)
--,1440 - MIN(CA3.Mins)
FROMdbo.GroupTest AS GT
--Find day start to group on
CROSS
APPLY(SELECTDATEADD(DAY,DATEDIFF(DAY,0,GT.DT),0)) AS CA1(DTDay)
CROSS
APPLY(SELECTDATEADD(MI,1,GT.DT)) AS CA2(DT1)
--CROSS
--APPLY(SELECT DATEDIFF(MI,CA1.DTday,CA2.DT)/30*30) AS CA3(Mins)
GROUP
BYGT.OptionID,
CA1.DTDay
),
GroupIntervals
AS
(
SELECTGB.OptionID,
GB.DTDay,
GB.MinDT,
GB.MaxDT,
CA1.StartInterval,
CA2.EndInterval,
ROW_NUMBER() OVER (PARTITION BY GB.optionID,GB.DTDay ORDER BY CA1.StartInterval ASC) AS RN
FROMGroupBaseData AS GB
--1440mins in a day, so 48 30min intervals
CROSS APPLY dbo.GetNums(0,48) AS GN
CROSS APPLY (selectDATEADD(mi,GN.N * 30,mindt)) AS CA1(StartInterval) --Use Min date as start point and increase in 30mins intervals
CROSS APPLY (select DATEADD(mi,(GN.N + 1) * 30,mindt)) AS CA2(EndInterval) --End of 30min Interval
WHERE --(N * 30) <= bd.mins
--OptionID = 'f'
CA2.EndInterval <= (GB.MaxDT + 30)
)
SELECTGT.ID,
--GT.DT,
GT.OptionID,
--GI.StartInterval,
--GI.EndInterval,
--GI.RN,
--DATEADD(HOUR,1,GI.StartInterval),
CASE
WHEN COUNT(CA2.Viewtime) OVER (PARTITION BY GT.OptionID, GI.RN) > 1 then 'Y'
ELSE 'N'
END AS Multiple,
CA2.Viewtime
FROMGroupTest as gt
CROSS
APPLY(SELECTDATEADD(DAY,DATEDIFF(DAY,0,GT.DT),0)) AS CA1(DTDay)
INNER
JOINGroupIntervals AS GI
ONGT.OptionID = GI.OptionID
AND CA1.DTDay = GI.DTDay
CROSS
APPLY(SELECT DATEADD(MINUTE,DATEDIFF(MINUTE,0,DATEADD(HOUR,1,GI.StartInterval)) / 30 * 30, 0)) AS CA2(Viewtime)
WHEREGT.DT >= GI.StartInterval
ANDGT.DT < GI.EndInterval;
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply