August 2, 2017 at 4:08 am
Hi,
Basically I have the following interesting problem:
I have machine that record transactions, these machines can be placed in different locations. I am trying to get a report that shows the utilisation per machine per position that they were in (and per day - whereby it can be assumed that a machine will be placed in a particular position for an entire day)
S
CREATE TABLE dbo.MachinePositions
(MachineName nvarchar(100), GeoPosition nvarchar(100), EffectiveDate datetime)
INSERT INTO dbo.MachinePositions (MachineName,GeoPosition,EffectiveDate) VALUES ('Machine1', 'GeoPos789', '2017-05-01 00:00:00')
INSERT INTO dbo.MachinePositions (MachineName,GeoPosition,EffectiveDate) VALUES ('Machine1', 'GeoPos145', '2017-05-23 00:00:00')
INSERT INTO dbo.MachinePositions (MachineName,GeoPosition,EffectiveDate) VALUES ('Machine1', 'GeoPos002', '2017-06-12 00:00:00')
INSERT INTO dbo.MachinePositions (MachineName,GeoPosition,EffectiveDate) VALUES ('Machine2', 'GeoPos478', '2017-05-01 00:00:00')
INSERT INTO dbo.MachinePositions (MachineName,GeoPosition,EffectiveDate) VALUES ('Machine3', 'GeoPos788', '2017-05-01 00:00:00')
INSERT INTO dbo.MachinePositions (MachineName,GeoPosition,EffectiveDate) VALUES ('Machine3', 'GeoPos144', '2017-05-23 00:00:00')
INSERT INTO dbo.MachinePositions (MachineName,GeoPosition,EffectiveDate) VALUES ('Machine1', 'GeoPos741', '2017-12-23 00:00:00')
INSERT INTO dbo.MachinePositions (MachineName,GeoPosition,EffectiveDate) VALUES ('Machine2', 'GeoPos002', '2017-12-23 00:00:00')
CREATE TABLE dbo.MachineUtilisation
(MachineName nvarchar(100), UtilisationID INT, UtilisationDateTime datetime)
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine1', 756, '2017-05-02 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine1', 412, '2017-05-05 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine1', 1452, '2017-05-06 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine1', 754, '2017-05-26 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine1', 1111, '2017-05-27 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine1', 1243, '2017-06-22 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine1', 1478, '2017-06-23 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine2', 1459, '2017-05-07 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine2', 1254, '2017-05-08 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine2', 1256, '2017-05-09 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine3', 1253, '2017-05-02 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine3', 1251, '2017-05-12 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine3', 8569, '2017-05-12 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine3', 3652, '2017-05-12 00:00:00')
INSERT INTO dbo.MachineUtilisation (MachineName,UtilisationID,UtilisationDateTime) VALUES ('Machine3', 9874, '2017-05-25 00:00:00')
--- Desired Output
/*
MachineName GeoPosition NumberOfUtilisations
Machine1 GeoPos789 3
Machine1 GeoPos145 2
Machine1 GeoPos002 2
Machine2 GeoPos478 3
Machine3 GeoPos788 4
Machine3 GeoPos144 1
*/
So I am trying to get the correct position of the machine for a particular transactions but I just cant figure out the code ..I have tried using the effectivedate in the join clause but then I seem to simply getting the latest position. The thing is that it is possible to future date some of those positions - e.g. I know that I will move the machine to emplacment XYZ for December 2017 , I could already put that record in the MachinePositions table; the last 2 records in the example code do just that (swapping the emplacement for Machine 1 and 2).
I hope someone has experience in this and can provide me with a bunch of pointers!
Thanks,
B
August 2, 2017 at 4:19 am
One way is by using the LEAD function, to get the value of the next Utilisation inside a CTE. Then you can JOIN on EffectiveDate and EndDate. Thus:WITH Positions AS (
SELECT MachineName, GeoPosition, EffectiveDate,
LEAD(EffectiveDate,1, GETDATE()) OVER (PARTITION BY MachineName ORDER BY EffectiveDate) AS EndDate
FROM MachinePositions)
SELECT MU.MachineName,
P.GeoPosition,
COUNT(*) AS Utilisations
FROM MachineUtilisation MU
JOIN Positions P ON MU.MachineName = P.MachineName
AND MU.UtilisationDateTime >= P.EffectiveDate AND MU.UtilisationDateTime < EndDate
GROUP BY MU.MachineName, P.GeoPosition
ORDER BY MachineName, P.GeoPosition DESC;
If you have any questions on how this works, please do ask.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
August 2, 2017 at 4:36 am
Thanks Thom!
I tried your approach on my actual data set and that seems to work great!
I see you went straight for creating an enddate in the CTE .. when I defined my tables, every time I have to deal with a transient thing (Slowly changing dimensions and the likes) I'm not sure which approach I prefer .. StartDate/EnDate (which I usually choose), or simply one column stating EffectiveDate ... I'm still unsure to be honest :p
August 2, 2017 at 5:02 am
bleroy - Wednesday, August 2, 2017 4:36 AMThanks Thom!
I tried your approach on my actual data set and that seems to work great!
I see you went straight for creating an enddate in the CTE .. when I defined my tables, every time I have to deal with a transient thing (Slowly changing dimensions and the likes) I'm not sure which approach I prefer .. StartDate/EnDate (which I usually choose), or simply one column stating EffectiveDate ... I'm still unsure to be honest :p
Generally I prefer having Start/End Dates as well. Writing queries is "easier" (not CTE and LEAD),but you do need to ensure that you have no "gaps" in your data. Not having the EndDate still makes it possible though. A quick test on your sample data, adding an EndDate column, showed that the two distinct columns does have a lower cost (for your reference):ALTER TABLE MachinePositions ADD EndDate datetime;
GO
UPDATE MachinePositions
SET EndDate = (SELECT TOP 1 sq.EffectiveDate
FROM MachinePositions sq
WHERE sq.MachineName = MachinePositions.MachineName
AND sq.EffectiveDate > MachinePositions.EffectiveDate)
GO
--Original Query
WITH Positions AS (
SELECT MachineName, GeoPosition, EffectiveDate,
LEAD(EffectiveDate,1, GETDATE()) OVER (PARTITION BY MachineName ORDER BY EffectiveDate) AS EndDate
FROM MachinePositions)
SELECT MU.MachineName,
P.GeoPosition,
COUNT(*) AS Utilisations
FROM MachineUtilisation MU
JOIN Positions P ON MU.MachineName = P.MachineName
AND MU.UtilisationDateTime >= P.EffectiveDate AND MU.UtilisationDateTime < EndDate
GROUP BY MU.MachineName, P.GeoPosition
ORDER BY MachineName, P.GeoPosition DESC;
GO
--Using new Column
SELECT MU.MachineName,
MP.GeoPosition,
COUNT(*) AS Utilisations
FROM MachineUtilisation MU
JOIN MachinePositions MP ON MU.MachineName = MP.MachineName
AND MU.UtilisationDateTime >= MP.EffectiveDate AND (MU.UtilisationDateTime < MP.EndDate OR MP.EndDate IS NULL)
GROUP BY MU.MachineName, MP.GeoPosition
ORDER BY MachineName, MP.GeoPosition DESC;
The SQL using LEAD came in consistently at 61% cost relative to the batch, where as the SQL with the new column only cost 39%. So a good 20% difference.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply