June 24, 2014 at 3:12 pm
a.guillaume (6/23/2014)
hi,To calculate the hours and lack of presence it takes 15 minutes
can you share the code you are running that takes 15 minutes?,,...having a better insight of what is currently taking so long will help to assist you.
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
June 25, 2014 at 1:57 pm
J Livingston SQL (6/24/2014)
a.guillaume (6/23/2014)
hi,To calculate the hours and lack of presence it takes 15 minutes
can you share the code you are running that takes 15 minutes?,,...having a better insight of what is currently taking so long will help to assist you.
personally I would prefer to have the start/end integers (minutes after midnight) as proper datetime datatypes
...not sure how you currently manage with night shifts over two days???....is this a problem for you?
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
June 26, 2014 at 6:47 am
hi
"No problem" for night shifts over two days
-> we record the hours per day
-> if someone works beyond a day, we split into 2 ranges presences (or absences)
we dispose of a table tminutes_per_day (iminute int, is_present tinyint)
insert into tminutes_per_day all the minutes of a day 0 to 1440
to calcule we use SQL like that
Truncate resource_stat_day
declare cursor for SELECT ID_resource
open;
for each resource
...
declare cursor for select day
for each day
...
EXEC calculate_resource @day, @id_resource, ...
@Nbr_minute_reel output,
@Nbr_minute_present output
insert into resource_stat_day (@day, @id_resource, @Nbr_minute_absence, @Nbr_minute_present, ...)
end for each day
end for
procedure calculate_resource
begin
update tminutes_per_day set is_present = 1 where is_present<>1;
SET @c_1 = CURSOR LOCAL FAST_FORWARD FOR
SELECT minute_start, minute_end
FROM Resource_INDISPO
WHERE id_resource = @id_resource
AND Day_indispo = @Day_calcul
OPEN @c_1;
FETCH @c_1 INTO @Min_deb, @min_fin;
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE tminutes_per_day
SET Is_present = 0
WHERE @Min_deb <= iminute
AND @Min_fin > iminute
AND Is_present <> 0;
FETCH @c_1 INTO @Min_deb, @min_fin;
END
CLOSE @c_1;
/* we obtain the presence */
SELECT @Nbr_minute_present = count(*) from tminutes_per_day
Where is_present = 1;
/* we now calculate the actual hours of attendance */
declare cursor SELECT Minute_start, a.Minute_end
FROM Resource_ABSENCE
WHERE id = @id and day = @Day
OPEN @c_1;
FETCH @c_1 INTO @Min_deb, @min_fin;
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE tminutes_per_day
SET Is_present = Is_present-1
WHERE @Min_deb <= iminute
AND @Min_fin > iminute;
FETCH @c_1 INTO @Min_deb, @min_fin;
END
CLOSE @c_1;
/* we obtain the reel presence */
SELECT @Nbr_minute_reel = count(*) from tminutes_per_day
Where is_present = 1;
end
It is a bit more complicated because of the rules given by customers but it looks like this
Regards
June 29, 2014 at 9:35 am
It is a bit more complicated because of the rules given by customers ..........
I think it would help us if you could explain the "rules" of your "application management planning " program please.....your sample data seems quite 'simple' and maybe doesn't fully describe all the possible combinations.....??
for example:
will records in Resource_ABSENCE ever be entirely within a period defined by Resource_INDISPO?
will the start or end minute records in Resource_ABSENCE exactly match the start or end minutes in Resource_INDISPO?
will the start or end minute records in Resource_ABSENCE ever overlap each other?...eg you have two 'starts' before an 'end' ...when sorted by user/date/minute_start
will start/end records in Resource_INDISPO ever overlap each other? ...ditto
working on your sample data and extrapolating to similar size tables that you say you have...then I believe significant time reduction can be gained
....but it "all depends" on your "rules" 🙂
if we know all the "rules" to begin with, then we can save us all a substantial amount of time in trying to provide you with a tried and tested solution.
Could you please provide another test setup script that covers all possibilities please?
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
June 29, 2014 at 2:33 pm
How about giving this a shot. You may also be able to run this against all ID_Resource entities at once instead of one at a time.
I'd try it in a test environment. You may need to make some mods if it isn't exactly what you need.
--Show data in sampe tables
select * from [dbo].[DAY_PRESENT];
select * from [dbo].[Resource_INDISPO];
select * from [dbo].[Resource_ABSENCE];
-- Drop Sessions2 table if it exists
if object_id('dbo.Sessions2') is not null
drop table dbo.Sessions2;
-- Create the Sessions2 table
CREATE TABLE dbo.Sessions2
(
id INT NOT NULL IDENTITY(1, 1),
ID_Resource INT NOT NULL,
Day_indispo DATE NOT NULL,
MINUTE_Start INT NOT NULL,
MINUTE_End INT NOT NULL,
CONSTRAINT PK_Sessions PRIMARY KEY(id),
CONSTRAINT CHK_endtime_gteq_starttime
CHECK (MINUTE_End >= MINUTE_Start)
);
CREATE UNIQUE INDEX idx_user_start_id ON dbo.Sessions2(ID_Resource, Day_indispo, MINUTE_Start, id);
CREATE UNIQUE INDEX idx_user_end_id ON dbo.Sessions2(ID_Resource, Day_indispo, MINUTE_End, id);
insert into dbo.Sessions2(ID_Resource,Day_indispo,MINUTE_Start,MINUTE_End)
select
ri.ID_Resource,
ri.Day_indispo,
ri.MINUTE_Start,
ri.MINUTE_End
from
[dbo].[Resource_INDISPO] ri
union all
select
ra.CODE_Resource,
ra.Day_ABSENCE,
ra.MINUTE_start,
ra.MINUTE_end
from
[dbo].[Resource_ABSENCE] ra
order by
ID_Resource,
Day_indispo,
MINUTE_Start,
MINUTE_End;
/*
The following code is modified from code written by Itzek Ben-Gan and found in the article located here:
http://sqlmag.com/blog/solutions-packing-date-and-time-intervals-puzzle
*/
-- indexes
/*
CREATE UNIQUE INDEX idx_user_start_id ON dbo.Sessions(username, starttime, id);
CREATE UNIQUE INDEX idx_user_end_id ON dbo.Sessions(username, endtime, id);
*/
WITH C1 AS
-- let e = end ordinals, let s = start ordinals
(
SELECT
s2.id,
s2.ID_Resource,
s2.Day_indispo,
s2.MINUTE_Start AS ts,
+1 AS type,
NULL AS e,
ROW_NUMBER() OVER(PARTITION BY s2.ID_Resource,s2.Day_indispo ORDER BY s2.MINUTE_Start, s2.id) AS s
FROM
dbo.Sessions2 s2
inner join dbo.DAY_PRESENT dp
on (dp.ID_Resource = s2.ID_Resource and
dp.DAY_Present = s2.Day_indispo)
UNION ALL
SELECT
s2.id,
s2.ID_Resource,
s2.Day_indispo,
s2.MINUTE_End AS ts,
-1 AS type,
ROW_NUMBER() OVER(PARTITION BY s2.ID_Resource, s2.Day_indispo ORDER BY s2.MINUTE_End, id) AS e,
NULL AS s
FROM
dbo.Sessions2 s2
inner join dbo.DAY_PRESENT dp
on (dp.ID_Resource = s2.ID_Resource and
dp.DAY_Present = s2.Day_indispo)
),
C2 AS
-- let se = start or end ordinal, namely, how many events (start or end) happened so far
(
SELECT C1.*, ROW_NUMBER() OVER(PARTITION BY ID_Resource,Day_indispo ORDER BY ts, type DESC, id) AS se
FROM C1
),
C3 AS
-- For start events, the expression s - (se - s) - 1 represents how many sessions were active
-- just before the current (hence - 1)
--
-- For end events, the expression (se - e) - e represents how many sessions are active
-- right after this one
--
-- The above two expressions are 0 exactly when a group of packed intervals
-- either starts or ends, respectively
--
-- After filtering only events when a group of packed intervals either starts or ends,
-- group each pair of adjacent start/end events
(
SELECT
ID_Resource,
Day_indispo,
ts,
FLOOR((ROW_NUMBER() OVER(PARTITION BY ID_Resource,Day_indispo ORDER BY ts) - 1) / 2 + 1) AS grpnum
FROM
C2
WHERE
COALESCE(s - (se - s) - 1, (se - e) - e) = 0
), basedata as (
SELECT
ID_Resource,
Day_indispo,
MIN(ts) AS starttime,
max(ts) AS endtime
FROM
C3
GROUP BY
ID_Resource,
Day_indispo,
grpnum
) -- select * from basedata -- Uncomment run from here up the WITH (CTE declaration) to see the independent time intervals
select
bd.ID_Resource,
bd.Day_indispo,
1440 - sum(endtime - starttime)
from
basedata bd
group by
bd.ID_Resource,
bd.Day_indispo
order by
bd.ID_Resource,
bd.Day_indispo;
July 1, 2014 at 2:45 pm
Lynn Pettis (6/29/2014)
How about giving this a shot. You may also be able to run this against all ID_Resource entities at once instead of one at a time.I'd try it in a test environment. You may need to make some mods if it isn't exactly what you need.
--Show data in sampe tables
select * from [dbo].[DAY_PRESENT];
select * from [dbo].[Resource_INDISPO];
select * from [dbo].[Resource_ABSENCE];
-- Drop Sessions2 table if it exists
if object_id('dbo.Sessions2') is not null
drop table dbo.Sessions2;
-- Create the Sessions2 table
CREATE TABLE dbo.Sessions2
(
id INT NOT NULL IDENTITY(1, 1),
ID_Resource INT NOT NULL,
Day_indispo DATE NOT NULL,
MINUTE_Start INT NOT NULL,
MINUTE_End INT NOT NULL,
CONSTRAINT PK_Sessions PRIMARY KEY(id),
CONSTRAINT CHK_endtime_gteq_starttime
CHECK (MINUTE_End >= MINUTE_Start)
);
CREATE UNIQUE INDEX idx_user_start_id ON dbo.Sessions2(ID_Resource, Day_indispo, MINUTE_Start, id);
CREATE UNIQUE INDEX idx_user_end_id ON dbo.Sessions2(ID_Resource, Day_indispo, MINUTE_End, id);
insert into dbo.Sessions2(ID_Resource,Day_indispo,MINUTE_Start,MINUTE_End)
select
ri.ID_Resource,
ri.Day_indispo,
ri.MINUTE_Start,
ri.MINUTE_End
from
[dbo].[Resource_INDISPO] ri
union all
select
ra.CODE_Resource,
ra.Day_ABSENCE,
ra.MINUTE_start,
ra.MINUTE_end
from
[dbo].[Resource_ABSENCE] ra
order by
ID_Resource,
Day_indispo,
MINUTE_Start,
MINUTE_End;
/*
The following code is modified from code written by Itzek Ben-Gan and found in the article located here:
http://sqlmag.com/blog/solutions-packing-date-and-time-intervals-puzzle
*/
-- indexes
/*
CREATE UNIQUE INDEX idx_user_start_id ON dbo.Sessions(username, starttime, id);
CREATE UNIQUE INDEX idx_user_end_id ON dbo.Sessions(username, endtime, id);
*/
WITH C1 AS
-- let e = end ordinals, let s = start ordinals
(
SELECT
s2.id,
s2.ID_Resource,
s2.Day_indispo,
s2.MINUTE_Start AS ts,
+1 AS type,
NULL AS e,
ROW_NUMBER() OVER(PARTITION BY s2.ID_Resource,s2.Day_indispo ORDER BY s2.MINUTE_Start, s2.id) AS s
FROM
dbo.Sessions2 s2
inner join dbo.DAY_PRESENT dp
on (dp.ID_Resource = s2.ID_Resource and
dp.DAY_Present = s2.Day_indispo)
UNION ALL
SELECT
s2.id,
s2.ID_Resource,
s2.Day_indispo,
s2.MINUTE_End AS ts,
-1 AS type,
ROW_NUMBER() OVER(PARTITION BY s2.ID_Resource, s2.Day_indispo ORDER BY s2.MINUTE_End, id) AS e,
NULL AS s
FROM
dbo.Sessions2 s2
inner join dbo.DAY_PRESENT dp
on (dp.ID_Resource = s2.ID_Resource and
dp.DAY_Present = s2.Day_indispo)
),
C2 AS
-- let se = start or end ordinal, namely, how many events (start or end) happened so far
(
SELECT C1.*, ROW_NUMBER() OVER(PARTITION BY ID_Resource,Day_indispo ORDER BY ts, type DESC, id) AS se
FROM C1
),
C3 AS
-- For start events, the expression s - (se - s) - 1 represents how many sessions were active
-- just before the current (hence - 1)
--
-- For end events, the expression (se - e) - e represents how many sessions are active
-- right after this one
--
-- The above two expressions are 0 exactly when a group of packed intervals
-- either starts or ends, respectively
--
-- After filtering only events when a group of packed intervals either starts or ends,
-- group each pair of adjacent start/end events
(
SELECT
ID_Resource,
Day_indispo,
ts,
FLOOR((ROW_NUMBER() OVER(PARTITION BY ID_Resource,Day_indispo ORDER BY ts) - 1) / 2 + 1) AS grpnum
FROM
C2
WHERE
COALESCE(s - (se - s) - 1, (se - e) - e) = 0
), basedata as (
SELECT
ID_Resource,
Day_indispo,
MIN(ts) AS starttime,
max(ts) AS endtime
FROM
C3
GROUP BY
ID_Resource,
Day_indispo,
grpnum
) -- select * from basedata -- Uncomment run from here up the WITH (CTE declaration) to see the independent time intervals
select
bd.ID_Resource,
bd.Day_indispo,
1440 - sum(endtime - starttime)
from
basedata bd
group by
bd.ID_Resource,
bd.Day_indispo
order by
bd.ID_Resource,
bd.Day_indispo;
Nice code Lynn 🙂
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
July 1, 2014 at 3:27 pm
based on Lynn's previous code...I think the performance can be improved ... comments in the code
test script:...about 20secs and hopefully provides a test sample that imitates your real world
IF OBJECT_ID('tempdb..DAY_PRESENT', 'U') IS NOT NULL DROP TABLE tempdb..DAY_PRESENT;
IF OBJECT_ID('tempdb..Resource_INDISPO', 'U') IS NOT NULLDROP TABLE tempdb..Resource_INDISPO;
IF OBJECT_ID('tempdb..Resource_ABSENCE', 'U') IS NOT NULLDROP TABLE tempdb..Resource_ABSENCE;
IF OBJECT_ID('tempdb..Tally', 'U') IS NOT NULL DROP TABLE tempdb..Tally;
CREATE TABLE Tally (N INT CONSTRAINT NPK PRIMARY KEY CLUSTERED (N));
WITH
L0 AS(SELECT 1 AS C UNION ALL SELECT 1 AS O), -- 2 rows
L1 AS(SELECT 1 AS C FROM L0 AS A CROSS JOIN L0 AS B), -- 4 rows
L2 AS(SELECT 1 AS C FROM L1 AS A CROSS JOIN L1 AS B), -- 16 rows
L3 AS(SELECT 1 AS C FROM L2 AS A CROSS JOIN L2 AS B), -- 256 rows
L4 AS(SELECT 1 AS C FROM L3 AS A CROSS JOIN L3 AS B), -- 65,536 rows
--L5 AS(SELECT 1 AS C FROM L4 AS A CROSS JOIN L4 AS B), -- 4,294,967,296 rows
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS N FROM L4)
INSERT INTO Tally SELECT TOP 10000 N FROM Nums ORDER BY N;
CREATE TABLE DAY_PRESENT (
ID_Resource INT NOT NULL
, DAY_Present SMALLDATETIME NOT NULL
);
CREATE TABLE Resource_INDISPO (
id_resource_indispo int identity (1,1) not null
, ID_Resource INT NOT NULL
, Day_indispo SMALLDATETIME NOT NULL
, MINUTE_Start INT NOT NULL
, MINUTE_End INT NOT NULL
);
CREATE TABLE Resource_ABSENCE (
ID_Resource_ABSENCE int IDENTITY(1,1) NOT NULL
,CODE_Resource INT NOT NULL
, Day_ABSENCE SMALLDATETIME NOT NULL
, MINUTE_start INT NULL
, MINUTE_end INT NULL
);
DECLARE @Date_Start AS DATETIME
DECLARE @Date_End AS DATETIME
SET @Date_Start = '2014-01-01'
SET @Date_End = '2014-12-31';
with c1 as ( select top 1400 N from tally)
,
c2 as (
SELECT dateadd(day, t.n-1, @Date_Start) thedate
FROM Tally t
WHERE dateadd(day, t.n-1, @Date_Start) <= @Date_End)
INSERT INTO [dbo].[DAY_PRESENT]
([ID_Resource],[DAY_Present])
SELECT c1.n, c2.thedate
FROM c1 CROSS JOIN c2
;
IF OBJECT_ID('tempdb..IND_setup', 'U') IS NOT NULL
DROP TABLE tempdb..IND_setup;
CREATE TABLE [dbo].[IND_setup](
[Minute_start] [int] NULL,
[Minute_end] [int] NULL
)
INSERT [dbo].[IND_setup] ([Minute_start], [Minute_end]) VALUES (0, 495)
INSERT [dbo].[IND_setup] ([Minute_start], [Minute_end]) VALUES (720, 780)
--INSERT [dbo].[IND_setup] ([Minute_start], [Minute_end]) VALUES (900, 915)
INSERT [dbo].[IND_setup] ([Minute_start], [Minute_end]) VALUES (1005, 1440)
INSERT INTO [dbo].[Resource_INDISPO]
([ID_Resource]
,[Day_indispo]
,[MINUTE_Start]
,[MINUTE_End])
SELECT DAY_PRESENT.ID_Resource, DAY_PRESENT.DAY_Present, IND_setup.Minute_start, IND_setup.Minute_end
FROM DAY_PRESENT CROSS JOIN IND_setup
INSERT INTO [dbo].[Resource_ABSENCE]
(
[CODE_Resource]
,[Day_ABSENCE]
,[MINUTE_start]
)
SELECT
ID_Resource
, Day_indispo
,MINUTE_Start = CAST(Abs(Checksum(Newid()) % 1005 ) AS INT)
FROM Resource_INDISPO
WHERE id_resource_indispo % 8 = 0
UPDATE Resource_ABSENCE
SET MINUTE_end = MINUTE_start + CAST(ABS(Checksum(NEWID()) % 435) AS INT)
UPDATE Resource_ABSENCE
SET MINUTE_start = 0, MINUTE_end = 1440
WHERE (ID_Resource_ABSENCE % 21 = 0)
CREATE UNIQUE INDEX [UIX_DP_JLS] ON [dbo].[DAY_PRESENT]
([ID_Resource] ASC,
[DAY_Present] ASC
)
CREATE UNIQUE INDEX [UIX_RI_JLS] ON [dbo].[Resource_INDISPO]
(
[ID_Resource] ASC,
[Day_indispo] ASC,
[MINUTE_Start] ASC,
[MINUTE_End] ASC
)
CREATE INDEX [IX_RA_JLS] ON [dbo].[Resource_ABSENCE]
(
[CODE_Resource] ASC,
[Day_ABSENCE] ASC,
[MINUTE_start] ASC,
[MINUTE_end] ASC
)
SELECT COUNT(*) AS DAY_PRESENT_cnt FROM DAY_PRESENT
SELECT COUNT(*) AS Resource_INDISPO_cnt FROM Resource_INDISPO
SELECT COUNT(*) AS Resource_ABSENCEcnt FROM Resource_ABSENCE
SELECT MIN(MINUTE_start) AS minstart
, MAX(MINUTE_start) AS maxstart
, MIN(MINUTE_end) AS minend
, MAX(MINUTE_end) AS maxend
, MIN(MINUTE_end - MINUTE_start) mindiff
, MAX(MINUTE_end - MINUTE_start) maxdiff
FROM Resource_ABSENCE
revised script.....based on Lynn's code:
--SELECT COUNT(*) AS cnt FROM DAY_PRESENT
--SELECT COUNT(*) AS cnt FROM Resource_INDISPO
--SELECT COUNT(*) AS cnt FROM Resource_ABSENCE
IF OBJECT_ID('tempdb..results_jls', 'U') IS NOT NULL DROP TABLE tempdb..results_jls;
IF OBJECT_ID('tempdb..Day_theo', 'U') IS NOT NULL DROP TABLE tempdb..Day_theo ;
SELECT ri.ID_Resource AS userid
, ri.Day_indispo AS thedate
, 1440 - SUM(ri.MINUTE_End - ri.MINUTE_Start) AS Theo_Hrs
INTO Day_theo
FROM Resource_INDISPO AS ri
INNER JOIN DAY_PRESENT AS dp
ON ri.ID_Resource = dp.ID_Resource AND ri.Day_indispo = dp.DAY_Present
GROUP BY ri.ID_Resource
, ri.Day_indispo
/*The actual working hours will be the same as the theoretical hours unless there are records in Resource_ABSENCE
By only working with records that have absence this will reduce the number of rows we have to deal with
*/
IF OBJECT_ID('tempdb..Day_absent', 'U') IS NOT NULL DROP TABLE tempdb..Day_absent ;
SELECT dp.ID_Resource AS userid
, dp.DAY_Present AS thedate
INTO Day_absent
FROM DAY_PRESENT AS dp
INNER JOIN Resource_ABSENCE AS ra
ON dp.ID_Resource = ra.CODE_Resource AND dp.DAY_Present = ra.Day_ABSENCE
GROUP BY dp.ID_Resource
, dp.DAY_Present
CREATE UNIQUE CLUSTERED INDEX [CIX_DA_JLS] ON [dbo].[Day_absent]
([userid] ASC,[thedate] ASC)
/*following is shamelessly based on Lynn Petis's excellent code for this problem*/
;with sessions as (
SELECT ri.ID_Resource as userid
, ri.Day_indispo as thedate
, ri.MINUTE_Start
, ri.MINUTE_End
FROM Resource_INDISPO ri
INNER JOIN Day_absent da
ON ri.ID_Resource = da.userid AND ri.Day_indispo = da.thedate
UNION ALL
SELECT ra.CODE_Resource
, ra.Day_ABSENCE
, ra.MINUTE_start
, ra.MINUTE_end
FROM Resource_ABSENCE AS ra
INNER JOIN Day_absent da
ON ra.CODE_Resource = da.userid AND ra.Day_ABSENCE = da.thedate
)
, sessions2 as (
SELECT *
, ROW_NUMBER() OVER (ORDER BY userid, thedate, MINUTE_Start, MINUTE_End) id
FROM sessions
)
/*
The following code is modified from code written by Itzek Ben-Gan and found in the article located here:
http://sqlmag.com/blog/solutions-packing-date-and-time-intervals-puzzle
*/
, C1 AS
-- let e = end ordinals, let s = start ordinals
(
SELECT
s2.id,
s2.userid,
s2.thedate,
s2.MINUTE_Start AS ts,
+1 AS type,
NULL AS e,
ROW_NUMBER() OVER(PARTITION BY s2.userid,s2.thedate ORDER BY s2.userid,s2.thedate,s2.MINUTE_Start) AS s
FROM
Sessions2 s2
UNION ALL
SELECT
s2.id,
s2.userid,
s2.thedate,
s2.MINUTE_End AS ts,
-1 AS type,
ROW_NUMBER() OVER(PARTITION BY s2.userid,s2.thedate ORDER BY s2.userid,s2.thedate, s2.MINUTE_End) AS e,
NULL AS s
FROM
Sessions2 s2
),
C2 AS
-- let se = start or end ordinal, namely, how many events (start or end) happened so far
(
SELECT C1.*
, ROW_NUMBER() OVER (PARTITION BY userid, thedate ORDER BY ts, type DESC, id) AS se
FROM C1
),
C3 AS
-- For start events, the expression s - (se - s) - 1 represents how many sessions were active
-- just before the current (hence - 1)
--
-- For end events, the expression (se - e) - e represents how many sessions are active
-- right after this one
--
-- The above two expressions are 0 exactly when a group of packed intervals
-- either starts or ends, respectively
--
-- After filtering only events when a group of packed intervals either starts or ends,
-- group each pair of adjacent start/end events
(
SELECT userid
, thedate
, ts
, FLOOR((ROW_NUMBER() OVER (PARTITION BY userid,thedate ORDER BY ts) - 1) / 2 + 1) AS grpnum
FROM C2
WHERE COALESCE(s - (se - s) - 1, (se - e) - e) = 0
)
, basedata as (
SELECT userid
, thedate
, MIN(ts) AS starttime
, max(ts) AS endtime
FROM C3
GROUP BY userid
, thedate
, grpnum
)
, basedate_sum as (
SELECT bd.userid
, bd.thedate
, 1440 - sum(endtime - starttime) bd
FROM basedata bd
GROUP BY bd.userid
, bd.thedate
)
/*put it back together to get RESULTS*/
SELECT t.userid
, t.thedate
, t.Theo_Hrs
, CASE WHEN bs.bd IS NULL THEN t.Theo_Hrs ELSE bs.bd END as TheoLessAbs
into results_jls
FROM Day_theo t
LEFT JOIN basedate_sum bs
ON t.userid = bs.userid AND t.thedate = bs.thedate
--SELECT userid, convert(date,thedate), Theo_Hrs, TheoLessAbs
--FROM results_jls
----WHERE
--ORDER BY userid, thedate
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
July 1, 2014 at 3:32 pm
J Livingston SQL (7/1/2014)
Lynn Pettis (6/29/2014)
How about giving this a shot. You may also be able to run this against all ID_Resource entities at once instead of one at a time.I'd try it in a test environment. You may need to make some mods if it isn't exactly what you need.
--Show data in sampe tables
select * from [dbo].[DAY_PRESENT];
select * from [dbo].[Resource_INDISPO];
select * from [dbo].[Resource_ABSENCE];
-- Drop Sessions2 table if it exists
if object_id('dbo.Sessions2') is not null
drop table dbo.Sessions2;
-- Create the Sessions2 table
CREATE TABLE dbo.Sessions2
(
id INT NOT NULL IDENTITY(1, 1),
ID_Resource INT NOT NULL,
Day_indispo DATE NOT NULL,
MINUTE_Start INT NOT NULL,
MINUTE_End INT NOT NULL,
CONSTRAINT PK_Sessions PRIMARY KEY(id),
CONSTRAINT CHK_endtime_gteq_starttime
CHECK (MINUTE_End >= MINUTE_Start)
);
CREATE UNIQUE INDEX idx_user_start_id ON dbo.Sessions2(ID_Resource, Day_indispo, MINUTE_Start, id);
CREATE UNIQUE INDEX idx_user_end_id ON dbo.Sessions2(ID_Resource, Day_indispo, MINUTE_End, id);
insert into dbo.Sessions2(ID_Resource,Day_indispo,MINUTE_Start,MINUTE_End)
select
ri.ID_Resource,
ri.Day_indispo,
ri.MINUTE_Start,
ri.MINUTE_End
from
[dbo].[Resource_INDISPO] ri
union all
select
ra.CODE_Resource,
ra.Day_ABSENCE,
ra.MINUTE_start,
ra.MINUTE_end
from
[dbo].[Resource_ABSENCE] ra
order by
ID_Resource,
Day_indispo,
MINUTE_Start,
MINUTE_End;
/*
The following code is modified from code written by Itzek Ben-Gan and found in the article located here:
http://sqlmag.com/blog/solutions-packing-date-and-time-intervals-puzzle
*/
-- indexes
/*
CREATE UNIQUE INDEX idx_user_start_id ON dbo.Sessions(username, starttime, id);
CREATE UNIQUE INDEX idx_user_end_id ON dbo.Sessions(username, endtime, id);
*/
WITH C1 AS
-- let e = end ordinals, let s = start ordinals
(
SELECT
s2.id,
s2.ID_Resource,
s2.Day_indispo,
s2.MINUTE_Start AS ts,
+1 AS type,
NULL AS e,
ROW_NUMBER() OVER(PARTITION BY s2.ID_Resource,s2.Day_indispo ORDER BY s2.MINUTE_Start, s2.id) AS s
FROM
dbo.Sessions2 s2
inner join dbo.DAY_PRESENT dp
on (dp.ID_Resource = s2.ID_Resource and
dp.DAY_Present = s2.Day_indispo)
UNION ALL
SELECT
s2.id,
s2.ID_Resource,
s2.Day_indispo,
s2.MINUTE_End AS ts,
-1 AS type,
ROW_NUMBER() OVER(PARTITION BY s2.ID_Resource, s2.Day_indispo ORDER BY s2.MINUTE_End, id) AS e,
NULL AS s
FROM
dbo.Sessions2 s2
inner join dbo.DAY_PRESENT dp
on (dp.ID_Resource = s2.ID_Resource and
dp.DAY_Present = s2.Day_indispo)
),
C2 AS
-- let se = start or end ordinal, namely, how many events (start or end) happened so far
(
SELECT C1.*, ROW_NUMBER() OVER(PARTITION BY ID_Resource,Day_indispo ORDER BY ts, type DESC, id) AS se
FROM C1
),
C3 AS
-- For start events, the expression s - (se - s) - 1 represents how many sessions were active
-- just before the current (hence - 1)
--
-- For end events, the expression (se - e) - e represents how many sessions are active
-- right after this one
--
-- The above two expressions are 0 exactly when a group of packed intervals
-- either starts or ends, respectively
--
-- After filtering only events when a group of packed intervals either starts or ends,
-- group each pair of adjacent start/end events
(
SELECT
ID_Resource,
Day_indispo,
ts,
FLOOR((ROW_NUMBER() OVER(PARTITION BY ID_Resource,Day_indispo ORDER BY ts) - 1) / 2 + 1) AS grpnum
FROM
C2
WHERE
COALESCE(s - (se - s) - 1, (se - e) - e) = 0
), basedata as (
SELECT
ID_Resource,
Day_indispo,
MIN(ts) AS starttime,
max(ts) AS endtime
FROM
C3
GROUP BY
ID_Resource,
Day_indispo,
grpnum
) -- select * from basedata -- Uncomment run from here up the WITH (CTE declaration) to see the independent time intervals
select
bd.ID_Resource,
bd.Day_indispo,
1440 - sum(endtime - starttime)
from
basedata bd
group by
bd.ID_Resource,
bd.Day_indispo
order by
bd.ID_Resource,
bd.Day_indispo;
Nice code Lynn 🙂
Thanks, but the credit goes to Itzek Ben-Gan as it is his code at heart. All I did was modify it for this situation.
I did take the OPs sample data and duplicate it 10,000 times increasing the ID_Resource or CODE_Resource values to generate additional sample data (10,001 unique ID_Resource values). Ran the code to populate the Sessions2 table (can't remember how long that ran) then ran the code that did the actual work and including displaying the result it ran in 13 to 16 seconds (ran it several times) and returned 550,055 rows of data.
July 1, 2014 at 8:06 pm
Took a queue from J Livingston SQL's code and rewrote mine. I think it is comparable to his, but compared to mine, about 4 times faster, and reads down from 3,522,342 to 52,143. It is a bit higher on CPU.
-- Created the following indexes on the source tables for the data:
--CREATE UNIQUE INDEX idx_INDISPO_start_id ON dbo.Resource_INDISPO(ID_Resource, Day_indispo, MINUTE_Start, [id_resource_indispo]);
--CREATE UNIQUE INDEX idx_INDISPO_end_id ON dbo.Resource_INDISPO(ID_Resource, Day_indispo, MINUTE_End, [id_resource_indispo]);
--CREATE UNIQUE INDEX idx_ABSENCE_start_id ON dbo.Resource_ABSENCE(CODE_Resource, Day_ABSENCE, MINUTE_start, [ID_Resource_ABSENCE]);
--CREATE UNIQUE INDEX idx_ABSENCE_end_id ON dbo.Resource_ABSENCE(CODE_Resource, Day_ABSENCE, MINUTE_end, [ID_Resource_ABSENCE]);
/*
The following code is modified from code written by Itzek Ben-Gan and found in the article located here:
http://sqlmag.com/blog/solutions-packing-date-and-time-intervals-puzzle
*/
-- indexes
/*
CREATE UNIQUE INDEX idx_user_start_id ON dbo.Sessions(username, starttime, id);
CREATE UNIQUE INDEX idx_user_end_id ON dbo.Sessions(username, endtime, id);
*/
WITH Sessions2a as (
select
ri.id_resource_indispo,
ri.ID_Resource,
ri.Day_indispo,
ri.MINUTE_Start,
ri.MINUTE_End
from
[dbo].[Resource_INDISPO] ri
where
--ri.ID_Resource = 9662 and
exists(select 1 from [dbo].[Resource_ABSENCE] ra where ra.CODE_Resource = ri.ID_Resource and ra.Day_ABSENCE = ri.Day_indispo)
union all
select
ra.ID_Resource_ABSENCE,
ra.CODE_Resource,
ra.Day_ABSENCE,
ra.MINUTE_start,
ra.MINUTE_end
from
[dbo].[Resource_ABSENCE] ra
--where
-- CODE_Resource = 9662
), C1 AS
-- let e = end ordinals, let s = start ordinals
(
SELECT
s2.id_resource_indispo,
s2.ID_Resource,
s2.Day_indispo,
s2.MINUTE_Start AS ts,
+1 AS type,
NULL AS e,
ROW_NUMBER() OVER(PARTITION BY s2.ID_Resource,s2.Day_indispo ORDER BY s2.MINUTE_Start, s2.id_resource_indispo) AS s
FROM
Sessions2a s2
inner join dbo.DAY_PRESENT dp
on (dp.ID_Resource = s2.ID_Resource and
dp.DAY_Present = s2.Day_indispo)
UNION ALL
SELECT
s2.id_resource_indispo,
s2.ID_Resource,
s2.Day_indispo,
s2.MINUTE_End AS ts,
-1 AS type,
ROW_NUMBER() OVER(PARTITION BY s2.ID_Resource, s2.Day_indispo ORDER BY s2.MINUTE_End, id_resource_indispo) AS e,
NULL AS s
FROM
Sessions2a s2
inner join dbo.DAY_PRESENT dp
on (dp.ID_Resource = s2.ID_Resource and
dp.DAY_Present = s2.Day_indispo)
),
C2 AS
-- let se = start or end ordinal, namely, how many events (start or end) happened so far
(
SELECT C1.*, ROW_NUMBER() OVER(PARTITION BY ID_Resource,Day_indispo ORDER BY ts, type DESC, id_resource_indispo) AS se
FROM C1
),
C3 AS
-- For start events, the expression s - (se - s) - 1 represents how many sessions were active
-- just before the current (hence - 1)
--
-- For end events, the expression (se - e) - e represents how many sessions are active
-- right after this one
--
-- The above two expressions are 0 exactly when a group of packed intervals
-- either starts or ends, respectively
--
-- After filtering only events when a group of packed intervals either starts or ends,
-- group each pair of adjacent start/end events
(
SELECT
ID_Resource,
Day_indispo,
ts,
FLOOR((ROW_NUMBER() OVER(PARTITION BY ID_Resource,Day_indispo ORDER BY ts) - 1) / 2 + 1) AS grpnum
FROM
C2
WHERE
COALESCE(s - (se - s) - 1, (se - e) - e) = 0
), basedata as (
SELECT
ID_Resource,
Day_indispo,
MIN(ts) AS starttime,
max(ts) AS endtime
FROM
C3
GROUP BY
ID_Resource,
Day_indispo,
grpnum
) -- select * from basedata order by ID_Resource, Day_indispo, starttime-- Uncomment run from here up the WITH (CTE declaration) to see the independent time intervals
select
bd.ID_Resource,
bd.Day_indispo,
1440 - sum(endtime - starttime) ActualTime
from
basedata bd
group by
bd.ID_Resource,
bd.Day_indispo
--order by
-- bd.ID_Resource,
-- bd.Day_indispo;
union all
select
ri.ID_Resource,
ri.Day_indispo,
1440 - sum(ri.MINUTE_End - ri.MINUTE_Start)
from
[dbo].[Resource_INDISPO] ri
inner join [dbo].[DAY_PRESENT] dp
on (dp.ID_Resource = ri.ID_Resource and dp.DAY_Present = ri.Day_indispo)
where
not exists(select 1 from [dbo].[Resource_ABSENCE] ra where ri.ID_Resource = ra.CODE_Resource and ri.Day_indispo = ra.Day_ABSENCE)
group by
ri.ID_Resource,
ri.Day_indispo
order by
bd.ID_Resource,
bd.Day_indispo;
July 3, 2014 at 6:31 am
Hi,
Thank you for all your answers and proposed solutions
I applied that Lynn
With my method it takes 15 minutes and 12 minutes with the Lynn solution
I think I'll have to look again but by reviewing the entire process
Regards
Arno
July 3, 2014 at 6:39 am
Arno Ho (7/3/2014)
Hi,Thank you for all your answers and proposed solutions
I applied that Lynn
With my method it takes 15 minutes and 12 minutes with the Lynn solution
I think I'll have to look again but by reviewing the entire process
Regards
Arno
bit confused ......on the test set up script I provided ...both Lynn's code and mine ran substantially faster than what you are describing.
have you tried your code against the test script so that we can compare directly?
if the test script is not representative of your data...please advise and I will try and come up with something that is more your real world.
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
July 3, 2014 at 8:30 am
J Livingston SQL (7/3/2014)
Arno Ho (7/3/2014)
Hi,Thank you for all your answers and proposed solutions
I applied that Lynn
With my method it takes 15 minutes and 12 minutes with the Lynn solution
I think I'll have to look again but by reviewing the entire process
Regards
Arno
bit confused ......on the test set up script I provided ...both Lynn's code and mine ran substantially faster than what you are describing.
have you tried your code against the test script so that we can compare directly?
if the test script is not representative of your data...please advise and I will try and come up with something that is more your real world.
I agree. With 3,250,325 rows in the main table mine ran in about 3 seconds.
July 3, 2014 at 8:42 am
.[/quote]
I agree. With 3,250,325 rows in the main table mine ran in about 3 seconds.
[/quote]
what spec hardware you running Lynn.....?
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
July 3, 2014 at 10:43 am
J Livingston SQL (7/3/2014)
.
I agree. With 3,250,325 rows in the main table mine ran in about 3 seconds.
[/quote]
what spec hardware you running Lynn.....?[/quote]
Dell M4600 laptop, dual CORE i7 with 4 core each, 8GB RAM total, 2GB max setting for SQL Server 2012. Someone told me it has an SSD drive but I'm not too sure about that. Haven't dug into it as it is my work laptop.
July 3, 2014 at 1:02 pm
Lynn Pettis (7/3/2014)
J Livingston SQL (7/3/2014)
.I agree. With 3,250,325 rows in the main table mine ran in about 3 seconds.
what spec hardware you running Lynn.....?[/quote]
Dell M4600 laptop, dual CORE i7 with 4 core each, 8GB RAM total, 2GB max setting for SQL Server 2012. Someone told me it has an SSD drive but I'm not too sure about that. Haven't dug into it as it is my work laptop.
[/quote]
hehe...that explains why my single 4 core I5 and 4gb Ram (32bit os) wasn't achieving 3 secs like you...best I could get was around 7/8 secs 😛
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
Viewing 15 posts - 16 through 30 (of 35 total)
You must be logged in to reply to this topic. Login to reply