June 14, 2017 at 8:23 pm
I have a source that contains rows for applications that have experienced issues aka 'Alerts', their issuestarttimes, and resolutiontimes. I made a bar graph conveying outage trends when scrolled left/right. IE. If application had an open issue, bar graph captures the hours of day that the application was down and allocates hours to the day on which the outage occurred. Thus the visual will convey a trend of outages when scrolled left right, ie. denser for applications always experiencing issues. (By putting SUM(HoursDown) on the values for bar graph, the bars will rise between 0 - 24. If zero, then no bar on the date means no outage that day)
I made two solutions to create the report dataset needed for the visual. In first, I run the query for applications, one month at a time. In the second I run the query once per application...for the whole year each time. Second solution takes very long against real life dataset having hundreds of applications and data.
I run the below 1x each for (1) April 1 - Apr 30, (2) May 1 - May 31, and (3)Jun 1 - Jun 30. Though it isn't needed in sample dataset, in real life the datasets are very large and I would not be able to process 1 year's worth of data at a time. I like solution 1 but if run 3 different times for each months, records on the last day of every month are lost. If run for all 3 months at same time, no records are lost.
USE Sandbox
GO
DECLARE
@FromDate AS DATETIME,
@ToDate AS DATETIME
--change date range and rerun query until all months of year are processed
SET @FromDate = '2017-06-01 00:00:00.000'
SET @ToDate = '2017-06-30 00:00:00.000'
INSERT INTO OT
select distinct
Tool
, right('0'+ DATENAME(dd, cte_start_date),2) + '-' + LEFT(DATENAME(MONTH, cte_start_date),3) + '-' + DATENAME(YEAR, cte_start_date)DayMonth
, DATENAME(MONTH, cte_start_date) + '-' + DATENAME(YEAR, cte_start_date) MonthYear
, DATENAME(YEAR, cte_start_date) Year
, cte_start_date
, HoursDown
--INTO OT
from
(
SELECT distinct
Apps.[AppName] as Tool
, CrsApp.cte_start_date
, MAX(
CASE
WHEN CrsApp.cte_start_date >= A.[issuestarttime] AND CrsApp.cte_start_date <= [resolutiontime] THEN 1
ELSE 0
END ) as HoursDown
FROM
alerts A
JOIN [dbo].[Alert_Application] Appset on A.alertid = Appset.alertid
JOIN [dbo].[Applications] Apps on Appset.applicationid = Apps.applicationid
OUTER APPLY (SELECT t.cte_start_date FROM Time t
WHERE cte_start_date >= @FromDate AND cte_start_date <= @ToDate) CrsApp
GROUP BY
Apps.[AppName], CrsApp.cte_start_date
)t
Solution 2, run 1x per application, will grow the correct dataset since I am providing whole year datarange, but I want to avoid this, as it requires a cursor and protracts the ETL.
USE Sandbox
GO
DECLARE
@FromDate AS DATETIME,
@ToDate AS DATETIME
--set dates once, covering whole date range needed
SET @FromDate = '2016-06-01 00:00:00.000'
SET @ToDate = '2017-06-30 00:00:00.000'
INSERT INTO OTApp
select distinct
Tool
, right('0'+ DATENAME(dd, cte_start_date),2) + '-' + LEFT(DATENAME(MONTH, cte_start_date),3) + '-' + DATENAME(YEAR, cte_start_date)DayMonth
, DATENAME(MONTH, cte_start_date) + '-' + DATENAME(YEAR, cte_start_date) MonthYear
, DATENAME(YEAR, cte_start_date) Year
, cte_start_date
, HoursDown
--INTO OTApp
from
(
SELECT distinct
Apps.[AppName] as Tool
, CrsApp.cte_start_date
, MAX(
CASE
WHEN CrsApp.cte_start_date >= A.[issuestarttime] AND CrsApp.cte_start_date <= [resolutiontime] THEN 1
ELSE 0
END ) as HoursDown
FROM
alerts A
JOIN [dbo].[Alert_Application] Appset on A.alertid = Appset.alertid
JOIN [dbo].[Applications] Apps on Appset.applicationid = Apps.applicationid
OUTER APPLY (SELECT t.cte_start_date FROM Time t
WHERE cte_start_date >= @FromDate AND cte_start_date <= @ToDate) CrsApp
--rerun query as many times are there are applications
where AppName = 'VeritVru'
GROUP BY
Apps.[AppName], CrsApp.cte_start_date
)t
DDL to create dataset for running above queries
CREATE TABLE Alerts (
AlertID int,
AlertName varchar(100),
ApplicationName varchar(50),
IssueStartTime datetime,
ResolutionTime datetime,
DateKey int)
INSERT INTO Alerts (AlertID,AlertName,ApplicationName,IssueStartTime,ResolutionTime, Datekey)
VALUES
(160922, 'Can''t login', 'Shapeeze', '2017-05-08 12:00:00.000', '2017-05-23 18:00:00.000', '20170508'),
(160923, 'Screens freezing', 'Shapeeze', '2017-05-29 12:00:00.000', '2017-06-03 10:00:00.000', '20170529'),
(150319, 'Errors Submitting Forms','Shapeeze', '2017-05-23 06:00:00.000', '2017-05-29 13:00:00.000', '20170523'),
(210522, 'Server down', 'Napson', '2017-05-01 03:00:00.000','2017-05-01 23:00:00.000', '20170501'),
(210525, 'Updates failing', 'Napson', '2017-05-30 03:00:00.000','2017-06-10 12:00:00.000', '20170530'),
(120311, 'Customers Being Misrouted', 'Strikeone','2017-05-15 16:00:00.000', '2017-05-19 09:00:00.000', '20170515'),
(120313, 'Customers Deadending', 'Strikeone','2017-04-15 16:00:00.000', '2017-06-12 09:00:00.000', '20170515'),
(043293, 'Calls dropping', 'VeritVru','2017-04-12 03:00:00.000', '2017-04-12 09:00:00.000', '20170412'),
(043297, 'Static on calls', 'VeritVru','2017-04-27 03:00:00.000', '2017-05-06 09:00:00.000', '20170412')
CREATE TABLE Alert_Application (
AlertID int,
ApplicationID int);
INSERT INTO Alert_Application
VALUES
(160922, 1001),
(160923, 1001),
(150319, 1001),
(210522, 2000),
(210525, 2000),
(120311, 3000),
(120313, 3000),
(43293, 4000),
(43297, 4000);
CREATE TABLE Applications (
ApplicationID int,
AppName varchar(20))
INSERT INTO Applications
VALUES
(1001, 'Shapeeze'),
(2000, 'Napson'),
(3000, 'StrikeOne'),
(4000, 'VeritVRU');
CREATE TABLE LOB_Application(
ApplicationID int,
LOBID int);
INSERT INTO Applications
VALUES
(1001, 888),
(2000, 888),
(3000, 777),
(4000, 777);
CREATE TABLE LOB (
LOBID int,
LOBName varchar(20));
INSERT INTO LOB
VALUES
(888, 'BigLeague'),
(777, 'LeftArch');
CREATE TABLE Agent_LOB (
LOBID int,
AgentID int)
INSERT INTO Agent_LOB
VALUES
(888, 111),
(777, 222);
CREATE TABLE AgentSite (
AgentID int,
AgentSite varchar(20),
Country varchar(20),
CountryID int);
INSERT INTO AgentSite
VALUES
(111, 'Dublin', 'Ireland',234),
(222, 'Hyderabad', 'India',876)
CREATE TABLE Countries (
CountryID int,
Country varchar(20))
INSERT INTO Countries
VALUES
(234, 'Ireland'),
(876, 'India')
--Time table
;WITH e1(n) AS(
SELECT * FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))x(n)
),
e2(n) AS(
SELECT e1.n FROM e1, e1 x
),
e4(n) AS(
SELECT e2.n FROM e2, e2 x
),
e8(n) AS(
SELECT e4.n FROM e4, e4 x
),
cteTally(n) AS(
SELECT TOP 6307204 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1
FROM e8
),
Test(n) AS(
SELECT DATEADD( hh, n, DATEADD( year, -2, GETDATE()))
FROM cteTally)
select DATEADD(hh,datediff(hh,0,n),0)as cte_start_date
INTO dbo.Time
FROM Test
WHERE n <= DATEADD( YEAR, 1, GETDATE())
Solution one visual, dips end of month (on April 30), WRONG :
Solution 2, no dip on April 30, CORRECT:
I am looking for cause of records dropping at end of month when solution 1 is run for 1 month date ranges at a time. And if any ideas of fix, would be appreciated.
--Quote me
June 15, 2017 at 1:44 am
Slightly different approach using inline calendar CTE
😎
BTW the sample data set is missing the table "Time"
USE TEEST;
GO
SET NOCOUNT ON;
--/*
IF OBJECT_ID(N'dbo.Alerts') IS NOT NULL DROP TABLE dbo.Alerts;
CREATE TABLE dbo.Alerts (
AlertID int,
AlertName varchar(100),
ApplicationName varchar(50),
IssueStartTime datetime,
ResolutionTime datetime,
DateKey int)
INSERT INTO dbo.Alerts (AlertID,AlertName,ApplicationName,IssueStartTime,ResolutionTime, Datekey)
VALUES
(160922, 'Can''t login', 'Shapeeze', '2017-05-08 12:00:00.000', '2017-05-23 18:00:00.000', '20170508'),
(160923, 'Screens freezing', 'Shapeeze', '2017-05-29 12:00:00.000', '2017-06-03 10:00:00.000', '20170529'),
(150319, 'Errors Submitting Forms','Shapeeze', '2017-05-23 06:00:00.000', '2017-05-29 13:00:00.000', '20170523'),
(210522, 'Server down', 'Napson', '2017-05-01 03:00:00.000','2017-05-01 23:00:00.000', '20170501'),
(210525, 'Updates failing', 'Napson', '2017-05-30 03:00:00.000','2017-06-10 12:00:00.000', '20170530'),
(120311, 'Customers Being Misrouted', 'Strikeone','2017-05-15 16:00:00.000', '2017-05-19 09:00:00.000', '20170515'),
(120313, 'Customers Deadending', 'Strikeone','2017-04-15 16:00:00.000', '2017-06-12 09:00:00.000', '20170515'),
(043293, 'Calls dropping', 'VeritVru','2017-04-12 03:00:00.000', '2017-04-12 09:00:00.000', '20170412'),
(043297, 'Static on calls', 'VeritVru','2017-04-27 03:00:00.000', '2017-05-06 09:00:00.000', '20170412')
IF OBJECT_ID(N'dbo.Alert_Application') IS NOT NULL DROP TABLE dbo.Alert_Application;
CREATE TABLE dbo.Alert_Application (
AlertID int,
ApplicationID int);
INSERT INTO dbo.Alert_Application
VALUES
(160922, 1001),
(160923, 1001),
(150319, 1001),
(210522, 2000),
(210525, 2000),
(120311, 3000),
(120313, 3000),
(43293, 4000),
(43297, 4000);
IF OBJECT_ID(N'dbo.Applications') IS NOT NULL DROP TABLE dbo.Applications;
CREATE TABLE dbo.Applications (
ApplicationID int,
AppName varchar(20))
INSERT INTO dbo.Applications
VALUES
(1001, 'Shapeeze'),
(2000, 'Napson'),
(3000, 'StrikeOne'),
(4000, 'VeritVRU');
IF OBJECT_ID(N'dbo.LOB_Application') IS NOT NULL DROP TABLE dbo.LOB_Application;
CREATE TABLE dbo.LOB_Application(
ApplicationID int,
LOBID int);
INSERT INTO dbo.LOB_Application
VALUES
(1001, 888),
(2000, 888),
(3000, 777),
(4000, 777);
IF OBJECT_ID(N'dbo.LOB') IS NOT NULL DROP TABLE dbo.LOB;
CREATE TABLE dbo.LOB (
LOBID int,
LOBName varchar(20));
INSERT INTO dbo.LOB
VALUES
(888, 'BigLeague'),
(777, 'LeftArch');
IF OBJECT_ID(N'dbo.Agent_LOB') IS NOT NULL DROP TABLE dbo.Agent_LOB;
CREATE TABLE dbo.Agent_LOB (
LOBID int,
AgentID int)
INSERT INTO dbo.Agent_LOB
VALUES
(888, 111),
(777, 222);
IF OBJECT_ID(N'dbo.AgentSite') IS NOT NULL DROP TABLE dbo.AgentSite;
CREATE TABLE dbo.AgentSite (
AgentID int,
AgentSite varchar(20),
Country varchar(20),
CountryID int);
INSERT INTO dbo.AgentSite
VALUES
(111, 'Dublin', 'Ireland',234),
(222, 'Hyderabad', 'India',876)
IF OBJECT_ID(N'dbo.Countries') IS NOT NULL DROP TABLE dbo.Countries;
CREATE TABLE dbo.Countries (
CountryID int,
Country varchar(20))
INSERT INTO dbo.Countries
VALUES
(234, 'Ireland'),
(876, 'India');
-- */
--change date range and rerun query until all months of year are processed
DECLARE @FromDate AS DATETIME = '2017-04-01 00:00:00.000'
DECLARE @ToDate AS DATETIME = '2017-06-30 00:00:00.000'
;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))
,BASE_CALENDAR_CONFIG AS
(
SELECT
CONVERT(DATE,@FromDate,0) AS FIRST_DATE
,DATEDIFF(DAY,CONVERT(DATE,@FromDate,0),CONVERT(DATE,@ToDate,0)) AS NUM_DAYS
)
,NUMS(N) AS (SELECT TOP((SELECT BCC.NUM_DAYS FROM BASE_CALENDAR_CONFIG BCC) + 1) ROW_NUMBER() OVER (ORDER BY (@@VERSION)) - 1 AS N
FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)
,INLINE_CALENDAR AS
(
SELECT
NM.N AS DATE_NO
,DATEADD(DAY,NM.N,BC.FIRST_DATE) AS DATE_VAL
FROM BASE_CALENDAR_CONFIG BC
CROSS APPLY NUMS NM
)
,ALERT_DAYS AS
(
SELECT
A.AlertID
,A.IssueStartTime
,A.ResolutionTime
,CONVERT(DATE,A.IssueStartTime,0) AS FROM_DATE
,CONVERT(DATE,A.ResolutionTime,0) AS TO_DATE
FROM dbo.alerts A
)
SELECT
INC.DATE_VAL
,AL.AlertID
,DATEDIFF(HOUR,CASE
WHEN AL.IssueStartTime <= CONVERT(DATETIME,INC.DATE_VAL,0) THEN CONVERT(DATETIME,INC.DATE_VAL,0)
ELSE AL.IssueStartTime
END
,CASE
WHEN AL.ResolutionTime >= CONVERT(DATETIME,DATEADD(DAY,1,INC.DATE_VAL),0) THEN CONVERT(DATETIME,DATEADD(DAY,1,INC.DATE_VAL),0)
ELSE AL.ResolutionTime
END) AS HOURS_DOWN
,APPS.AppName
FROM INLINE_CALENDAR INC
LEFT OUTER JOIN ALERT_DAYS AL
ON INC.DATE_VAL BETWEEN AL.FROM_DATE AND AL.TO_DATE
LEFT OUTER JOIN dbo.Alert_Application APPSET
ON AL.alertid = APPSET.alertid
LEFT OUTER JOIN dbo.Applications APPS
ON APPSET.applicationid = APPS.applicationid;
June 15, 2017 at 2:06 am
You solution 1 works fine, the problem must be in your code that is drawing the chart.
If you add some SQL to your solution to do the group by to the tool, day level, and sum the hours down, it does give 24 for VeritVRU for the 30th April.
Unless there's something wrong with your Time table (but that's unlikely if you see the correct results from solution 2).
June 15, 2017 at 8:56 am
(1)
Eirikur: I think OUTER APPLY is only way to do accomplish task because I need either a 1 or 0 for every *hour* of the day to drive the sizing of both non-events and events.
Your solution returns NULLS on *days* where application was up and that is not graphable. My graph has 24 hours on Y axis, days-of-month on X, with applications on the bars.
Your results:
(2)
Alastair: Solution 2 works provided I run the ETL to populate the table that holds the dataset in one fell swoop:@FromDate = '2017-04-01 00:00:00.000'
and @ToDate = '2017-06-30 00:00:00.000'
But I need to be able to grow the dataset monthly, since the source data is so huge.
Validation:
--Solution 1 (when run one month at a time to produce report dataset) returns only 1 hour on April 30
SELECT TOP (1000) [Tool]
,[DayMonth]
,[MonthYear]
,[Year]
,[cte_start_date]
,[HoursDown]
FROM [Sandbox].[dbo].[OT]
where Tool = 'VeritVru' and MonthYear = 'April-2017'--and MonthYear = '30-Apr-2017'
order by cte_start_date desc;
--Solution 2 Gives all 24 hours
SELECT TOP (1000) [Tool]
,[DayMonth]
,[MonthYear]
,[Year]
,[cte_start_date]
,[HoursDown]
FROM [Sandbox].[dbo].[OTApp]
where Tool = 'VeritVru' and MonthYear = 'April-2017' --and MonthYear = '30-Apr-2017';
order by cte_start_date desc;
Expected results are only in Solution 2 results:
Solution 1 results
--Quote me
June 15, 2017 at 9:13 am
polkadot - Thursday, June 15, 2017 8:56 AMEirikur: I think OUTER APPLY is only way to do accomplish task because I need either a 1 or 0 for every *hour* of the day to drive the sizing of both non-events and events. Your solution returns NULLS on *days* where application was up and that is not graphable. My graph has 24 hours on Y axis, days-of-month on X, with applications on the bars.
Your results:
First of all, if you count the hours, your solution does not return the correct values, something you need to look into. Secondly, outer apply is another method of constructing a left outer join, definitely not needed in this case when the calculation can be done using the latter.
The NULL values can be substituted with a 0 if needed, an absolutely trivial change in the code.
😎
BTW do you have the DDL and the data for the the Time table?
June 15, 2017 at 9:30 am
Yes, I added Time tbl DDL to original post. I just finished editing my original reply.
I do have correct counts in Solution 2 and even in both Solution 1 and 2 IF I run them for whole time period Apr 1 - Jun 30, but my real source data is so huge I need to either run Solution 1 monthly or Solution 2 for whole time period that I need reported on, but one application at a time, otherwise the query will blow up.
I need to understand why Solution 1 is dropping counts at end of each month and if there's any way to correct for that so that I can run the query as an ETL query, monthly. Solution 2 will require a cursor script, which I want to avoid as it will take very long time to process.
I am looking for solution that can be run incrementally, something that can be grown incrementally.
Visuals created and shared on dropbox https://www.dropbox.com/s/18g90i5ico1ygh8/SQLCentral.pbix?dl=0
Page 1 = Solution 1 where I run query 3x, once for apr, may, jun results in low counts at end of each month
Page 2 = Solution 2, correct results
Page 3 = Solution 1 where I run query 1x for all 3 months, one app at a time, also produced correct results
Helen
--Quote me
June 15, 2017 at 9:48 am
Just to pile on, using the DDL you provided, both queries show 24 rows for the VeritVRU tool on April 30 when I run them.
If you're missing rows, there's more to the story than what we've been shown.
Cheers!
June 15, 2017 at 9:52 am
You don't have a 'top 1000' clause in your select statement when you are actually testing this do you?
June 15, 2017 at 10:25 am
Alastair, no ofcourse not. The sample dataset is so small that the top 1000 isn't a problem. In real life I am not using top 1000! i would be missing far more than just the tail end of every month if I did that.
Jacob, both queries show 24 rows if the dates set are April through June for both. However, I need to run the queries as ETL that builds the report dataset incrementally since in real life the source data is large. I want a query that I can run as ETL one month at a time.
I need to understand why Solution 1 is dropping records one the last day of every month when run 1 month at a time and if there is a fix for that.
--Quote me
June 15, 2017 at 10:36 am
polkadot - Thursday, June 15, 2017 10:25 AMAlastair, no ofcourse not. The sample dataset is so small that the top 1000 isn't a problem. In real life I am not using top 1000! i would be missing far more than just the tail end of every month if I did that.
Jacob, both queries show 24 rows if the dates set are April through June for both. However, I need to run the queries as ETL that builds the report dataset incrementally since in real life the source data is large. I want a query that I can run as ETL one month at a time.I need to understand why Solution 1 is dropping records one the last day of every month when run 1 month at a time and if there is a fix for that.
Ok, then what exactly are you setting the variables to for the runs of solution 1 and solution 2 when you have the problem?
I'm guessing it's going to be the mundane case that you're using midnight of the last day of the month for the per-month query (that's what you're doing in the samples above, except for a whole year), which means you miss out all of that day except for midnight.
To define the end of the range, you should do <(midnight of the first day of the next month) to get the whole month, so for April, >='20170401 00:00:00' AND <'20170501 00:00:00'.
Your posted queries would cut off the last day of June 2017 for both queries for a similar reason. Everything past 00:00:00 on the last day is excluded.
Cheers!
EDIT: Clarified a couple things.
June 15, 2017 at 10:52 am
(deleted)
--Quote me
June 15, 2017 at 10:58 am
Jacob....oh I sure hope this is mundane case! Heart rate quickened... here's what I have been doing:
1rst run
SET @FromDate = '2017-04-01 00:00:00.000'
SET @ToDate = '2017-04-30 00:00:00.000'
2nd run
SET @FromDate = '2017-05-01 00:00:00.000'
SET @ToDate = '2017-05-31 00:00:00.000'
3rd run
SET @FromDate = '2017-06-01 00:00:00.000'
SET @ToDate = '2017-06-30 00:00:00.000'
And are you saying I should do?:
SET @FromDate = '2017-04-01 00:00:00.000'
SET @ToDate = '2017-05-01 00:00:00.000'
SET @FromDate = '2017-05-01 00:00:00.000'
SET @ToDate = '2017-06-01 00:00:00.000'
SET @FromDate = '2017-06-01 00:00:00.000'
SET @ToDate = '2017-07-01 00:00:00.000'
--Quote me
June 15, 2017 at 11:01 am
polkadot - Thursday, June 15, 2017 10:58 AMJacob....oh I sure hope this is mundane case! Heart rate quickened... here's what I have been doing:1rst run
SET @FromDate = '2017-04-01 00:00:00.000'
SET @ToDate = '2017-04-30 00:00:00.000'2nd run
SET @FromDate = '2017-05-01 00:00:00.000'
SET @ToDate = '2017-05-31 00:00:00.000'3rd run
SET @FromDate = '2017-06-01 00:00:00.000'
SET @ToDate = '2017-06-30 00:00:00.000'And are you saying I should do?:
SET @FromDate = '2017-04-01 00:00:00.000'
SET @ToDate = '2017-05-01 00:00:00.000'SET @FromDate = '2017-05-01 00:00:00.000'
SET @ToDate = '2017-06-01 00:00:00.000'SET @FromDate = '2017-06-01 00:00:00.000'
SET @ToDate = '2017-07-01 00:00:00.000'
Yes, that is the cause of your problem.
You're removing the last 23 hours from the last day of the month.
The changes to the value of the variables you propose is correct.
Just remember to change from <= to < in the comparison to @ToDate, so you don't include midnight of the first day of the next month.
Cheers!
EDIT: Fixed some wording I didn't like.
June 15, 2017 at 12:54 pm
Would BETWEEN work?
WHERE cte_start_date BETWEEN @FromDate AND @ToDate
--Quote me
June 15, 2017 at 1:01 pm
No, BETWEEN @FromDate and @ToDate is equivalent to >=@FromDate and <=@ToDate.
You'll want to do >=@FromDate and <@ToDate with the variables set as suggested.
The nice thing about that approach is that you can always use midnight of the first of the next month as your end.
You don't have to change it to account for different ending days for different months, or worry about changes in precision of stored datetimes in the future.
Check out http://sqlblog.com/blogs/aaron_bertrand/archive/2011/10/19/what-do-between-and-the-devil-have-in-common.aspx for a quick read about that.
Cheers!
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply