August 5, 2020 at 3:14 am
I'm new to SQL and I'm struggling to return a result set which I worked on for a long time:
Basically, I am trying to count and include active accounts per day over a timeframe (@params determine the time frame window). But the result must specify per day the number of active accounts. My struggle is with this: how to include accounts per day? Is it using dynamic SQL? if so, how?
/*Requirement:
Count and include active accounts for per day:
1- An active account is one that has an start date PRIOR to or equal to the date displayed yet, the endDate is after the same date
2- If an account has an enddate of NULL that means it is active (assuming it has an start date that falls in the range)
Example: an account with start date of 2020-03-01 and end date of 2020-03-04 would be considered active on
2020-03-01, 02, 03 and 04. It should be included/counted in those days.
But not on 2020-02-28 or 2020-03-05.
*/
---GROUP BY DATE
SELECT
A.Date,
A.Location,
count(*) as ActiveAccs ,
CountOfAccounts as numOfAccs,
ServiceCategory ,
CountOfAccounts-count(1) as AvailableAccs,
FROM accLocationStats A
LEFT OUTER JOIN Accounts B
ON A.Location=B.Location
WHERE b.StartDate is not null --StartDate of Null = they are unwanted accounts
--this is where things go bad. Do I need dynamic SQL?
AND ( b.EndDate is NULL)
--Examination window: we want accounts in this time frame . Please remember that I need to count per day, NOT count for the entire period.
AND (Date>=@param1 AND Date<=@param2)
GROUP BY A.Date,a.Location,CountOfAccounts,ServiceCategory
August 5, 2020 at 4:45 am
The most likely reason you're having a hard time is that you're missing a Calendar table. I borrowed some code from Dwain Camps for it. I ran this part of his code to create the Calendar table-valued function:
CREATE FUNCTION [dbo].[GenerateCalendar]
(
@FromDate DATETIME
,@NoDays INT
)
-- Generates a calendar table with sequential day numbering (@FromDate = SeqNo 1).
-- See RETURNS table (comments) for meaning of each column.
-- Notes: 1) Max for NoDays is 65536, which runs in just over 2 seconds.
--
-- Example calls to generate the calendar:
-- 1) Forward for 365 days starting today:
-- DECLARE @Date DATETIME
-- SELECT @Date = GETDATE()
-- SELECT *
-- FROM dbo.GenerateCalendar(@Date, 365)
-- ORDER BY SeqNo;
-- 2) Backwards for 365 days back starting today:
-- DECLARE @Date DATETIME
-- SELECT @Date = GETDATE()
-- SELECT *
-- FROM dbo.GenerateCalendar(@Date, -365)
-- ORDER BY SeqNo;
-- 3) For only the FromDate:
-- DECLARE @Date DATETIME
-- SELECT @Date = GETDATE()
-- SELECT *
-- FROM dbo.GenerateCalendar(@Date, 1);
-- 4) Including only the last week days of each month:
-- Note: Seq no in this case are as if all dates were generated
-- DECLARE @Date DATETIME
-- SELECT @Date = GETDATE()
-- SELECT *
-- FROM dbo.GenerateCalendar(@Date, 365)
-- WHERE Last = 1 ORDER BY SeqNo;
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== High speed code provided courtesy of SQL MVP Jeff Moden (idea by Dwain Camps)
--===== Generate sequence numbers from 1 to 65536 (credit to SQL MVP Itzik Ben-Gen)
WITH E1(N) AS (SELECT 1 UNION ALL SELECT 1), --2 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --4 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --16 rows
E8(N) AS (SELECT 1 FROM E4 a, E4 b), --256 rows
E16(N) AS (SELECT 1 FROM E8 a, E8 b), --65536 rows
cteTally(N) AS (
SELECT TOP (ABS(@NoDays)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E16)
-- [SeqNo]=Sequential day number (@FromDate always=1) forward or backwards
SELECT [SeqNo] = t.N,
-- [Date]=Date (with 00:00:00.000 for the time component)
[Date] = dt.DT,
-- [Year]=Four digit year
[Year] = dp.YY,
-- [YrNN]=Two digit year
[YrNN] = dp.YY % 100,
-- [YYYYMM]=Integer YYYYMM (year * 100 + month)
[YYYYMM] = dp.YY * 100 + dp.MM,
-- [BuddhaYr]=Year in Buddhist calendar
[BuddhaYr] = dp.YY + 543,
-- [Month]=Month (as an INT)
[Month] = dp.MM,
-- [Day]=Day (as an INT)
[Day] = dp.DD,
-- [WkDNo]=Week day number (based on @@DATEFIRST)
[WkDNo] = DATEPART(dw,dt.DT),
-- Next 3 columns dependent on language setting so may not work for non-English
-- [WkDName]=Full name of the week day, e.g., Monday, Tuesday, etc.
[WkDName] = CONVERT(NCHAR(9),dp.DW),
-- [WkDName2]=Two characters for the week day, e.g., Mo, Tu, etc.
[WkDName2] = CONVERT(NCHAR(2),dp.DW),
-- [WkDName3]=Three characters for the week day, e.g., Mon, Tue, etc.
[WkDName3] = CONVERT(NCHAR(3),dp.DW),
-- [JulDay]=Julian day (day number of the year)
[JulDay] = dp.DY,
-- [JulWk]=Week number of the year
[JulWk] = dp.DY/7+1,
-- [WkNo]=Week number
[WkNo] = dp.DD/7+1,
-- [Qtr]=Quarter number (of the year)
[Qtr] = DATEPART(qq,dt.Dt),
-- [Last]=Number the weeks for the month in reverse
[Last] = (DATEPART(dd,dp.LDtOfMo)-dp.DD)/7+1,
-- [LdOfMo]=Last day of the month
[LdOfMo] = DATEPART(dd,dp.LDtOfMo),
-- [LDtOfMo]=Last day of the month as a DATETIME
[LDtOfMo] = dp.LDtOfMo
FROM cteTally t
CROSS APPLY
( --=== Create the date
SELECT DT = DATEADD(dd,(t.N-1)*SIGN(@NoDays),@FromDate)
) dt
CROSS APPLY
( --=== Create the other parts from the date above using a "cCA"
-- (Cascading CROSS APPLY (cCA), courtesy of Chris Morris)
SELECT YY = DATEPART(yy,dt.DT),
MM = DATEPART(mm,dt.DT),
DD = DATEPART(dd,dt.DT),
DW = DATENAME(dw,dt.DT),
Dy = DATEPART(dy,dt.DT),
LDtOfMo = DATEADD(mm,DATEDIFF(mm,-1,dt.DT),-1)
) dp;
And this is my code that uses it...
use tempdb;
go
/* setup */
CREATE TABLE UserAccount (
AccountName VARCHAR(20) PRIMARY KEY,
StartDate DATE NOT NULL,
EndDate DATE DEFAULT NULL
);
GO
-- add some dummy records
INSERT INTO UserAccount VALUES ('AccountA','5/1/2020','8/1/2020')
,('AccountB','3/1/2020',null)
,('AccountC','7/1/2020',null)
,('AccountD','7/1/2020','8/4/2020');
Now I can use Dwain's code to answer the question ... (his code basically takes a start date and an end date and generates a calendar with all the dates in between). Once you have that, the question becomes super easy, because now I can group by the values in the Calendar table:
SELECT cal.[Date]
, ActiveAccountsCount = COUNT(*)
FROM UserAccount ua
INNER JOIN
(SELECT [Date]
FROM Testdb.dbo.GenerateCalendar('3/1/2020',DATEDIFF(day,'3/1/2020','8/10/2020'))) cal
ON cal.Date>= ua.StartDate AND cal.Date <=ua.EndDate
GROUP BY cal.[Date]
ORDER BY cal.[Date];
The part to notice is the join on the Calendar table... it basically says "show me/expand the date ranges between each StartDate and EndDate". Then I just group by that date. and get a Count. Once you have all the pieces, this question becomes super easy.
August 5, 2020 at 10:56 am
Thank you pietlinden for this great response. Actually, the struggle was also conceptual and logistical and I am still struggling:
Conceptual:
In this query, I am getting the counts for the entire period (defined by @Start and @End dates). What I am struggling with conceptually is that I want to provide counts per day (daily counts), not for the entire period. Say if active accounts from Feb to March are 200, but from Feb/3rd to 9th it was only 179, how is this query going to work when the condition specified is parameter dates that define the range? So I am still not sure how to apply the concept/condition of "use only active accounts on the day you are grouping by" - that is only count the active accounts for one day that you are joining by. Wouldn't you need a condition for that??
Logistical:
Once you add this additional date, wouldn't I need to add it the grouping? There are already two dates that I've added to the grouping, so this would be the third?
In addition, I am not sure why my previous group even worked because I had numbers and calculations that were not included in the GROUP BY clause.
Thank you again for your help
August 5, 2020 at 12:08 pm
This uses the daterange function described here:
https://www.sqlservercentral.com/scripts/a-daterange-table-valued-function
(the actual daterange code used is closest to this one)
drop table if exists dbo.TestUserAccount;
go
create table dbo.TestUserAccount(
AccountName varchar(20) primary key,
StartDate date not null,
EndDate date null);
go
insert dbo.TestUserAccount(AccountName, StartDate, EndDate) values
('AccountA','5/1/2020','8/1/2020')
,('AccountB','3/1/2020',null)
,('AccountC','7/1/2020',null)
,('AccountD','7/1/2020','8/4/2020');
declare
@dt_range_start date='2020-02-28',
@dt_range_end date='2020-06-28';
;with active_accts_cte(dt_active, active_count) as (
select cast(dr.[value] as date), count(*)
from
dbo.TestUserAccount tua
cross apply
dbo.daterange(tua.StartDate, isnull(tua.EndDate, @dt_range_end), 'dd', 1) dr
group by cast(dr.[value] as date))
select
a.[date], a.[location], a.CountOfAccounts, a.ServiceCategory,
sum(isnull(b.active_count, 0)) ActiveAccs,
a.CountOfAccounts-sum(isnull(b.active_count, 0)) AvailableAccs
from
accLocationStats a
left join
active_accts_cte b ON a.[date]=b.dt_active
where
a.[Date]=@dt_range_start
and a.[Date]<=@dt_range_end
group by
a.[date], a.[location], a.CountOfAccounts, a.ServiceCategory;
/* drop test table */
drop table dbo.TestUserAccount;
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
August 5, 2020 at 1:42 pm
If the solutions provided do not match up with your requirement, please consider providing some sample data, along with desired results, based on that sample data.
When providing the sample data, please do so in a form which can be cut & pasted into SSMS, for others to use. The above post is an example of how to do this.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
August 5, 2020 at 4:38 pm
Thank you for the great responses. I think there is a miscommunication here:
now the query works, but I am not sure if I understand it:
Conceptual:
In this query, I am getting the counts for the entire period (defined by @Start and @End dates). What I am struggling with conceptually is that I want to provide counts per day (daily counts), not for the entire period. Say if active accounts from Feb to March are 200, but from Feb/3rd to 9th it was only 179, how is this query going to work when the condition specified is parameter dates that define the range? So I am still not sure how to apply the concept/condition of "use only active accounts on the day you are grouping by" - that is only count the active accounts for one day that you are joining by. Wouldn't you need a condition for that??
Logistical:
Once you add this additional date, wouldn't I need to add it the grouping? There are already two dates that I've added to the grouping, so this would be the third?
In addition, I am not sure why my previous group even worked because I had numbers and calculations that were not included in the GROUP BY clause.
So while I am getting answers, I am not sure if I understand how this is working.
August 5, 2020 at 5:39 pm
You're dead right that there's a communication issue. I cannot relate the questions you are asking to whatever query you have, which 'works'.
Please identify exactly those parts of this working query which you cannot fathom, rather than regurgitating long chunks of prose and it will be easier for people to zero in.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
August 11, 2020 at 2:17 pm
Here's something that should work for you, using the UserAccount table and values that Steve Collins posited above..
CREATE TABLE UserAccount
(
AccountName VARCHAR(20) PRIMARY KEY
, StartDate DATE NOT NULL
, EndDate DATE DEFAULT NULL
);
CREATE TABLE #calendar (calDate DATE);
-- add some dummy records
INSERT INTO UserAccount
VALUES
('AccountA', '5/1/2020', '8/1/2020')
, ('AccountB', '3/1/2020', NULL)
, ('AccountC', '7/1/2020', NULL)
, ('AccountD', '7/1/2020', '8/4/2020');
DECLARE @calDate DATE = @param1;
WHILE @calDate <= @param2
BEGIN
INSERT INTO #calendar (calDate)
VALUES
(@calDate);
SET @calDate = DATEADD(DAY, 1, @calDate);
END;
SELECT
c.calDate
, COUNT(u.AccountName)
FROM #calendar c
LEFT OUTER JOIN #UserAccount u ON c.calDate BETWEEN u.StartDate AND u.EndDate
OR (u.StartDate <= c.calDate AND u.EndDate IS NULL)
GROUP BY c.calDate;
DROP TABLE #calendar;
The outer join will give at least one row per date in the #calendar table: one with each AccountName active on that date, or one row with a NULL AccountName for days with no active accounts. Specifying the AccountName column in the COUNT() function tells it not to count rows with a NULL value in that field, so this will give you the results you're looking for.
August 11, 2020 at 2:22 pm
Wait ... monotonically INSERTing using a WHILE loop? Brave of you to post that here!
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
August 11, 2020 at 2:26 pm
Fair point. I was trying out my solution on a test database that didn't have a prepared Calendar table, so that was my quick hack. I should have cleaned that up and assumed an existing Calendar table when I posted my possible solution.
August 11, 2020 at 6:38 pm
Fair point. I was trying out my solution on a test database that didn't have a prepared Calendar table, so that was my quick hack. I should have cleaned that up and assumed an existing Calendar table when I posted my possible solution.
On that note and just as a suggestion, you should equip your test database(s) (or a general purpose "util" database) with a copy of the "Swiss Army Knife" of T-SQL... an fnTally function. You can find such a function at the similarly named link in my signature line below. Then your quick hack becomes really quick. 😀
INSERT INTO #calendar WITH (TABLOCK)
(calDate)
SELECT calDate = DATEADD(dd,t.n,@Param1)
FROM dbo.fnTally(0,DATEDIFF(dd,@Param1,@Param2)) t
ORDER BY calDate
;
With a little practice in using it, you'll find it's really handy for a shedload of other "quick" and "production worthy" hacks.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 11, 2020 at 7:07 pm
dan.bridgeman wrote:Fair point. I was trying out my solution on a test database that didn't have a prepared Calendar table, so that was my quick hack. I should have cleaned that up and assumed an existing Calendar table when I posted my possible solution.
On that note and just as a suggestion, you should equip your test database(s) (or a general purpose "util" database) with a copy of the "Swiss Army Knife" of T-SQL... an fnTally function. You can find such a function at the similarly named link in my signature line below. Then your quick hack becomes really quick. 😀
INSERT INTO #calendar WITH (TABLOCK)
(calDate)
SELECT calDate = DATEADD(dd,t.n,@Param1)
FROM dbo.fnTally(0,DATEDIFF(dd,@Param1,@Param2)) t
ORDER BY calDate
;With a little practice in using it, you'll find it's really handy for a shedload of other "quick" and "production worthy" hacks.
That looks very handy, indeed. Thanks for writing and sharing it!
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply