December 21, 2020 at 5:30 pm
Dear Experts
I have a table that accumulates daily stats, defined so:
CREATE TABLE [dbo].[Stats2]
([StatsDate] [datetime] PRIMARY KEY,
[TypeA_Calls] [int] NULL,
[TypeB_Calls] [int] NULL,
[TypeC_Calls] [int] NULL);
I need a SELECT statement, or sequence of statements, which can pull out the earliest and latest date on which each stat is nonzero. This is of course would be easy to do if there were only one stat (using the TypeA_calls field as an example):
SELECT
min(statsdate) as First_TypeA_Date,
max(statsdate) as Last_TypeA_Date
from dbo.Stats
where typea_calls>0;
However, I cannot see how to generalize this to multiple columns of stats (I have only defined 3 columns in my example table for brevity, but in the real-world table there are about 20 of them!). The complicating factor seems to be that each of the WHERE clauses has to be tailored to the specific column.
I could of course have 3 (or 20-odd) SELECT statements, each putting one column’s worth of stats into a temporary table and then reporting from that, but I can’t help thinking there must be a way of avoiding 3 (or 20-odd) logical table scans.
Or maybe I could do this using a cursor… ?
Yours, seeking enlightenment.
MarkD
December 21, 2020 at 6:12 pm
drop table if exists #Stats2;
go
CREATE TABLE [dbo].#Stats2
([StatsDate] [datetime] PRIMARY KEY,
[TypeA_Calls] [int] NULL,
[TypeB_Calls] [int] NULL,
[TypeC_Calls] [int] NULL);
insert #Stats2 values
('2020-12-01', null, null, 23),
('2020-12-02', 1, null, 23),
('2020-12-03', 2, null, 23),
('2020-12-04', null, 4, null),
('2020-12-05', null, 2, 47),
('2020-12-06', 3, 3, 3);
select min(case when TypeA_Calls is not null then StatsDate else null end) min_a,
max(case when TypeA_Calls is not null then StatsDate else null end) max_a,
min(case when TypeB_Calls is not null then StatsDate else null end) min_b,
max(case when TypeB_Calls is not null then StatsDate else null end) max_b,
min(case when TypeC_Calls is not null then StatsDate else null end) min_c,
max(case when TypeC_Calls is not null then StatsDate else null end) max_c
from #Stats2;
Output
min_amax_amin_bmax_bmin_cmax_c
2020-12-02 00:00:00.0002020-12-06 00:00:00.0002020-12-04 00:00:00.0002020-12-06 00:00:00.0002020-12-01 00:00:00.0002020-12-06 00:00:00.000
Alternately the CASE expressions (or IIF functions) could be in the FROM clause
select min(dtm.a) min_a, max(dtm.a) max_a,
min(dtm.b) min_b, max(dtm.b) max_b,
min(dtm.c) min_c, max(dtm.c) max_c
from #Stats2
cross apply (values (iif(TypeA_Calls is not null, StatsDate, null),
iif(TypeB_Calls is not null, StatsDate, null),
iif(TypeC_Calls is not null, StatsDate, null))) dtm(a,b,c);
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
December 21, 2020 at 6:19 pm
Can you change the data model? A more EAV-like data model might be more appropriate in this case -- e.g.,
CREATE TABLE [dbo].[StatsDefinitions]
(
[StatsID] [int] NOT NULL PRIMARY KEY,
[StatsName] varchar(30) NOT NULL,
[StatsDescrtiption] varchar(255) NOT NULL
);
CREATE TABLE [dbo].[Stats]
([StatsDate] [datetime] NOT NULL,
[StatsID] [int] NULL,
[StatsValue] [int] NULL)
PRIMARY KEY PK_Stats (StatsDate ASC, StatsID ASC)
);
Then your select looks like this:
SELECT
StatsDefinition.StatsID,
StatsDefinition.StatsName,
min(statsdate) as First_Stat_Date,
max(statsdate) as Last_Stat_Date
from dbo.Stats
INNER JOIN dbo.StatsDefinitions ON StatsDefinitions.StatsID = Stats.StatsID
where Stats.StatsValue > 0;
If this has to be flattened into 60 First/Last columns -- First_Type<N>_Date, Last_Type<N>_Date -- then do you return any row in which any of the 30 values is greater than zero?
December 21, 2020 at 10:18 pm
Two - no, three - really instructive answers here.
I think Steve Collins' first suggestion was the sort of thing I had at the back of my mind. That CROSS APPLY was nifty though. I wonder which one scales better?
I agree with ratbak about using a more EAV approach, but I suspect that external circumstances will disallow this. I don't know for sure as this question came from a colleague.
Re ratbak's question, if I understand it, I think the answer is yes - that is, if any of the stats for a particular day is non-zero, we would want to display all of them in cross-tab fashion, otherwise omit that day.
Many thanks
MarkD
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply