April 25, 2014 at 4:03 am
I am new with CTE and need help from you guys.
I have a table FeedFileDetails which stores all the zipfile properties data (like, pakagename, universe, zipfilesize, zipfiledate..) I am using the below code to calculate the sum and average of size of similar zipfiles for past 14 days. I have hardcoded the number 14 for the calculation of average for past 14 days but now want this to be dynamically updated (just like counter).
The problem is if sometimes the file is not generated for a particular day then the sum of the zipfiles will be for 13 days but the average will be wrong (as the denominator is hardcoded as 14 in the code).If there are only 13 files generated in the past 14 days then the denominator should be 13 not 14.
DECLARE @current_date DATETIME
DECLARE @previous_date DATETIME
DECLARE @curr_fourteen DATETIME
SELECT @current_date = MAX(filedate) FROM FeedFileDetails
SELECT @previous_date = MAX(filedate)-1 FROM FeedFileDetails
SELECT @curr_fourteen = @current_date-14
;WITH dates(DateValue)
AS(
SELECT @curr_fourteen AS DateValue
UNION ALL
SELECT DATEADD(DAY,1,DateValue)
FROM dates
WHERE DateValue < @previous_date
),
sumofsize(pkgname,universe,sum_zipfilesize)
AS(
SELECT DISTINCT f.pkgname, f.universe, SUM(f.zipfilesize)
FROM FeedFileDetails f INNER JOIN dates d
ON d.DateValue = f.filedate
GROUP BY f.pkgname, f.universe
),
finalTable(pkgname,universe,filedate,ZipFileSize,sum_zipfilesize,avg_zipfilesize)
AS(
SELECT DISTINCT a.pkgname
,a.universe
,@current_date AS filedate
,f.zipfilesize AS ZipFileSize
,sum_zipfilesize
,sum_zipfilesize/14 as avg_zipfilesize
FROM sumofsize a INNER JOIN FeedFileDetails f
ON f.pkgname = a.pkgname and f.universe = a.universe
WHERE f.filedate = @previous_date
)
April 25, 2014 at 4:38 am
i think you can get count of filename like you derived SUM(f.zipfilesize) ,and use it in final CTE finalTable
April 25, 2014 at 5:44 am
As Megha said, grab the count of your files.
--You only need one DECLARE statement
-- and you can assign the values within that statement too.
-- Saves on typing
DECLARE @current_date DATETIME = (SELECT MAX(filedate) FROM FeedFileDetails),
@previous_date DATETIME = (SELECT MAX(filedate)-1 FROM FeedFileDetails),
@curr_fourteen DATETIME = (SELECT MAX(filedate)-14 FROM FeedFileDetails),
@AvgFileCnt INT;
SET @AvgFileCnt = (SELECT COUNT(FileDate) FROM FeedFileDetails
WHERE CONVERT(CHAR(10),FileDate,101)
>= CONVERT(CHAR(10),@current_date - 14,101));
--Grabs the number of files where the file date is 14 or less days
This code is only partially tested because I don't have a setup like you do. So make sure to test it before pushing it to any production environment.
April 25, 2014 at 8:54 am
Count of files doesn't help as there can be multiple files generated in a day.
April 25, 2014 at 8:56 am
Then what are you averaging by?
April 25, 2014 at 9:31 am
averaging by the number of days on which the files were generated.
Note : There can be one or more days when no files are generated.
Lets take an example:
If there are total 35 files generated in past 14 days then AVG function will take the sum of all the 35 files and will divide with 35. Instead I want it to divide it by the number of days i.e. 14.
One case could be if there are 40 files generated in past 14 days but no files were generated on a particular day then the logic should calculate the sum of 40 files but divide it by the number of days on which the files were generated i.e. 13 instead of 14 (since files were not generated for a day ).
Let me know if I am still unclear.
April 25, 2014 at 9:44 am
It would help if you would post the DDL (CREATE TABLE statement) for the table(s) involved in your query, some sample data for the table(s) (as INSERT INTO statements), expected results based on the sample data, and most importantly all the code for your query (if you look closely at your initial post you are missing the last part of your query).
I think I understand what it is you are trying to accomplish but we need the additional info to really provide you a good, tested answer.
April 25, 2014 at 10:53 am
If you throw a DISTINCT into the code I provided, it solves your problem.
But as Lynn said, we can't provide you with a definite answer without some DDL, sample data, and the rest of your query.
April 25, 2014 at 5:36 pm
Brandie Tarvin (4/25/2014)
--You only need one DECLARE statement
-- and you can assign the values within that statement too.
[font="Arial Black"]-- Saves on typing[/font]
"Sure you are? Dark Side powerful. Lure you it will."
😉
--===== Conditionally drop the Test Table to make reruns in SSMS easier
IF OBJECT_ID('tempdb..#FeedFileDetails','U') IS NOT NULL
DROP TABLE #FeedFileDetails
;
--===== Create and populate the Test Table on-the-fly
SELECT TOP 100000
FileDate = DATEADD(yy,'2010'-1900,RAND(CHECKSUM(NEWID()))*DATEDIFF(dd,'2010','2014'))
INTO #FeedFileDetails
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
;
GO
PRINT '--===== "Save typing" method (221 characters NOT including spaces/tabs/print/stats/go)'
SET STATISTICS TIME,IO ON;
DECLARE @current_date DATETIME = (SELECT MAX(filedate) FROM #FeedFileDetails),
@previous_date DATETIME = (SELECT MAX(filedate)-1 FROM #FeedFileDetails),
@curr_fourteen DATETIME = (SELECT MAX(filedate)-14 FROM #FeedFileDetails),
@AvgFileCnt INT;
SET STATISTICS TIME,IO OFF;
GO
PRINT '--===== Traditional method (207 characters NOT including spaces/tabs/print/stats/go)'
--{hold-alt, click-drag, paste} does most of the rest.
SET STATISTICS TIME,IO ON;
DECLARE @current_date DATETIME,
@previous_date DATETIME,
@curr_fourteen DATETIME,
@AvgFileCnt INT
;
SELECT @current_date = MAX(FileDate),
@previous_date = MAX(FileDate)-1,
@curr_fourteen = MAX(FileDate)-14
FROM #FeedFileDetails;
SET STATISTICS TIME,IO OFF;
GO
--===== "Save typing" method (221 characters NOT including spaces/tabs/print/stats/go)
Table '#FeedFileDetails____________________________________________________________________________________________________00000000094B'.
Scan count 3, logical reads 636, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 109 ms, elapsed time = 102 ms.
--===== Traditional method (207 characters NOT including spaces/tabs/print/stats/go)
Table '#FeedFileDetails____________________________________________________________________________________________________00000000094B'.
Scan count 1, logical reads 212, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 31 ms.
"Type once. Rest Force builds, it will". 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
April 28, 2014 at 4:22 am
Jeff Moden (4/25/2014)
Brandie Tarvin (4/25/2014)
--You only need one DECLARE statement
-- and you can assign the values within that statement too.
[font="Arial Black"]-- Saves on typing[/font]
"Sure you are? Dark Side powerful. Lure you it will."
😉
Come to the Dark Side. We have cookies!
"Type once. Rest Force builds, it will". 🙂
YES! I OWN YOU NOW! :w00t:
EDIT: Of course, I replied to the geekdomnis of your reply before reading the stats. That's interesting... I probably need to do more of those checks when I'm coding. Thanks for pointing that out.
April 28, 2014 at 7:50 am
Would be nice if we could hear more from the OP. It seems like he just disappeared.
April 28, 2014 at 8:31 am
Brandie Tarvin (4/28/2014)
Come to the Dark Side. We have cookies!
Heh... Mongo like cookies! 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
April 28, 2014 at 8:59 am
Hey Guys,
I got the solution. It was just adding a count of distinct filedate and then dividing the total filesize with that. Very bad on my part that I had to ask such a silly question.
Btw, thanks for all your help and efforts. 🙂
sumofsize(pkgname,universe,sum_zipfilesize, cntDays)
AS(
SELECT f.pkgname, f.universe, SUM(f.zipfilesize),
COUNT(Distinct(FileDate)) as cntDays
FROM FeedFileDetails f INNER JOIN dates d
ON d.DateValue = f.filedate
GROUP BY f.pkgname, f.universe
),
finalTable(pkgname,universe,filedate,ZipFileSize,sum_zipfilesize,avg_zipfilesize)
AS(
SELECT a.pkgname
,a.universe
,@current_date AS filedate
,f.zipfilesize AS ZipFileSize
,sum_zipfilesize
,sum_zipfilesize/cntDays as avg_zipfilesize -- the name might have confused here; it should have been Average Total Size per day)
FROM sumofsize a INNER JOIN FeedFileDetails f
ON f.pkgname = a.pkgname and f.universe = a.universe
WHERE f.filedate = @previous_date
April 28, 2014 at 9:01 am
not disappeared yet.. was just out of reach of internet 😛
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply