May 4, 2012 at 3:15 pm
Right, will probably have another look either tomorrow or Monday but for now I'm calling it a night.
Realised I didn't need the min or max over() so have got rid a bit of the cost.
DECLARE @ImportType INT = 7001, @ImportID CHAR(2) = '03',
@ImpDateFrom DATETIME = '2011-12-01', @ImpDateTo DATETIME = '2011-12-31',
@DateFrom DATETIME = '2010-01-01', @DateTo DATETIME = '2011-12-31',
@FileType VARCHAR(25) = '.XLS,.DOT,.DOC,.PPT,.MBD';
--SET STATISTICS IO ON;
--SET STATISTICS TIME ON;
WITH CTE(n) AS(SELECT 1 UNION ALL SELECT 1),
CTE2(n) AS (SELECT 1 FROM CTE x, CTE y),
CTE3(n) AS (SELECT 1 FROM CTE2 x, CTE2 y),
CTE4(n) AS (SELECT 1 FROM CTE3 x, CTE3 y),
CTE5(n) AS (SELECT 0 UNION ALL
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM CTE4 x, CTE4 y)
SELECT
DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(MONTH,a.n,'1900-01-01')), 0) AS modifiedMonth,
SUM(IsAccess) AS AccessCount,
SUM(IsExcel) AS ExcelCount,
SUM(IsWord) AS WordCount,
SUM(IsPowerPoint) AS PowerPointCount
FROM CTE5 a
LEFT OUTER JOIN (SELECT
f.FileName,
f.LastModified,
DATEADD(MONTH, DATEDIFF(MONTH, 0, f.LastModified), 0) AS monthModified,
f.FileType,
sf.ImportType,
sf.ImportID,
ft.Application,
CASE WHEN ft.Application = 'Access' THEN 1 ELSE 0 END AS IsAccess,
CASE WHEN ft.Application = 'Excel' THEN 1 ELSE 0 END AS IsExcel,
CASE WHEN ft.Application = 'Word' THEN 1 ELSE 0 END AS IsWord,
CASE WHEN ft.Application = 'PowerPoint' THEN 1 ELSE 0 END AS IsPowerPoint
FROM dbo.Files f
INNER JOIN dbo.Groups g ON f.GroupID = g.GroupID
INNER JOIN dbo.SourceFiles sf ON g.BatchID = sf.BatchID
INNER JOIN dbo.Import i ON sf.ImportID = i.ImportID
INNER JOIN dbo.FileTypes ft ON f.FileType = ft.FileExtension
WHERE f.FileType IN (SELECT * FROM dbo.fn_Split(@FileType, ',')) AND
f.LastModified > dateadd(day, - 1, CONVERT(DATETIME, @DateFrom, 102)) AND
f.LastModified < dateadd(day, 1, CONVERT(DATETIME, @DateTo, 102)) AND
sf.ImportDate > dateadd(day, - 1, CONVERT(DATETIME, @ImpDateFrom, 102)) AND
sf.ImportDate < dateadd(day, 1, CONVERT(DATETIME, @ImpDateTo, 102)) AND
CHARINDEX(dbo.PADL(sf.ImportId, 2, '0'), @ImportID) > 0 AND
sf.ImportType = @ImportType) b ON DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(M,a.n,'1900-01-01')), 0) = b.monthModified
WHERE b.FileName IS NOT NULL
GROUP BY DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(M,a.n,'1900-01-01')), 0)
ORDER BY DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(M,a.n,'1900-01-01')), 0);
--SET STATISTICS TIME OFF;
--SET STATISTICS IO OFF;
modifiedMonth AccessCount ExcelCount WordCount PowerPointCount
----------------------- ----------- ----------- ----------- ---------------
2010-01-01 00:00:00.000 0 27 1 0
2010-02-01 00:00:00.000 0 16 5 0
2010-03-01 00:00:00.000 0 86 1 0
2010-04-01 00:00:00.000 0 6 12 14
2010-05-01 00:00:00.000 0 1 0 0
2010-06-01 00:00:00.000 0 0 0 2
2010-07-01 00:00:00.000 0 1 0 0
2010-09-01 00:00:00.000 0 701 0 0
2010-10-01 00:00:00.000 0 34 0 0
2010-11-01 00:00:00.000 0 14 10 3
2011-03-01 00:00:00.000 0 9 706 704
2011-05-01 00:00:00.000 0 21 1 2
2011-06-01 00:00:00.000 0 8 3 0
2011-07-01 00:00:00.000 0 0 1 0
2011-08-01 00:00:00.000 0 6 0 0
2011-09-01 00:00:00.000 0 13 6 7
2011-10-01 00:00:00.000 0 1054 18 1
2011-11-01 00:00:00.000 0 0 2 0
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'FileTypes'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Files'. Scan count 14, logical reads 100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Groups'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Import'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SourceFiles'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#4CF5691D'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 62 ms, elapsed time = 52 ms.
I'm sure we can squeeze more out it, shouldn't really be taking this long with the amount of data involved.
CREATE FUNCTION dbo.DelimitedSplit8K--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 0 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT 0 UNION ALL
SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT t.N+1
FROM cteTally t
WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),
Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000))
FROM cteStart s
;
DECLARE @ImportType INT = 7001, @ImportID CHAR(2) = '03',
@ImpDateFrom DATETIME = '2011-12-01', @ImpDateTo DATETIME = '2011-12-31',
@DateFrom DATETIME = '2010-01-01', @DateTo DATETIME = '2011-12-31',
@FileType VARCHAR(25) = '.XLS,.DOT,.DOC,.PPT,.MBD';
--SET STATISTICS IO ON;
--SET STATISTICS TIME ON;
WITH CTE(n) AS(SELECT 1 UNION ALL SELECT 1),
CTE2(n) AS (SELECT 1 FROM CTE x, CTE y),
CTE3(n) AS (SELECT 1 FROM CTE2 x, CTE2 y),
CTE4(n) AS (SELECT 1 FROM CTE3 x, CTE3 y),
CTE5(n) AS (SELECT 0 UNION ALL
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM CTE4 x, CTE4 y)
SELECT
DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(MONTH,a.n,'1900-01-01')), 0) AS modifiedMonth,
SUM(IsAccess) AS AccessCount,
SUM(IsExcel) AS ExcelCount,
SUM(IsWord) AS WordCount,
SUM(IsPowerPoint) AS PowerPointCount
FROM CTE5 a
LEFT OUTER JOIN (SELECT
f.FileName,
f.LastModified,
DATEADD(MONTH, DATEDIFF(MONTH, 0, f.LastModified), 0) AS monthModified,
f.FileType,
sf.ImportType,
sf.ImportID,
ft.Application,
CASE WHEN ft.Application = 'Access' THEN 1 ELSE 0 END AS IsAccess,
CASE WHEN ft.Application = 'Excel' THEN 1 ELSE 0 END AS IsExcel,
CASE WHEN ft.Application = 'Word' THEN 1 ELSE 0 END AS IsWord,
CASE WHEN ft.Application = 'PowerPoint' THEN 1 ELSE 0 END AS IsPowerPoint
FROM dbo.Files f
INNER JOIN dbo.Groups g ON f.GroupID = g.GroupID
INNER JOIN dbo.SourceFiles sf ON g.BatchID = sf.BatchID
INNER JOIN dbo.Import i ON sf.ImportID = i.ImportID
INNER JOIN dbo.FileTypes ft ON f.FileType = ft.FileExtension
CROSS APPLY dbo.DelimitedSplit8k(@FileType,',') split
WHERE ft.FileExtension = split.Item AND
f.LastModified > dateadd(day, - 1, CONVERT(DATETIME, @DateFrom, 102)) AND
f.LastModified < dateadd(day, 1, CONVERT(DATETIME, @DateTo, 102)) AND
sf.ImportDate > dateadd(day, - 1, CONVERT(DATETIME, @ImpDateFrom, 102)) AND
sf.ImportDate < dateadd(day, 1, CONVERT(DATETIME, @ImpDateTo, 102)) AND
CHARINDEX(dbo.PADL(sf.ImportId, 2, '0'), @ImportID) > 0 AND
sf.ImportType = @ImportType) b ON DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(M,a.n,'1900-01-01')), 0) = b.monthModified
WHERE b.FileName IS NOT NULL
GROUP BY DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(M,a.n,'1900-01-01')), 0)
ORDER BY DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(M,a.n,'1900-01-01')), 0);
--SET STATISTICS TIME OFF;
--SET STATISTICS IO OFF;
modifiedMonth AccessCount ExcelCount WordCount PowerPointCount
----------------------- ----------- ----------- ----------- ---------------
2010-01-01 00:00:00.000 0 27 1 0
2010-02-01 00:00:00.000 0 16 5 0
2010-03-01 00:00:00.000 0 86 1 0
2010-04-01 00:00:00.000 0 6 12 14
2010-05-01 00:00:00.000 0 1 0 0
2010-06-01 00:00:00.000 0 0 0 2
2010-07-01 00:00:00.000 0 1 0 0
2010-09-01 00:00:00.000 0 701 0 0
2010-10-01 00:00:00.000 0 34 0 0
2010-11-01 00:00:00.000 0 14 10 3
2011-03-01 00:00:00.000 0 9 706 704
2011-05-01 00:00:00.000 0 21 1 2
2011-06-01 00:00:00.000 0 8 3 0
2011-07-01 00:00:00.000 0 0 1 0
2011-08-01 00:00:00.000 0 6 0 0
2011-09-01 00:00:00.000 0 13 6 7
2011-10-01 00:00:00.000 0 1054 18 1
2011-11-01 00:00:00.000 0 0 2 0
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'FileTypes'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Files'. Scan count 14, logical reads 100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Import'. Scan count 0, logical reads 28, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SourceFiles'. Scan count 0, logical reads 28, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Groups'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 47 ms, elapsed time = 46 ms.
May 5, 2012 at 4:57 am
Made a couple of errors in my previous posts.
DECLARE @ImportType INT = 7001, @ImportID CHAR(2) = '03',
@ImpDateFrom DATETIME = '2011-12-01', @ImpDateTo DATETIME = '2011-12-31',
@DateFrom DATETIME = '2010-01-01', @DateTo DATETIME = '2011-12-31',
@FileType VARCHAR(25) = '.XLS,.DOT,.DOC,.PPT,.MBD';
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
WITH CTE(n) AS(SELECT 1 UNION ALL SELECT 1),
CTE2(n) AS (SELECT 1 FROM CTE x, CTE y),
CTE3(n) AS (SELECT 1 FROM CTE2 x, CTE2 y),
CTE4(n) AS (SELECT 1 FROM CTE3 x, CTE3 y),
CTE5(n) AS (SELECT 0 UNION ALL
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM CTE4 x, CTE4 y)
SELECT
DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(MONTH,a.n,'1900-01-01')), 0) AS modifiedMonth,
SUM(ISNULL(IsAccess,0)) AS AccessCount,
SUM(ISNULL(IsExcel,0)) AS ExcelCount,
SUM(ISNULL(IsWord,0)) AS WordCount,
SUM(ISNULL(IsPowerPoint,0)) AS PowerPointCount
FROM CTE5 a
LEFT OUTER JOIN (SELECT
f.FileName,
f.LastModified,
DATEADD(MONTH, DATEDIFF(MONTH, 0, f.LastModified), 0) AS monthModified,
f.FileType,
sf.ImportType,
sf.ImportID,
ft.Application,
CASE WHEN ft.Application = 'Access' THEN 1 ELSE 0 END AS IsAccess,
CASE WHEN ft.Application = 'Excel' THEN 1 ELSE 0 END AS IsExcel,
CASE WHEN ft.Application = 'Word' THEN 1 ELSE 0 END AS IsWord,
CASE WHEN ft.Application = 'PowerPoint' THEN 1 ELSE 0 END AS IsPowerPoint
FROM dbo.Files f
INNER JOIN dbo.Groups g ON f.GroupID = g.GroupID
INNER JOIN dbo.SourceFiles sf ON g.BatchID = sf.BatchID
INNER JOIN dbo.Import i ON sf.ImportID = i.ImportID
INNER JOIN dbo.FileTypes ft ON f.FileType = ft.FileExtension
WHERE f.FileType IN (SELECT * FROM dbo.fn_Split(@FileType, ',')) AND
f.LastModified > dateadd(day, - 1, CONVERT(DATETIME, @DateFrom, 102)) AND
f.LastModified < dateadd(day, 1, CONVERT(DATETIME, @DateTo, 102)) AND
sf.ImportDate > dateadd(day, - 1, CONVERT(DATETIME, @ImpDateFrom, 102)) AND
sf.ImportDate < dateadd(day, 1, CONVERT(DATETIME, @ImpDateTo, 102)) AND
CHARINDEX(dbo.PADL(sf.ImportId, 2, '0'), @ImportID) > 0 AND
sf.ImportType = @ImportType) b ON DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(M,a.n,'1900-01-01')), 0) = b.monthModified
WHERE DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(MONTH,a.n,'1900-01-01')), 0) >= @DateFrom
AND DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(MONTH,a.n,'1900-01-01')), 0) <= @DateTo
GROUP BY DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(M,a.n,'1900-01-01')), 0)
ORDER BY DATEADD(MONTH, DATEDIFF(MONTH, '1900-01-01', DATEADD(M,a.n,'1900-01-01')), 0);
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
The above produces: -
modifiedMonth AccessCount ExcelCount WordCount PowerPointCount
----------------------- ----------- ----------- ----------- ---------------
2010-01-01 00:00:00.000 0 27 1 0
2010-02-01 00:00:00.000 0 16 5 0
2010-03-01 00:00:00.000 0 86 1 0
2010-04-01 00:00:00.000 0 6 12 14
2010-05-01 00:00:00.000 0 1 0 0
2010-06-01 00:00:00.000 0 0 0 2
2010-07-01 00:00:00.000 0 1 0 0
2010-08-01 00:00:00.000 0 0 0 0
2010-09-01 00:00:00.000 0 701 0 0
2010-10-01 00:00:00.000 0 34 0 0
2010-11-01 00:00:00.000 0 14 10 3
2010-12-01 00:00:00.000 0 0 0 0
2011-01-01 00:00:00.000 0 0 0 0
2011-02-01 00:00:00.000 0 0 0 0
2011-03-01 00:00:00.000 0 9 706 704
2011-04-01 00:00:00.000 0 0 0 0
2011-05-01 00:00:00.000 0 21 1 2
2011-06-01 00:00:00.000 0 8 3 0
2011-07-01 00:00:00.000 0 0 1 0
2011-08-01 00:00:00.000 0 6 0 0
2011-09-01 00:00:00.000 0 13 6 7
2011-10-01 00:00:00.000 0 1054 18 1
2011-11-01 00:00:00.000 0 0 2 0
2011-12-01 00:00:00.000 0 0 0 0
(24 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'FileTypes'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Files'. Scan count 14, logical reads 100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Groups'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Import'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SourceFiles'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#4CF5691D'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 78 ms, elapsed time = 69 ms.
May 7, 2012 at 9:31 am
Hi Cadavre,
Hey, this looks pretty good! Runs well too! Now I want to start refining it - so before I start screwing up the whole thing, let me describe the first situation I want to deal with...
When the user inputs his date range (Imported to/from, modified to/from) this is usually a very wide range and thus, plays no real part - except (!) when what they are trying to do (as sometimes happens) is get a "month view" so in those cases, those dates actually do some limiting.
But in most cases what we want to do is go to the oldest Files.LastModified date on file, and start from there, and run to "today". So in your example code, where you place '01-01-1900' I am wondering if I can use the oldest LastModified date AS THE START and today (whatever today is) as the end.
I am going to play with this a little bit today, but not sure how I get the oldest LastModified date on file BEFORE I start crunching - or indeed, if I can get it as part of the query as it is.
Any thoughts on that?
Viewing 3 posts - 76 through 77 (of 77 total)
You must be logged in to reply to this topic. Login to reply