November 17, 2010 at 6:55 am
This is the Query I started out with. It ran overnight and then I ran out of space.
-- as_offer_dates has a column with all the days of the year from 2005 to 2020. This was used to check for the missing days in "as_all_division_coupons_sample
Code:
Select 'Redemption' As Redemption, ar3.*, 'Offers->' As Offers, ao.*
Into dbo.as_Redemption_Offers_DayGap_Sample
FROM dbo.as_offer_dates as ao
Inner Join dbo.as_all_division_coupons_sample AS ar1
ON ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND
ao.DivNum = ar1.Kma_div_nbr AND
ao.cpnnum = ar1.coupon_nbr
Inner Join dbo.as_all_division_coupons_sample AS ar2
ON ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND
ao.DivNum = ar2.Kma_div_nbr AND
ao.cpnnum = ar2.coupon_nbr
Left Join dbo.as_all_division_coupons_sample As ar3
ON ao.[date] = ar3.SCAN_DATE_CONVERT AND
ao.DivNum = ar3.Kma_div_nbr AND
ao.cpnnum = ar3.coupon_nbr
WHERE ar3.STORE IS NULL
This code above ran overnight and I had to modify my code to what I have below, which runs in 4 hours. I broke down my main table, by using the store column to filter by that and looping to find the missing days by store and then appending them to the results table.
Declare @StoreNum int
Declare @StoreCnt int
Declare @Recid int
Declare @RecCount int
Set @Recid = (Select MAX(RECNO) as recno from dbo.as_AllStores )
-- Get the Store number to use for subquery below.
Set @StoreNum = (Select Store From dbo.as_AllStores Where RECNO = @Recid)
IF OBJECT_ID(N'dbo.as_Redemption_Offers_DayGap', N'U') IS NOT NULL
DROP TABLE as_Redemption_Offers_DayGap ;
IF OBJECT_ID(N'tempdb..dbo.##as_OneStore', N'U') IS NOT NULL
DROP TABLE dbo.##as_OneStore ;
--Select *
--Into dbo.##as_OneStore
--From dbo.as_all_division_coupons_count
--Where Store = @StoreNum
Select 'Redemption' As Redemption, ar3.*, 'Offers->' As Offers, ao.*
Into dbo.as_Redemption_Offers_DayGap
FROM dbo.as_offer_dates as ao
Inner Join (Select *
From dbo.as_all_division_coupons_count2
Where Store = @StoreNum) AS ar1
--Inner Join dbo.##as_OneStore as ar1
ON ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND
ao.DivNum = ar1.Kma_div_nbr AND
ao.cpnnum = ar1.coupon_nbr
Inner Join (Select *
From dbo.as_all_division_coupons_count2
Where Store = @StoreNum) AS ar2
--Inner Join dbo.##as_OneStore as ar2
ON ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND
ao.DivNum = ar2.Kma_div_nbr AND
ao.cpnnum = ar2.coupon_nbr
Left Join (Select *
From dbo.as_all_division_coupons_count2
Where Store = @StoreNum) As ar3
--Left Join dbo.##as_OneStore as ar3
ON ao.[date] = ar3.SCAN_DATE_CONVERT AND
ao.DivNum = ar3.Kma_div_nbr AND
ao.cpnnum = ar3.coupon_nbr
WHERE ar3.STORE IS NULL
Set @Recid = @Recid - 1
--Drop Table dbo.##as_OneStore
While @Recid > 0
begin
-- Get the next store number to query in the Join below.
Set @StoreNum = (Select Store From dbo.as_AllStores Where RECNO = @Recid)
-- Put into a temp table to reuse in bottom query
Select *
Into dbo.##as_OneStore
From dbo.as_all_division_coupons_count2
Where Store = @StoreNum
Insert dbo.as_Redemption_Offers_DayGap
Select 'Redemption', ar3.*, 'Offers->', ao.*
FROM dbo.as_offer_dates as ao
Inner Join dbo.##as_OneStore As ar1
ON ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND
ao.DivNum = ar1.Kma_div_nbr AND
ao.cpnnum = ar1.coupon_nbr
Inner Join dbo.##as_OneStore As ar2
ON ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND
ao.DivNum = ar2.Kma_div_nbr AND
ao.cpnnum = ar2.coupon_nbr
Left Join dbo.##as_OneStore As ar3
ON ao.[date] = ar3.SCAN_DATE_CONVERT AND
ao.DivNum = ar3.Kma_div_nbr AND
ao.cpnnum = ar3.coupon_nbr
WHERE ar3.STORE IS NULL
Set @Recid = @Recid - 1
Drop Table dbo.##as_OneStore
End
I am curious to see what other methods I can use and you two have given me a different way to look at this. I wanted to make sure I was using 1 of the best methods to accomplish this and to see if there was someone else that has done this before and how. Any adjustments/improvements to what I have above is appreciated.
November 17, 2010 at 7:15 am
First, anytime you have a Tally Table involved, the "% of Batch" numbers in both the estimated and actual execution plans are going to lie like a rug. In Grant's APress book on execution plans, he cites an example I gave him where the "% of Batch" for a two batch run came in at 0% and 100%. The exact reverse was true. Here's the example...
/****************************************************************************************
Purpose:
This code demonstrates that the estimated and actual execution plans in SQL Server can
be 100% INCORRECT and that the execution plan should only be relied on to provide hints
as to what may be wrong with a query rather than an absolute indication. This code runs
in SQL Server 2005 only.
The code creates a temp table for 10 years worth of dates starting with 2000-01-01 using
two different methods. The first method uses a recursive CTE and the second method uses
a "Tally" table. The output of each method is directed to a "throw-away" variable to
take display delays out of the picture.
Please check both the actual and estimated execution plans and compare the % of batch.
Please see the following article on how to build a Tally table and how they can be used
to replace certain While Loops.
http://www.sqlservercentral.com/articles/T-SQL/62867/
****************************************************************************************/
SET NOCOUNT ON
--=======================================================================================
-- Recursive method shown by (Name with-held)
--=======================================================================================
PRINT '========== Recursive method =========='
--===== Turn on some performance counters ===============================================
SET STATISTICS IO ON
SET STATISTICS TIME ON
DECLARE @Bitbucket DATETIME --Holds display output so display times aren't measured.
--===== Execute the code being tested ===================================================
DECLARE @DateVal DATETIME
SET @DateVal = '2000-01-01'
;with mycte as
(
select @DateVal AS DateVal
union all
select DateVal + 1
from mycte
where DateVal + 1 < DATEADD(yy, 10, @DateVal)
)
select @Bitbucket = d.dateval
from mycte d
OPTION (MAXRECURSION 0)
--===== Turn off the performance counters and print a separator =========================
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',90)
GO
--=======================================================================================
-- Tally table method by Jeff Moden
--=======================================================================================
PRINT '========== Tally table method =========='
--===== Turn on some performance counters ===============================================
SET STATISTICS IO ON
SET STATISTICS TIME ON
DECLARE @Bitbucket DATETIME --Holds display output so display times aren't measured.
--===== Execute the code being tested ===================================================
DECLARE @StartDate AS DATETIME
SET @StartDate = '2000-01-01'
SELECT TOP (DATEDIFF(dd,@StartDate,DATEADD(yy,10,@StartDate)))
@Bitbucket = @StartDate-1+t.N
FROM dbo.Tally t
ORDER BY N
--===== Turn off the performance counters and print a separator =========================
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',90)
GO
Second, the example that Wayne S is using for this problem is for a single range of ID's in a table with possibly huge gaps (think a 100 million row gap). I guarantee that the method Wayne S is using will smoke any Tally Table method for such a thing.
Dates are a different story especially in a case like this where the actual number of dates per account are really quite small and won't ever get really big. In this case, the simplicity of outer joining to a virtual date table created by the Tally Table will normally outstripe the complex calculations of the other method. I say "normally" because someone like Paul White and his excellent handle on the internals of SQL Server could come along and smoke the Tally Table method.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 17, 2010 at 7:26 am
GrassHopper: What version of SQL are we dealing with?
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
November 17, 2010 at 4:13 pm
I'm using SQL 2008.
November 17, 2010 at 4:56 pm
Grasshopper, just FYI, this does a LOT more than gapcheck. Also this doesn't seem that horribly painful, cursor optimization techniques shouldn't be necessary here.
Any chance we can see the DDL and indexes on as_all_division_coupons_sample and as_offer_dates? Your problem here looks to be index control more than query effectiveness. I'd love to see the estimated .SQLPLAN for the first query, too (at 24+ hours obviously don't try to get the actual).
I've reformatted your original query for readability here:
Select
'Redemption' As Redemption,
ar3.*,
'Offers->' As Offers,
ao.*
Into
dbo.as_Redemption_Offers_DayGap_Sample
FROM
dbo.as_offer_dates as ao
Inner Join
dbo.as_all_division_coupons_sample AS ar1
ON
ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND
ao.DivNum = ar1.Kma_div_nbr AND
ao.cpnnum = ar1.coupon_nbr
Inner Join
dbo.as_all_division_coupons_sample AS ar2
ON
ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND
ao.DivNum = ar2.Kma_div_nbr AND
ao.cpnnum = ar2.coupon_nbr
Left Join
dbo.as_all_division_coupons_sample As ar3
ON
ao.[date] = ar3.SCAN_DATE_CONVERT AND
ao.DivNum = ar3.Kma_div_nbr AND
ao.cpnnum = ar3.coupon_nbr
WHERE
ar3.STORE IS NULL
This query isn't horribly painful, however there are a few oddities in it I'd like to discuss.
For starters, ar3 is purposely locked into a NULL return set. This makes sense when looking for missing dates. However, your SELECT list returns ar3.*. Why is that? All it can return here are nulls. Also you want to make sure that as_all_division_coupons_sample.STORE is set to NOT NULL, or you could get false positives and leave out non-excepted items.
How many rows do you figure this will return? This looks like it's going to end up pretty wide. Is the database pre-growthed to be able to accept a few million rows inserted in as_redemption_offers_daygap_sample? Is the LogFile as well?
ao.date -1 and +1 I believe are non-SARGable. I have to go find one of Paul White's posts to confirm that in 2k5/8, as some of the rules changed. This could also be causing pain.
Please definately post the Table schema and index definitions. You've already got what we concluded to be the fastest method (note, the Tally table is a shorthand equivalent to your as_offer_dates table), so we need to approach this from the other side, getting the indexes, structure, and joins cleaned up.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
November 18, 2010 at 1:21 pm
I did not have an index. I forgot to add it when i created that table. This is the index This is my index the execution plan said I should have and I have now added it and I'm running the query again. The table name is different because the other one was the same table but with a sample. I ran it before with the actual table which is the one I created the index on "as_all_division_coupons_count2".
CREATE NONCLUSTERED INDEX [IDX_cnt] ON [dbo].[as_all_division_coupons_count2]
(
[Cnt] ASC
)
INCLUDE ( [Store],
[KMA_DIV_NBR],
[COUPON_NBR],
[TRIG_UPC],
[VNDR_RPT_CODE],
[scan_date_convert]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
November 18, 2010 at 3:20 pm
1 hour 43 mins running time.
November 18, 2010 at 7:36 pm
GrassHopper (11/18/2010)
1 hour 43 mins running time.
For which method???
--Jeff Moden
Change is inevitable... Change for the better is not.
November 18, 2010 at 10:18 pm
1 hour 43 mins for this to run:
Declare @StoreNum int
Declare @StoreCnt int
Declare @Recid int
Declare @RecCount int
Set @Recid = (Select MAX(RECNO) as recno from dbo.as_AllStores )
-- Get the Store number to use for subquery below.
Set @StoreNum = (Select Store From dbo.as_AllStores Where RECNO = @Recid)
IF OBJECT_ID(N'dbo.as_Redemption_Offers_DayGap', N'U') IS NOT NULL
DROP TABLE as_Redemption_Offers_DayGap ;
IF OBJECT_ID(N'tempdb..dbo.##as_OneStore', N'U') IS NOT NULL
DROP TABLE dbo.##as_OneStore ;
--Select *
--Into dbo.##as_OneStore
--From dbo.as_all_division_coupons_count
--Where Store = @StoreNum
Select 'Redemption' As Redemption, ar3.*, 'Offers->' As Offers, ao.*
Into dbo.as_Redemption_Offers_DayGap
FROM dbo.as_offer_dates as ao
Inner Join (Select *
From dbo.as_all_division_coupons_count2
Where Store = @StoreNum) AS ar1
-- Inner Join dbo.##as_OneStore as ar1
ON ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND
ao.DivNum = ar1.Kma_div_nbr AND
ao.cpnnum = ar1.coupon_nbr
Inner Join (Select *
From dbo.as_all_division_coupons_count2
Where Store = @StoreNum) AS ar2
-- Inner Join dbo.##as_OneStore as ar2
ON ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND
ao.DivNum = ar2.Kma_div_nbr AND
ao.cpnnum = ar2.coupon_nbr
Left Join (Select *
From dbo.as_all_division_coupons_count2
Where Store = @StoreNum) As ar3
-- Left Join dbo.##as_OneStore as ar3
ON ao.[date] = ar3.SCAN_DATE_CONVERT AND
ao.DivNum = ar3.Kma_div_nbr AND
ao.cpnnum = ar3.coupon_nbr
WHERE ar3.STORE IS NULL
Set @Recid = @Recid - 1
--Drop Table dbo.##as_OneStore
While @Recid > 0
begin
-- Get the next store number to query in the Join below.
Set @StoreNum = (Select Store From dbo.as_AllStores Where RECNO = @Recid)
-- Put into a temp table to reuse in bottom query
Select *
Into dbo.##as_OneStore
From dbo.as_all_division_coupons_count2
Where Store = @StoreNum
Insert dbo.as_Redemption_Offers_DayGap
Select 'Redemption', ar3.*, 'Offers->', ao.*
FROM dbo.as_offer_dates as ao
Inner Join dbo.##as_OneStore As ar1
ON ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND
ao.DivNum = ar1.Kma_div_nbr AND
ao.cpnnum = ar1.coupon_nbr
Inner Join dbo.##as_OneStore As ar2
ON ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND
ao.DivNum = ar2.Kma_div_nbr AND
ao.cpnnum = ar2.coupon_nbr
Left Join dbo.##as_OneStore As ar3
ON ao.[date] = ar3.SCAN_DATE_CONVERT AND
ao.DivNum = ar3.Kma_div_nbr AND
ao.cpnnum = ar3.coupon_nbr
WHERE ar3.STORE IS NULL
Set @Recid = @Recid - 1
Drop Table dbo.##as_OneStore
End
Viewing 9 posts - 16 through 23 (of 23 total)
You must be logged in to reply to this topic. Login to reply