Gap in Dates

  • This is the Query I started out with. It ran overnight and then I ran out of space.

    -- as_offer_dates has a column with all the days of the year from 2005 to 2020. This was used to check for the missing days in "as_all_division_coupons_sample

    Code:

    Select 'Redemption' As Redemption, ar3.*, 'Offers->' As Offers, ao.*

    Into dbo.as_Redemption_Offers_DayGap_Sample

    FROM dbo.as_offer_dates as ao

    Inner Join dbo.as_all_division_coupons_sample AS ar1

    ON ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND

    ao.DivNum = ar1.Kma_div_nbr AND

    ao.cpnnum = ar1.coupon_nbr

    Inner Join dbo.as_all_division_coupons_sample AS ar2

    ON ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND

    ao.DivNum = ar2.Kma_div_nbr AND

    ao.cpnnum = ar2.coupon_nbr

    Left Join dbo.as_all_division_coupons_sample As ar3

    ON ao.[date] = ar3.SCAN_DATE_CONVERT AND

    ao.DivNum = ar3.Kma_div_nbr AND

    ao.cpnnum = ar3.coupon_nbr

    WHERE ar3.STORE IS NULL

    This code above ran overnight and I had to modify my code to what I have below, which runs in 4 hours. I broke down my main table, by using the store column to filter by that and looping to find the missing days by store and then appending them to the results table.

    Declare @StoreNum int

    Declare @StoreCnt int

    Declare @Recid int

    Declare @RecCount int

    Set @Recid = (Select MAX(RECNO) as recno from dbo.as_AllStores )

    -- Get the Store number to use for subquery below.

    Set @StoreNum = (Select Store From dbo.as_AllStores Where RECNO = @Recid)

    IF OBJECT_ID(N'dbo.as_Redemption_Offers_DayGap', N'U') IS NOT NULL

    DROP TABLE as_Redemption_Offers_DayGap ;

    IF OBJECT_ID(N'tempdb..dbo.##as_OneStore', N'U') IS NOT NULL

    DROP TABLE dbo.##as_OneStore ;

    --Select *

    --Into dbo.##as_OneStore

    --From dbo.as_all_division_coupons_count

    --Where Store = @StoreNum

    Select 'Redemption' As Redemption, ar3.*, 'Offers->' As Offers, ao.*

    Into dbo.as_Redemption_Offers_DayGap

    FROM dbo.as_offer_dates as ao

    Inner Join (Select *

    From dbo.as_all_division_coupons_count2

    Where Store = @StoreNum) AS ar1

    --Inner Join dbo.##as_OneStore as ar1

    ON ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND

    ao.DivNum = ar1.Kma_div_nbr AND

    ao.cpnnum = ar1.coupon_nbr

    Inner Join (Select *

    From dbo.as_all_division_coupons_count2

    Where Store = @StoreNum) AS ar2

    --Inner Join dbo.##as_OneStore as ar2

    ON ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND

    ao.DivNum = ar2.Kma_div_nbr AND

    ao.cpnnum = ar2.coupon_nbr

    Left Join (Select *

    From dbo.as_all_division_coupons_count2

    Where Store = @StoreNum) As ar3

    --Left Join dbo.##as_OneStore as ar3

    ON ao.[date] = ar3.SCAN_DATE_CONVERT AND

    ao.DivNum = ar3.Kma_div_nbr AND

    ao.cpnnum = ar3.coupon_nbr

    WHERE ar3.STORE IS NULL

    Set @Recid = @Recid - 1

    --Drop Table dbo.##as_OneStore

    While @Recid > 0

    begin

    -- Get the next store number to query in the Join below.

    Set @StoreNum = (Select Store From dbo.as_AllStores Where RECNO = @Recid)

    -- Put into a temp table to reuse in bottom query

    Select *

    Into dbo.##as_OneStore

    From dbo.as_all_division_coupons_count2

    Where Store = @StoreNum

    Insert dbo.as_Redemption_Offers_DayGap

    Select 'Redemption', ar3.*, 'Offers->', ao.*

    FROM dbo.as_offer_dates as ao

    Inner Join dbo.##as_OneStore As ar1

    ON ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND

    ao.DivNum = ar1.Kma_div_nbr AND

    ao.cpnnum = ar1.coupon_nbr

    Inner Join dbo.##as_OneStore As ar2

    ON ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND

    ao.DivNum = ar2.Kma_div_nbr AND

    ao.cpnnum = ar2.coupon_nbr

    Left Join dbo.##as_OneStore As ar3

    ON ao.[date] = ar3.SCAN_DATE_CONVERT AND

    ao.DivNum = ar3.Kma_div_nbr AND

    ao.cpnnum = ar3.coupon_nbr

    WHERE ar3.STORE IS NULL

    Set @Recid = @Recid - 1

    Drop Table dbo.##as_OneStore

    End

    I am curious to see what other methods I can use and you two have given me a different way to look at this. I wanted to make sure I was using 1 of the best methods to accomplish this and to see if there was someone else that has done this before and how. Any adjustments/improvements to what I have above is appreciated.

  • First, anytime you have a Tally Table involved, the "% of Batch" numbers in both the estimated and actual execution plans are going to lie like a rug. In Grant's APress book on execution plans, he cites an example I gave him where the "% of Batch" for a two batch run came in at 0% and 100%. The exact reverse was true. Here's the example...

    /****************************************************************************************

    Purpose:

    This code demonstrates that the estimated and actual execution plans in SQL Server can

    be 100% INCORRECT and that the execution plan should only be relied on to provide hints

    as to what may be wrong with a query rather than an absolute indication. This code runs

    in SQL Server 2005 only.

    The code creates a temp table for 10 years worth of dates starting with 2000-01-01 using

    two different methods. The first method uses a recursive CTE and the second method uses

    a "Tally" table. The output of each method is directed to a "throw-away" variable to

    take display delays out of the picture.

    Please check both the actual and estimated execution plans and compare the % of batch.

    Please see the following article on how to build a Tally table and how they can be used

    to replace certain While Loops.

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    ****************************************************************************************/

    SET NOCOUNT ON

    --=======================================================================================

    -- Recursive method shown by (Name with-held)

    --=======================================================================================

    PRINT '========== Recursive method =========='

    --===== Turn on some performance counters ===============================================

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    DECLARE @Bitbucket DATETIME --Holds display output so display times aren't measured.

    --===== Execute the code being tested ===================================================

    DECLARE @DateVal DATETIME

    SET @DateVal = '2000-01-01'

    ;with mycte as

    (

    select @DateVal AS DateVal

    union all

    select DateVal + 1

    from mycte

    where DateVal + 1 < DATEADD(yy, 10, @DateVal)

    )

    select @Bitbucket = d.dateval

    from mycte d

    OPTION (MAXRECURSION 0)

    --===== Turn off the performance counters and print a separator =========================

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT REPLICATE('=',90)

    GO

    --=======================================================================================

    -- Tally table method by Jeff Moden

    --=======================================================================================

    PRINT '========== Tally table method =========='

    --===== Turn on some performance counters ===============================================

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    DECLARE @Bitbucket DATETIME --Holds display output so display times aren't measured.

    --===== Execute the code being tested ===================================================

    DECLARE @StartDate AS DATETIME

    SET @StartDate = '2000-01-01'

    SELECT TOP (DATEDIFF(dd,@StartDate,DATEADD(yy,10,@StartDate)))

    @Bitbucket = @StartDate-1+t.N

    FROM dbo.Tally t

    ORDER BY N

    --===== Turn off the performance counters and print a separator =========================

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT REPLICATE('=',90)

    GO

    Second, the example that Wayne S is using for this problem is for a single range of ID's in a table with possibly huge gaps (think a 100 million row gap). I guarantee that the method Wayne S is using will smoke any Tally Table method for such a thing.

    Dates are a different story especially in a case like this where the actual number of dates per account are really quite small and won't ever get really big. In this case, the simplicity of outer joining to a virtual date table created by the Tally Table will normally outstripe the complex calculations of the other method. I say "normally" because someone like Paul White and his excellent handle on the internals of SQL Server could come along and smoke the Tally Table method.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • GrassHopper: What version of SQL are we dealing with?

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • I'm using SQL 2008.

  • Grasshopper, just FYI, this does a LOT more than gapcheck. Also this doesn't seem that horribly painful, cursor optimization techniques shouldn't be necessary here.

    Any chance we can see the DDL and indexes on as_all_division_coupons_sample and as_offer_dates? Your problem here looks to be index control more than query effectiveness. I'd love to see the estimated .SQLPLAN for the first query, too (at 24+ hours obviously don't try to get the actual).

    I've reformatted your original query for readability here:

    Select

    'Redemption' As Redemption,

    ar3.*,

    'Offers->' As Offers,

    ao.*

    Into

    dbo.as_Redemption_Offers_DayGap_Sample

    FROM

    dbo.as_offer_dates as ao

    Inner Join

    dbo.as_all_division_coupons_sample AS ar1

    ON

    ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND

    ao.DivNum = ar1.Kma_div_nbr AND

    ao.cpnnum = ar1.coupon_nbr

    Inner Join

    dbo.as_all_division_coupons_sample AS ar2

    ON

    ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND

    ao.DivNum = ar2.Kma_div_nbr AND

    ao.cpnnum = ar2.coupon_nbr

    Left Join

    dbo.as_all_division_coupons_sample As ar3

    ON

    ao.[date] = ar3.SCAN_DATE_CONVERT AND

    ao.DivNum = ar3.Kma_div_nbr AND

    ao.cpnnum = ar3.coupon_nbr

    WHERE

    ar3.STORE IS NULL

    This query isn't horribly painful, however there are a few oddities in it I'd like to discuss.

    For starters, ar3 is purposely locked into a NULL return set. This makes sense when looking for missing dates. However, your SELECT list returns ar3.*. Why is that? All it can return here are nulls. Also you want to make sure that as_all_division_coupons_sample.STORE is set to NOT NULL, or you could get false positives and leave out non-excepted items.

    How many rows do you figure this will return? This looks like it's going to end up pretty wide. Is the database pre-growthed to be able to accept a few million rows inserted in as_redemption_offers_daygap_sample? Is the LogFile as well?

    ao.date -1 and +1 I believe are non-SARGable. I have to go find one of Paul White's posts to confirm that in 2k5/8, as some of the rules changed. This could also be causing pain.

    Please definately post the Table schema and index definitions. You've already got what we concluded to be the fastest method (note, the Tally table is a shorthand equivalent to your as_offer_dates table), so we need to approach this from the other side, getting the indexes, structure, and joins cleaned up.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I did not have an index. I forgot to add it when i created that table. This is the index This is my index the execution plan said I should have and I have now added it and I'm running the query again. The table name is different because the other one was the same table but with a sample. I ran it before with the actual table which is the one I created the index on "as_all_division_coupons_count2".

    CREATE NONCLUSTERED INDEX [IDX_cnt] ON [dbo].[as_all_division_coupons_count2]

    (

    [Cnt] ASC

    )

    INCLUDE ( [Store],

    [KMA_DIV_NBR],

    [COUPON_NBR],

    [TRIG_UPC],

    [VNDR_RPT_CODE],

    [scan_date_convert]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

  • 1 hour 43 mins running time.

  • GrassHopper (11/18/2010)


    1 hour 43 mins running time.

    For which method???

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • 1 hour 43 mins for this to run:

    Declare @StoreNum int

    Declare @StoreCnt int

    Declare @Recid int

    Declare @RecCount int

    Set @Recid = (Select MAX(RECNO) as recno from dbo.as_AllStores )

    -- Get the Store number to use for subquery below.

    Set @StoreNum = (Select Store From dbo.as_AllStores Where RECNO = @Recid)

    IF OBJECT_ID(N'dbo.as_Redemption_Offers_DayGap', N'U') IS NOT NULL

    DROP TABLE as_Redemption_Offers_DayGap ;

    IF OBJECT_ID(N'tempdb..dbo.##as_OneStore', N'U') IS NOT NULL

    DROP TABLE dbo.##as_OneStore ;

    --Select *

    --Into dbo.##as_OneStore

    --From dbo.as_all_division_coupons_count

    --Where Store = @StoreNum

    Select 'Redemption' As Redemption, ar3.*, 'Offers->' As Offers, ao.*

    Into dbo.as_Redemption_Offers_DayGap

    FROM dbo.as_offer_dates as ao

    Inner Join (Select *

    From dbo.as_all_division_coupons_count2

    Where Store = @StoreNum) AS ar1

    -- Inner Join dbo.##as_OneStore as ar1

    ON ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND

    ao.DivNum = ar1.Kma_div_nbr AND

    ao.cpnnum = ar1.coupon_nbr

    Inner Join (Select *

    From dbo.as_all_division_coupons_count2

    Where Store = @StoreNum) AS ar2

    -- Inner Join dbo.##as_OneStore as ar2

    ON ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND

    ao.DivNum = ar2.Kma_div_nbr AND

    ao.cpnnum = ar2.coupon_nbr

    Left Join (Select *

    From dbo.as_all_division_coupons_count2

    Where Store = @StoreNum) As ar3

    -- Left Join dbo.##as_OneStore as ar3

    ON ao.[date] = ar3.SCAN_DATE_CONVERT AND

    ao.DivNum = ar3.Kma_div_nbr AND

    ao.cpnnum = ar3.coupon_nbr

    WHERE ar3.STORE IS NULL

    Set @Recid = @Recid - 1

    --Drop Table dbo.##as_OneStore

    While @Recid > 0

    begin

    -- Get the next store number to query in the Join below.

    Set @StoreNum = (Select Store From dbo.as_AllStores Where RECNO = @Recid)

    -- Put into a temp table to reuse in bottom query

    Select *

    Into dbo.##as_OneStore

    From dbo.as_all_division_coupons_count2

    Where Store = @StoreNum

    Insert dbo.as_Redemption_Offers_DayGap

    Select 'Redemption', ar3.*, 'Offers->', ao.*

    FROM dbo.as_offer_dates as ao

    Inner Join dbo.##as_OneStore As ar1

    ON ao.[date] - 1 = ar1.SCAN_DATE_CONVERT AND

    ao.DivNum = ar1.Kma_div_nbr AND

    ao.cpnnum = ar1.coupon_nbr

    Inner Join dbo.##as_OneStore As ar2

    ON ao.[date] + 1 = ar2.SCAN_DATE_CONVERT AND

    ao.DivNum = ar2.Kma_div_nbr AND

    ao.cpnnum = ar2.coupon_nbr

    Left Join dbo.##as_OneStore As ar3

    ON ao.[date] = ar3.SCAN_DATE_CONVERT AND

    ao.DivNum = ar3.Kma_div_nbr AND

    ao.cpnnum = ar3.coupon_nbr

    WHERE ar3.STORE IS NULL

    Set @Recid = @Recid - 1

    Drop Table dbo.##as_OneStore

    End

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply