wish to add rows having NULL values as placeholders for "missing" dates

  • I have a table that contains stuff happening. There's a datetime stamp column to show on what date the stuff happened. There are days where nothing happens. I want to select from this table and add rows to the result set on the fly, for those days on which nothing happened. Reason: I've got to pull it into excel and need a placeholder for those days.

    For the purpose of illustration, I've created a table containing the start datetimes of all backups that have been kicked off for a server called Peaches. This table also contains the name of the server, the size of backup, and the associated jobid. On some days there are no backups eg. 8/05/2013 through 8/13/2013, BUT I still want to generate a row for those days as a placeholder.

    Can you provide guidance on how to do?

    Here's DDL.

    CREATE TABLE [dbo].[CannedBackupJobs](

    [jobid] [int] NULL,

    [SizeTB] [float] NULL,

    [StartTime] [datetime] NULL,

    [ServerName] [varchar](20) NULL

    )

    insert into [dbo].[CannedBackupJobs]

    values

    (83, 365.226943141887,'2013-08-04 03:20:30.777', 'Peaches'),

    (83, 408.830221699759, '2013-08-14 18:26:53.220', 'Peaches'),

    (83, 391.654500133873, '2013-08-15 15:44:34.977', 'Peaches'),

    (83, 397.063717616127, '2013-08-20 02:10:57.747', 'Peaches'),

    (83, 353.803773579467, '2013-08-24 05:56:26.090', 'Peaches');

    --Quote me

  • Use a calendar table

    http://www.sqlservercentral.com/articles/T-SQL/70482/

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • OK. I've got a dates table using.

    CREATE TABLE _Dates (

    d DATE,

    PRIMARY KEY (d)

    )

    DECLARE @dIncr DATE = '2000-01-01'

    DECLARE @dEnd DATE = '2100-01-01'

    WHILE ( @dIncr < @dEnd )

    BEGIN

    INSERT INTO _Dates (d) VALUES( @dIncr )

    SELECT @dIncr = DATEADD(DAY, 1, @dIncr )

    END

    I've tried to join to it with left, right, and full outer joins. Can't get the placeholder value:

    select * from cannedbackupjobs dbj

    full outer join _dates d on d.d = CONVERT(VARCHAR(10), dbj.starttime, 120)

    where dbj.starttime >= (select min(starttime) from cannedbackupjobs);

    How to join to the _dates table so that it pulls in the date, even when no value in the CannedBackupJobs table?

    --Quote me

  • Try this

    select * from _dates d

    left outer join cannedbackupjobs dbj on d.d = CONVERT(VARCHAR(10), dbj.starttime, 120)

    where d.d between (select min(starttime) from cannedbackupjobs) and (select max(starttime) from cannedbackupjobs);

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • FROM CalenderTable LEFT OUTER JOIN TableWithOtherDatesInIt

    That's the starting point. Your where clause was filtering out any nulls and essentially turning the join into an inner join.

    You don't want to be doing any messy date to string conversions, they're both date columns, so just compare directly. CAST as DATE if one contains times.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • YIKES, I see the error. The where should be d.d>=

    and not dbj.starttime >=

    This works:

    select * from cannedbackupjobs dbj

    full outer join _dates d on d.d = CONVERT(VARCHAR(10), dbj.starttime, 120)

    where d.d >= (select min(starttime) from cannedbackupjobs);

    I'll look at ridding the calculations in the where clause for performance, now. I understand I don't want to be using functions against values that I am filtering for. I'll use a better calendar lookup table. Thanks!

    --Quote me

  • polkadot (10/2/2013)


    YIKES, I see the error. The where should be d.d>=

    and not dbj.starttime >=

    This works:

    select * from cannedbackupjobs dbj

    full outer join _dates d on d.d = CONVERT(VARCHAR(10), dbj.starttime, 120)

    where d.d >= (select min(starttime) from cannedbackupjobs);

    I'll look at ridding the calculations in the where clause for performance, now. I understand I don't want to be using functions against values that I am filtering for. I'll use a better calendar lookup table. Thanks!

    Your best bet is to follow Gail's suggestion:

    FROM CalenderTable LEFT OUTER JOIN TableWithOtherDatesInIt

    Avoid referencing TableWithOtherDatesInIt in the WHERE clause - you will end up with an INNER join. Build filters into the JOIN condition instead.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Use LEFT OUTER JOIN, but don't cast datetimes in any data table to dates to do comparisons; instead, use a range check.

    Here's a sample join for the tables you've described:

    SELECT ...

    FROM dbo._dates d

    INNER JOIN (

    SELECT

    DATEADD(DAY, DATEDIFF(DAY, 0, MIN(starttime)), 0) AS start_date,

    DATEADD(DAY, DATEDIFF(DAY, 0, MAX(starttime)) + 1, 0) AS end_date

    FROM dbo.cannedbackupjobs

    ) AS date_range ON

    d.date >= date_range.start_date AND

    d.date < date_range.end_date

    LEFT OUTER JOIN cannedbackupjobs dbj ON

    dbj.starttime >= d.date AND

    dbj.starttime < DATEADD(DAY, 1, d.date)

    --use the WHERE below instead of the INNER JOIN above if you want to

    --report on just a specific date range that you specify

    --WHERE d.date >= @start_date AND d.date < @end_date

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Thanks Scott, and everyone for replies & link.

    --Quote me

  • When I join to the Dates table, I am now successful in creating placeholders for those dates missing. The values for all the columns from the CannedBackupJobs table are NULL, wherever there is no data for a specific date.....

    ..............which would be fine if there were ever only one servername.

    My fault for not providing/ or thinking about the additional layer of the problem. The DDL I provided should have contained rows with dates for backups that were done for more than one server eg Peaches & Pears.

    New DDL INSERT statement

    insert into [dbo].[CannedBackupJobs]

    values

    (83, 365.226943141887,'2013-08-04 03:20:30.777', 'Peaches'),

    (83, 408.830221699759, '2013-08-14 18:26:53.220', 'Peaches'),

    (83, 391.654500133873, '2013-08-15 15:44:34.977', 'Peaches'),

    (83, 397.063717616127, '2013-08-20 02:10:57.747', 'Peaches'),

    (83, 353.803773579467, '2013-08-24 05:56:26.090', 'Peaches'),

    (100, 533.226943141887,'2013-08-07 03:20:30.777', 'Pears'),

    (100, 788.830221699759, '2013-08-09 18:26:53.220', 'Pears'),

    (100, 351.654500133873, '2013-09-07 15:44:34.977', 'Pears'),

    (100, 347.063717616127, '2013-09-09 02:10:57.747', 'Pears'),

    (100, 663.803773579467, '2013-10-09 05:56:26.090', 'Pears');

    Each server has it's own unique history of backups, against a continuum of time, sometimes sharing dates with other servers, on which no backups occurred, sometimes not. On those days where neither server was backed up, I need the same date placeholder specifically for Peaches and another for Pears.

    Do I do this with a bunch of seperate select statements (with WHERE ServerName = Peaches and with WHERE ServerName = Pears) run against the _Dates table with a UNION ALL to recombine them? Or is there a better way?

    Thanks much in advance!

    --Quote me

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply