get around invalid date

  • I have a query like below, need to manipulate servicedatekey field as a date with a convert (is integer now) which should have date data such as '20180901' but has about 25K records which have a -1.  Get arithmetic overflow with this

    select count(servicedatekey) from FactClaimSummary where isdate(SERVICEDATEKEY) = 1 and convert(datetime,servicedatekey) > '2018/01/01'

  • Use NULLIF() to replace the -1 with a NULL.  You can then optionally provide a specific date to replace NULL values.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • As mentioned by Drew, NULLIF can work, but I'd suggest to go further and use TRY_CONVERT this will return NULL on all invalid dates instead of generating errors. If servicedatekey is an integer, you need to have an additional conversion made.

    SELECT COUNT(*) 
    FROM FactClaimSummary
    WHERE TRY_CONVERT(datetime, CONVERT( char(8), servicedatekey)) > '20180101'

    Note that I changed the date format to something that won't depend on language or date settings.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Don't convert a column to a different data type unless you absolutely can't avoid it (sargability and all).  For your situation, this should handle both dates and the -1 properly.  I figured that >= made more sense than >, but change it back if you really want to skip just Jan 1 of 2018.

    SELECT COUNT(servicedatekey) AS servicedate_count
    FROM dbo.FactClaimSummary
    WHERE servicedatekey >= 20180101 /* or '20180101' */

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • The only display format allowed in ANSI/ISO standard SQL is based on ISO standard 8601. This is "yyyy-mm-dd" and after the metric system, it's probably the most commonly used ISO standard on earth. Instead of trying to patch it up when you get to the database, can you scrub your data in the input tier of your tiered architecture? This will let you do a much better job than trying to kludge it at the last minute before you go live.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • just started here as dba so change probably not it is 133 million rows index seemed to help will look at cleanup just in case  thanks for help

  • Actually, the format YYYYMMDD is perfectly acceptable under ISO.  I don't know why Celko insists on repeating his false claim over and over about YYYY-MM-DD despite being told that it's false.

    For display of dates, you have your choice.  But for entering dates, YYYYMMDD should be your only method.  It's 100% unambiguous no matter what the language and/or date settings are.  But YYYY-MM-DD can be misinterpreted, ISO or no.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • >> Actually, the format YYYYMMDD is perfectly acceptable under ISO. I don't know why Celko insists on repeating his false claim over and over about YYYY-MM-DD despite being told that it's false.<<

    The reason I insist on this is that I was serving on the ANSI standards committee when we put it in the language 🙂 The "YYYY-MM-DD" was picked from several options in the ISO 8601 standard for the following reasons:

    1) it is an ISO standard and X3H2 has always supported ISO standards. We don't like local dialects.

    2) these display strings sort correctly in temporal order. They are not ambiguous.

    3) They are not language dependent (I really don't like Oracle's default format for that reason).

    4) these display strings cannot be confused with numerics because of the dashes. This is a principal reason we rejected the "YYYYMMDD" option. Why do you think it is not ambigous?

    5) The regular expression for it is very simple.

    6) by allowing one and only one ANSI/ISO standard display format, we make parsing and storing the data much easier without having to worry about support for dozens of national formats, ordinal date format, week within year format, etc.

    The principal architect of the temporal extensions was Rick Snodgrass, at the University of Arizona. This was his area of academic specialization and he put out a book on temporal queries and SQL is available as a PDF download from the University of Arizona. It is truly a classic and should be on the bookshelf of every working SQL programmer.

    I found that when you build a calendar table it's very handy to have the ISO 8601 ordinal date and the ISO day-within-a-week format. The second format is very popular in the Nordic countries.

    My personal opinion is that we should have required the "T" separator between the date and the time parts of a timestamp. This would make a timestamp a continuous string without any worries about embedded white space like we have now.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 wrote:

    3) They are not language dependent

    This is demonstrably FALSE.  You have been told repeatedly that this is false.  If you don't like it take it up with Microsoft.  Until then, stop spreading this lie.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • 2) these display strings sort correctly in temporal order. They are not ambiguous.

    As I stated, In the real world dates are sometimes given as YYYY-DD-MM for certain regions of the world, ISO or not.  We have offices around the world, so using that format is ambiguous.  YYYYMMDD is not, because a date is never shown as YYYYDDMM, never.  It's that simple.

    As to ISO rules:

    iso.org

    https://www.iso.org/obp/ui#iso:std:iso:8601:-1:ed-1:v1:en

    3.1.3.1

    date and time expression

    '20180801' is a date and time expression that conforms to this representation.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • jcelko212 32090 wrote:

    >> Actually, the format YYYYMMDD is perfectly acceptable under ISO. I don't know why Celko insists on repeating his false claim over and over about YYYY-MM-DD despite being told that it's false.<<

    The reason I insist on this is that I was serving on the ANSI standards committee when we put it in the language 🙂 The "YYYY-MM-DD" was picked from several options in the ISO 8601 standard for the following reasons:

    1) it is an ISO standard and X3H2 has always supported ISO standards. We don't like local dialects.

    2) these display strings sort correctly in temporal order. They are not ambiguous.

    3) They are not language dependent (I really don't like Oracle's default format for that reason).

    4) these display strings cannot be confused with numerics because of the dashes. This is a principal reason we rejected the "YYYYMMDD" option. Why do you think it is not ambigous?

    5) The regular expression for it is very simple.

    6) by allowing one and only one ANSI/ISO standard display format, we make parsing and storing the data much easier without having to worry about support for dozens of national formats, ordinal date format, week within year format, etc.

    The principal architect of the temporal extensions was Rick Snodgrass, at the University of Arizona. This was his area of academic specialization and he put out a book on temporal queries and SQL is available as a PDF download from the University of Arizona. It is truly a classic and should be on the bookshelf of every working SQL programmer.

    I found that when you build a calendar table it's very handy to have the ISO 8601 ordinal date and the ISO day-within-a-week format. The second format is very popular in the Nordic countries.

    My personal opinion is that we should have required the "T" separator between the date and the time parts of a timestamp. This would make a timestamp a continuous string without any worries about embedded white space like we have now.

    I don't care if you wrote the standard or not or what you think you remember.  Go read the standard as it is now.  YYYYMMDD is the primary and YYYY-MM-DD is an "alternate" format that is "acceptable".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Why would I take up an ANSI/ISO standard with Microsoft? I don't think you understand how international standards are set. Vendors can take part in the committees, but they have only a vote and not complete control. If you want to demonstrate this, then quote the Standards, not Microsoft.

    “[H]e is a barbarian, and thinks that the customs of his tribe and island are the laws of nature.”

    ? George Bernard Shaw, Caesar and Cleopatra

    For example, MySQL is one example of a popular SQL that uses only "yyyy-mm-dd".

    It could be worse; Oracle uses dates like this:"1-JAN-2020" which is very obviously dependent on Roman month names that we use in the West. I'm not sure if they have anything for Slavic month names that are used by the Czechs. (leden, únor, brezen, duben, kveten, cerven, cervenec, srpen, zárí, ríjen, listopad, prosinec). As far as I know, nobody ever used the Roman numerals for months that were part of the NATO standards when France was a member.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 wrote:

    Why would I take up an ANSI/ISO standard with Microsoft? I don't think you understand how international standards are set. Vendors can take part in the committees, but they have only a vote and not complete control. If you want to demonstrate this, then quote the Standards, not Microsoft.

    “[H]e is a barbarian, and thinks that the customs of his tribe and island are the laws of nature.”

    ? George Bernard Shaw, Caesar and Cleopatra

    For example, MySQL is one example of a popular SQL that uses only "yyyy-mm-dd".

    It could be worse; Oracle uses dates like this:"1-JAN-2020" which is very obviously dependent on Roman month names that we use in the West. I'm not sure if they have anything for Slavic month names that are used by the Czechs. (leden, únor, brezen, duben, kveten, cerven, cervenec, srpen, zárí, ríjen, listopad, prosinec). As far as I know, nobody ever used the Roman numerals for months that were part of the NATO standards when France was a member.

    I've quoted the exact standard to you several times, Joe.  And it was the ISO Standard (8601), not the MS Standard.  The section that you should be looking for is "4.1.2.2 Complete representations".  I obviously cannot post the section here because it's copyrighted data but it clearly states that YYYYMMDD is the basic format and that YYYY-MM-DD is the extended format and that only those two formats are allowed when a full representation of the calendar date is required.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Joe, You would take up an ANSI/ISO standard with Microsoft, because they are not complying with the standard and you want them to comply.  Until they do, stop pushing the broken version of the date format, especially since this forum is specifically for the MS version of SQL and not an ANSI/ISO version of SQL.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Actually Microsoft is been trying very hard to move away from the old Sybase code Museum. This is why you write infixed join operators today, instead of the old extended equality += notation. Likewise, this is why BIT became a numeric data type instead of staying a binary computer science type of BIT. And if my suggestions are broken, are so bad, why do the recent DATE and DATETIME2(n) data types default to it? Why do we have a MERGE statement today?

    The problem that products have is that legacy code (more appropriately called "family curse") has to be supported that major changes come slowly. But they do come.

    Just before the infixed join operators became part of SQL Server I did some code with the old infixed operators for a client. But I also put the ANSI/ISO standard outer joins in comments. When my client upgraded, all they had to do was uncommented and test. I lost the potential for several thousand dollars worth of extra consulting work by not doing job secure programming, but I was their hero. It took them about a week to migrate.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply