Convert International Character-String Date Formats

  • Heh... and there was a response in the morning's email right next to this one that drives the point home for the bottom line above.  The "devil's in the data".  "Must look eye". 😀

    https://www.sqlservercentral.com/forums/topic/sql-newbie-question-about-primary-key#post-3844573

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    4. You really have to know the tools that you're working with.  As Jeffrey Williams stated in the first reply on this thread, the original problem could be avoided for dates if such data transmissions followed ISO standards for dates and times.  Even that's a problem in SQL Server, though. As Jeffrey points out, the two ISO standards for dates are YYYYMMDD and YYYY-MM-DD except that SQL Server silently uses the undocumented YYYY-DD-MM format instead of the latter format for many languages including but not limited to the French language.

    The ultimate bottom line is that you have to check the quality of any and all imported data, especially character based data and especially if it's multilingual in nature.  Shoot... I can't even get my car dealership to accurately quote the "out the door" price on lease buyout I'm trying to arrange via email.

    Just to be absolutely clear - if the data being sent is strictly date and you are using the DATE/DATETIME2/DATETIMEOFFSET data types and not DATETIME, then either YYYYMMDD or YYYY-MM-DD work and are not language dependent.  If you have a time portion and are using DATETIME to store the results from that column - then you must use either YYYYMMDD HH:MM:SS.nnn, YYYYMMDDTHH:MM:SS.nnn or YYYY-MM-DDTHH:MM:SS.nnn (include the 'T' separator between date and time).

    It can get even worse - since this is international data coming from various parts of the world (assuming that is the case based on the question) - then any time portions need to be identified by their time zone offset, and a decision needs to be made as to whether or not you store that data in UTC - and what data type to use for that data.  Do you use DATETIMEOFFSET and include the time zone offset in the date/time data or do you use DATETIME with a separate column for the time zone offset?

    And that is after you have built up the code to convert and validate the data appropriately, with processes to move 'bad' rows to another table - an application view on that table where a data analyst can review and determine how to correct the data, and some way for the data analyst to re-send the data through to be loaded with the updates.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Totally agreed... even when a time isn't included. 😀  There's this funny little line called the "International Date Line" that people seem to forget about.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 3 posts - 16 through 17 (of 17 total)

You must be logged in to reply to this topic. Login to reply