Help! Importing datetime from a text file...won't work at all!

  • OK, here's the situation...

    I'm importing data from a mainframe dataset that has been ftp'ed to a PC.  Basically, we need to upload this data into a SQL Server 2000 DB.  In order to work it in pieces as a 'rough draft', I have a few DTS Packages...one loads the data into a table with the correct column names (all columns are nvarchar, except for 3 large ones which are ntext), one concatenates the separate date and time columns into datetimes, and the final one should convert those datetimes from nvarchar to datetime.
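    To give you a picture, the second and third packages do (very roughly) the equivalent of the T-SQL below; the table and column names here are made up, not the real ones:

        -- package 2: concatenate the separate date and time nvarchar columns
        UPDATE StagingChanges
        SET    InstallDateTimeStr = LTRIM(RTRIM(InstallDate)) + N' ' + LTRIM(RTRIM(InstallTime))

        -- package 3: load into the destination table, converting to datetime on the way
        INSERT INTO Changes (ChangeID, InstallDT)
        SELECT ChangeID, CONVERT(datetime, InstallDateTimeStr)
        FROM   StagingChanges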

    I've tried using a datetime transformation, datetime and smalldatetime data types on the destination columns, CDate in the VBScript transformation, every datetime format you can think of...and it just will NOT work!  I always get a 'type mismatch' error on the first datetime column.  I've even tried rolling it all together into one VBScript, but that didn't work either.

    Does ANYone have any ideas, any "gotchas" that I need to know about?  Am I providing enough info?

    All help will be extremely appreciated!

    Chris

  • Hi Chris - A few questions first.  Going back to the beginning - you say you get the mainframe data ftp'd to you.  Are you sure the data in the file is in ASCII format?  If this is IBM m/f data, for example, sometimes dates and times are stored as zoned decimal - and that is something you definitely don't want to deal with.  Can you look at the ftp'd file(s) in a text editor to verify that the date and time values are just vanilla ASCII?

    Next, how does the mainframe system handle missing dates/times?  Many mainframe systems do not support null values - so instead, they plug a magic number into the date and time fields (all 9's or all zeroes).  This does not convert well to SQL datetime columns.
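    Purely as an illustration (the table and column names here are invented), the usual scrub before converting looks something like this:

        -- turn mainframe "magic" dates into real NULLs before attempting any conversion
        UPDATE Staging
        SET    SomeDateCol = NULL
        WHERE  LTRIM(RTRIM(SomeDateCol)) IN (N'', N'00000000', N'99999999', N'0000-00-00')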

    Next, what kind of task are you using in DTS??  A VBScript task?  A transform data task?  If you are using a transform data task you will probably have to write script code to format the data the way you want. 

    Hope these ideas help.  Good luck.


    Mike at the Mill

  • Mike, thanks for the input...

    The data is, indeed, ASCII...actually, all of it imports fine as nvarchar.  As for the bad and null data, that doesn't seem to be an issue either (the missing values are nulls and there are no weird inserts going on).

    Now, as to the DTS task, well, I've tried using VBScript in an ActiveX transform, a datetime transformation, and nearly anything else I could think of.  I've had no problem reformatting the data using different string manipulations, but it will not accept the data into the table no matter what format I try (I've also tried changing the column on the destination from smalldatetime to datetime and back again...no dice).  It's a conundrum.

    Thanks!

    Chris

  • Chris - Very well.  In the ftp file - how are the fields coming in?  Fixed width or delimiter-separated (e.g., commas)?  Do you have both numeric values AND text values in the same record?  Are the text values delimited (e.g., with single or double quotes)? 

    Now back to DTS.  It sounds from your replies that you have control of the table definition (sometimes you define the column as nvarchar - sometimes as a datetime).  True?  On the DTS designer - are you using (have you used) just a single VBScript task (that is, you are writing script code to open the ftp'd file, read it record-by-record, and then for each record - parse the fields, fill an ADO rs, etc.)??  Is that how you are doing it?? 

    Or does your DTS package use a Text File source connection, a SQL Server connection, and a Transform Data task in between? 

    Thanks for your patience - but the better I understand what you are doing, the better chance I have of helping.  Thanks.


    Mike at the Mill

  • Mike, the fields are delimited (by a ~, long story), there are some numerics mixed with text but I am first importing everything as text (nvarchar, actually, except for three loooong text fields which are ntext).  The text values are not quote delimited.

    The way I did the table definition was to create a new table and assign the data types I wanted.  Since this will be a recurring import, there was no reason to set it up to create the table each time...besides, there are no column names in the text file and there are a LOT of columns (107 I think).

    As to the number of steps...I was first trying to do it with multiple steps, i.e. one to import into the DB as all text (or nvarchar) columns, then a second to concatenate and transform the dates, etc.  I DID try doing it all in one step, but that didn't work either (same 'type mismatch' error, so I'm fairly certain my VB code was at least decent).  I figured that if I can get it to work in several separate steps, I would be able to shove them all together into one once it was debugged.

    By the way, I checked the date format on the Server and it is set to US standard (someone told me that might have been the issue, but no...and besides, I can manipulate the datetime string to anything it needs to be and that hasn't helped).
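    For the record, the check amounted to something like this (standard commands, nothing specific to my setup):

        DBCC USEROPTIONS                     -- shows the session's 'language' and 'dateformat' settings
        SELECT @@LANGUAGE                    -- us_english implies mdy
        EXEC sp_helplanguage 'us_english'    -- shows the dateformat/datefirst for the language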

    Thanks again for your help...I know it's frustrating trying to answer these types of questions without seeing all of it in front of you, so please ask me anything else I might have left out...

    I have this weird feeling it's going to wind up being something really, really stupid...but if that's the case, I should have stumbled on it by now!

    Chris

     

  • Hi Chris,

    Here are a couple of suggestions....

      1. Firstly, is there any way you could get the initial output file to output the dates in a long format, i.e. 1 Jan 2004? This would help considerably, as there would be no issues around the format of the date, which, incidentally, I think is your main problem. I think that some of your dates have gone in as mm/dd/yyyy and others as dd/mm/yyyy. If you could force the initial export to long date format, you could work around this (there's a quick DATEFORMAT illustration after point 2 showing the ambiguity).

      2. Second option, have you tried something like

             SELECT CAST(CONVERT(char(8), getdate(), 112) as datetime)
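    To illustrate the ambiguity from point 1, the same string gives two different dates depending on the session's DATEFORMAT setting, while the unseparated yyyymmdd style is immune to it:

        SET DATEFORMAT mdy
        SELECT CONVERT(datetime, '04/05/2004')   -- April 5th, 2004

        SET DATEFORMAT dmy
        SELECT CONVERT(datetime, '04/05/2004')   -- May 4th, 2004

        SELECT CONVERT(datetime, '20040405')     -- April 5th, 2004 under either setting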

    Could you possibly post a row from the extract?

    I had a similar problem last week: I was faced with a column that had dates in several different formats, i.e. dd/mm/yyyy, mm/dd/yyyy, 1-Jan-2004, 1 Jan 2004 and 1 January 2004. It was a real pain transforming these into the correct format; in fact, it took a rather large query to convert all of them... I could pass this query on to you if you are interested.

    Anton

  • Hi Chris - I had to go home and go to sleep for a while.  OK, so suppose you have imported from the file to a staging table whose columns are mostly nvarchar.  Some of these columns are supposed to contain valid dates - e.g., let's say COL23 is an nvarchar column that is supposed to contain a date value. 

    Have you thought about using the ISDATE function on the column in your staging table???  This will return a 1 if the value can be converted to a datetime, a zero if it can't. 
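    For example, something along these lines against your staging table (COL23 and the table name Staging are just placeholders):

        -- list the distinct values that will NOT convert to datetime
        SELECT COL23, COUNT(*) AS bad_rows
        FROM   Staging
        WHERE  COL23 IS NOT NULL
          AND  ISDATE(COL23) = 0
        GROUP  BY COL23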

    Just like AGS says above - the problem probably lies in the format of the date values IN the file (some are OK and some aren't).  And don't worry about it winding up to be something really, really stupid.  Every problem is really stupid - AFTER you figure out the answer.

    Regahdz.


    Mike at the Mill

  • Mike, Anton, Thanks for looking into this...

    Anton: 1) I'm limited in what I can do with the output as the MF people are quite ornery about programming changes (well, what they perceive to be changes, anyway)...not sure it's even possible, actually.  As for the formats, I rearrange all of the date strings into something approaching normal and have done it consistently across every potential datetime column.

    2) Not with that particular syntax, but similar...I've tried Convert, I've tried CDate in the VB code, I've tried a lot of things like this, but probably not everything!
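    Roughly along these lines, for example (the table and column names here are just placeholders, not the real ones):

        SELECT CONVERT(datetime, StartDateTimeStr)      FROM Staging   -- plain CONVERT
        SELECT CONVERT(datetime, StartDateTimeStr, 101) FROM Staging   -- CONVERT with an explicit style
        SELECT CAST(StartDateTimeStr AS datetime)       FROM Staging   -- CAST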

    Here's a row:  (remember, ~ is the delimiter and not every record will have data in every column...you can tell this is not a relational DB!)

    ACSI000083~ACSI-II -EXPAND SESSION LIMIT FROM 60 TO 100~SIMPLE~BAS~JIM BRADY~5-283-4930~ACSI~N829~BACKOUT CHANGE AND CONTINUE PRODUCTION~To accomodate the next phase of the OxxxxN deployment, we are expanding the number of inbound sessions and outbound sessions from 60 to 100. The procedure to implement this change will be to shutdown AxxI using the normal shutdown procedure. Release  job SxxxxxAI  ( JOBID = JOBxxx90 ) Start AxxI using the started task.~The backout instructions are: Shutdown AxxI using the normal shutdown procedure. Release job_ SxxxxHAO ( JOBID = JOBxxx98 ) Start AxxI using the started task.~2004-04-25 01:00~2004-04-25 01:15~2004-04-25 01:00~2004-04-25 01:15~2004-04-25 01:00~2004-04-25 01:15~~~N~ACSI~04-25-2004~01:00~04-25-2004~01:15~PROJECT LEADER~ACSIAPPR~DBM1RJO~2004-04-22~17:25~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~OCCG   (x8555)~OPSOCC~OPS1ACT~2004-04-23~07:33~~~~~~PACKAGE CREATED~2004-04-20~13:55:35~PACKAGE FROZEN~2004-04-22~13:42:35~PACKAGE REJECTED~~~PACKAGE APPROVED~2004-04-23~07:33:37~PACKAGE DISTRIBUTED~2004-04-23~07:33:57~PACKAGE DISTRIBUTION ACKNOWLEDGED~2004-04-23~07:34:03~PACKAGE INSTALLED~2004-04-25~01:03:27~PACKAGE BASELINED~2004-04-25~01:03:29~TEMPORARY CHANGE CLOSED~~~PACKAGE BACKED OUT~~~PACKAGE REVERTED~~~PACKAGE MEMO DELETED~~~

    If you wouldn't mind passing that query along, at least it might give me some ideas!

    Mike: You know, the ISDATE function is a good idea.  I had tested some of the data at one point using it, after I had rearranged the string to be like MS says it should be in one of the KB articles, but it said they were NOT dates.  So I tried a different format and those came back as True...but still wouldn't convert.  It's impossible to find clear info in the MS KB about the 'preferred' format for dates...one article says to use mm/dd/yyyy hh:nn:ss.fffffffff, which is a little ludicrous (who has a clock with that kind of precision?)!

    I guess a question would be what date format have any of you used successfully in a string to datetime conversion? 

    Thanks!

    Chris

  • Hi Chris,

    Well, remember I told you that I had a dog of a problem?  Here is the query that I wrote to sort it out. You are going to have to change some things, obviously, but give it a bash.

    Import the data and then try this query to see if you manage to pull the rows back. Just change dtInserted to your date column and don't forget to change the table name. Let me know how it pans out. 

    SELECT
        P2.dtInserted,
        CAST(CONVERT(char(11), P2.LastChange, 112) as datetime)
    FROM
        (
        -- P2 level: dot-separated values (the month-name dates from below) get
        -- reassembled as 'd Mon yyyy' via PARSENAME; everything else passes through
        SELECT
            P1.dtInserted,
            CASE
                WHEN PATINDEX('%.%', P1.dtInsertedChange2) > 0
                    THEN PARSENAME(P1.dtInsertedChange2, 2) + ' ' + PARSENAME(P1.dtInsertedChange2, 3) + ' ' + PARSENAME(P1.dtInsertedChange2, 1)
                WHEN PATINDEX('%.%', P1.dtInsertedChange2) = 0
                    THEN P1.dtInsertedChange2
            END AS LastChange
        FROM
            (
            -- P1 level: values that still start with a month-name letter get trimmed
            -- and dot-separated (all month initials get the same treatment)
            SELECT
                P0.dtInserted,
                CASE
                    WHEN LEFT(P0.dtInsertedChange, 1) IN ('J', 'F', 'M', 'A', 'S', 'O', 'N', 'D')
                        THEN REPLACE(REPLACE(RTRIM(LTRIM(LEFT(P0.dtInsertedChange, 12))), ' ', '.'), '..', '.')
                    ELSE P0.dtInsertedChange
                END AS dtInsertedChange2
            FROM
                (
                -- P0 level: slash-separated dates get chopped and re-ordered;
                -- dash-separated dates get the dashes replaced with spaces
                SELECT
                    dtInserted,
                    CASE
                        WHEN PATINDEX('%/%', dtInserted) > 0
                            THEN LEFT(RIGHT(dtInserted, 13), 4) + '-' + RIGHT(LEFT(dtInserted, 5), 2) + '-' + LEFT(dtInserted, 2)
                        WHEN PATINDEX('%/%', dtInserted) = 0
                            THEN REPLACE(dtInserted, '-', ' ')
                        ELSE
                            CASE
                                WHEN LEFT(dtInserted, 1) IN ('J', 'F', 'M', 'A', 'S', 'O', 'N', 'D')
                                    THEN REPLACE(REPLACE(RTRIM(LTRIM(LEFT(dtInserted, 12))), ' ', '.'), '..', '.')
                            END
                    END AS dtInsertedChange
                FROM
                    pmTasks
                ) P0
            ) P1
        ) P2

    Anton

  • Well, I don't know exactly what to do, but will give you some tips:

    <<< CODE 1 >>>

    declare @s varchar(32)
    declare @dt datetime
    declare @holder table ( datetimevalue varchar(32), isDate bit )

    -- try whichever test value you like; the last assignment wins
    select @s = '1/1/1999'
    select @s = 'March 12, 2003'
    select @s = '1/1/1999'

    if ISDATE(@s) > 0
     BEGIN
      select @dt = convert(datetime, @s)
      print convert(varchar(32), @dt) + ' << -- IS a date'
      INSERT INTO @holder values (@s, 1)
     END
    else
     BEGIN
      print @s + ' is not a date'
      INSERT INTO @holder values (@s, 0)
     END

    select * from @holder

    <<< / CODE 1 >>>

     

    This shows how to use the ISDATE, which you probably already know.

    I included the @holder table so you can possibly track bad dates, and maybe put them into an error table or something, as the ones you have to go and hand-edit.

     

    The other idea is to use a User Defined Function.

    A UDF is a function in just about its purest form: you pass in an arg, and you get something back.  You can't use global variables or anything; you pass it in, you get something back.  With a SQL Server UDF, you can check values in the database (read only), but you can't update the database inside the UDF.

    Here is an example:  This is a generic pubs example I keep around for showing how to use a udf.

    When I import data into the database from a text file, I put all the data in a table like tblWorkingImportTable, and then I move it to the "real" tables from there.  What you could do is write some extra logic (using a UDF like the one below) to handle weirdo dates.  I do this often with "close match" string comparisons.

    For example, the udf could take @dateValue varchar(32) as a parameter.  And return a datetime.

    If the @dateValue passes the ISDATE test, just convert it and send it back.

    Let's say the @dateValue is '01*30*2004' (for whatever reason).  You know that this means January 30 2004, but ISDATE fails on it.

    You can write logic which says something like: ISDATE fails, so check for a '*' character; if found, convert '*' to '/', and then try the conversion again.

    You can write as many weirdo rules in a UDF as you would like.
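    A bare-bones sketch of that idea (the function name is made up, and you can bolt on as many rules as you need):

        CREATE FUNCTION dbo.fnc_clean_date (@dateValue varchar(32))
        RETURNS datetime
        AS
        BEGIN
            DECLARE @result datetime

            -- rule 1: if the value already passes ISDATE, just convert it and send it back
            IF ISDATE(@dateValue) = 1
                SELECT @result = CONVERT(datetime, @dateValue)

            -- rule 2: the '01*30*2004' case: swap '*' for '/' and try again
            ELSE IF CHARINDEX('*', @dateValue) > 0 AND ISDATE(REPLACE(@dateValue, '*', '/')) = 1
                SELECT @result = CONVERT(datetime, REPLACE(@dateValue, '*', '/'))

            -- no rule matched: return NULL so the row can go to an error table for hand editing
            ELSE
                SELECT @result = NULL

            RETURN (@result)
        END

    Then in the import you call it in line, e.g. SELECT dbo.fnc_clean_date(SomeDateCol) FROM tblWorkingImportTable, and anything that comes back NULL goes to the error table.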

    I work in a School environment, and I used udf to "match" school names.

    I have to do something like say "ELEM" is the same as "ELEMENTARY" and "JR" is the same as "JUNIOR" (as in  Jr high school).

    I do all this mess in a UDF.  I pass in the name of the school (like "King Elem") and I pass back the school_id... but I use my loose/weirdo rules to find a match.

    Sometimes I don't find a match, and I write those out to an error table, which again is my way of saying "I gotta go look at this by hand".

     

    Good luck.

     

    <<< Code 2 >>>

     

    --"inline" udf example

    Use

    pubs

     

    if exists

    (select * from sysobjects where id = object_id('fnc_is_invoice_and_qty_greater_than_10') and xtype = 'FN')

    drop function fnc_is_invoice_and_qty_greater_than_10

    GO

    CREATE

    FUNCTION fnc_is_invoice_and_qty_greater_than_10 (@payterms varchar(64) , @qty int)

     

    --this is just a demonstration udf which takes 2 parameters and decides whether or not

    --the @payterms has "invoice" in its text,and whether or not the qty is greater than 10

    --this is a made up business rule, that makes no sense whatsoever

     

    returns

    int

    as

    begin

    declare @returnValue int

    if CHARINDEX ( 'invoice' , @payterms ) > 0 AND @qty > 10

    BEGIN

    select @returnValue = 1

    END

    ELSE

    BEGIN

    select @returnValue = 0

    END

     

    RETURN(@returnValue)

    end

     

     

    GO

     

     

     

     

     

     

    SET NOCOUNT ON

     

     

     

     

     

    /*

    SELECT *

    FROM stores INNER JOIN

    sales ON stores.stor_id = sales.stor_id

    where qty > 10 and CHARINDEX ( 'invoice' , payterms) > 0

    */

     

     

    --Let's declare a table, notice the last column, this is the column

    --we'll later put out "business rule" return value in

    declare

    @workingtable table

    (

    stor_id

    varchar(64) ,

    stor_name

    varchar(64) ,

    stor_address

    varchar(64) ,

    city

    varchar(64) ,

    state

    varchar(64) ,

    zip

    varchar(64) ,

    ord_num

    varchar(64) ,

    ord_date

    varchar(64) ,

    qty

    varchar(64) ,

    payterms

    varchar(64) ,

    title_id

    varchar(64) ,

    --

    isInvoiceAndQtyGreaterThan10

    bit NULL

    )

    INSERT INTO

    @workingtable

    (

    stores.stor_id ,

    stor_name ,

    stor_address ,

    city ,

    state ,

    zip ,

    ord_num ,

    ord_date ,

    qty ,

    payterms ,

    title_id ,

    isInvoiceAndQtyGreaterThan10

    )

    SELECT

    stores.stor_id ,

    stor_name ,

    stor_address ,

    city ,

    state ,

    zip ,

    ord_num ,

    ord_date ,

    qty ,

    payterms ,

    title_id ,

    --NOTICE that this column gets populated with the RESULT of our business rule

    ark.fnc_is_invoice_and_qty_greater_than_10 (payterms , qty )

    FROM

    stores

    INNER JOIN

    sales

    ON stores.stor_id = sales.stor_id

     

    --Now let's look at the results, it looks like 3 rows

    --meet our "business rule"

    select

    isInvoiceAndQtyGreaterThan10, * from @workingtable

    order by

    isInvoiceAndQtyGreaterThan10 desc

     

     

    --This is a simple demo on how you can run calculations

    --or business rules "in line" instead of looping over the entries

    --with a cursor and doing checks at a cursor level

    --The "in line" part means that

    --one of the columns in your return set can be the value of a function

    --and you can implement any business rule in the function

    --The biggest things to determine are "what data" do i need to make my decision.

    --This example says

    --I need "qty" "payterms" to make my decision

    --These become parameters to the udf

    --then this example creates a "holder" column (isInvoiceAndQtyGreaterThan10)

    --and will be filled with the results of the udf

     

     

     

     

    SET NOCOUNT OFF

     

    <<< / Code 2 >>>

  • Thanks, Anton...I'll take a crack at it!

     

  • This may be a stupid reply and you have probably already handled this, but VB or VBScript will not recognize the yyyy-mm-dd date format. When importing data from the AS400 to SQL Server via a VB component, I had to use the DateSerial function to get the dates to parse properly. The Mid or even Split function (with a "-" delimiter in your case) would give you the individual parts, which can then be put back together in the DateSerial function. Hope this helps.
