Excel to Text file

  • Hi,

    I'm working on a package where i'm transferring data from excel to a flat file.

    My problem is the date column sometimes consist of character data which i have to send it as a error output. If the give the following data,

    date1date2amount

    str12/5/2009abcd

    str22/6/200925000

    str325/6/0952000

    str4abcdxyz

    str5efghrqp

    str6ijklabcd

    25/02/082/2/2009

    I'm getting

    str1,,abcd

    str2,,

    str3,25/6/09,

    str4,abcd,xyz

    str5,efgh,rqp

    str6,ijkl,abcd

    25/02/08,,

    to get the mixed set of data from Excel, we can include IMEX in the excel connection after doing that i got the result as

    str1,39849,abcd

    str2,39850,25000

    str3,25/6/09,52000

    str4,abcd,xyz

    str5,efgh,rqp

    str6,ijkl,abcd

    25/02/08,39846,

    In the first case i didn't get the date values and in the second case i'm getting as some numbers like '39850'. '39846' etc.

    but i want the values how ever it is in the excel sheet, so that the same value i can return to the error output and name the error also. (example "invalid date")

    how to achieve this

    thanks

    regards

    anamika,

  • I would guess that those 'numbers' you are referring to are the number of days since 1/1/1900 (or 1899, or whatever the base date is) - which is how Excel (and SQL Server) store dates internally.

    The remaining dates (ie, those items which still look like dates) have not been converted to numbers because they are invalid (according to the American mm/dd/yyyy format).

    As your dates are in UK format, that implies that all of the dates are being misinterpreted and read as m/d/y.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • To gain an understanding of how Excel stores dates read this blog entry by Brian Jones. It contains an excellent discussion of how and why date information is actuall stored by Excel

    http://blogs.msdn.com/brian_jones/archive/2006/10/25/spreadsheetml-dates.aspx

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Hi,

    I would guess that those 'numbers' you are referring to are the number of days since 1/1/1900 (or 1899, or whatever the base date is) - which is how Excel (and SQL Server) store dates internally.

    The remaining dates (ie, those items which still look like dates) have not been converted to numbers because they are invalid (according to the American mm/dd/yyyy format).

    As your dates are in UK format, that implies that all of the dates are being misinterpreted and read as m/d/y.

    The 'numbers' could be referring the dates stored internally. but does it change from system to system, depending on the base date of the system? or is there a way to get into a exact date? yes we follow dd/mm/yy format apart from this format if anything comes i should send it to error output.

    To gain an understanding of how Excel stores dates read this blog entry by Brian Jones. It contains an excellent discussion of how and why date information is actuall stored by Excel

    http://blogs.msdn.com/brian_jones/archive/2006/10/25/spreadsheetml-dates.aspx

    If everything seems to be going well, you have obviously overlooked something.

    Thanks for the link.

    by the way, it's not going well, our testing team put all junk characters and test the ETL. is it the correct way to test? or is there a mechanism / method to test? they put character data in place of numeric, numeric in place of date etc... please suggest some good testing methodology to test dates in Excel.

    Thanks,

    Regards,

    Anamika

  • When we had a similar problem we used a macro to convert our excel files to CSV format.

    Then we used BULK INSERT to import them to temporary tables.

    Then we selected the valid rows from the temporary tables.

    This is more of a hack than anything else, so I'm not sure how easy it is to implement

    in production standard SSIS package.

  • How do you convert to flat file? in my project itz been placed in a folder. the package should pick and do the job automatically.

    the files will be uploaded by the clients. so how do you employ a macro in it?

    or how to go about it?

    thanks,

  • [font="Comic Sans MS"]

    Anamika - one way to achive this is enforce a strong template for the excel sheet you are using.

    It shows how you set the property of a particular column (string - currency - date etc).. you can restrict by selecting the column and do a 'format cell'.. you can format it to be a 'text' and it will be the same as how the test team has entered the data..

    [/font]

    [font="Comic Sans MS"]--
    Sabya[/font]

  • Hi Sabya,

    thatz what not happening. Eventhough i set the columns as text, as soon as they paste the 'date' data into it the properties are becoming date. and when you insert a text value in between, it becomes null or empty while coming to SSIS. it is the behaviour of Excel.

    the solution could be saving the file in csv format. but how to do this programmatically, without manual intervention.

    any guidance, links are highly appreciated.

    thanks,

    regards

    anamika

  • Hi friend. the solution is please change the excel registry.

    Steps.

    go to run.

    type regedit.

    to find "TypeGuessRows"

    to change the value 8 to 0

    please try now.

    by Shanmugavel N.

    shanmugavel_1984@yahoo.co.in

    Chennai.

    ph 9994045200

  • Hi,

    I tried your suggestion. amazing it works well. but still there is one concern.

    the date values it is giving as numbers like

    31810

    39866,,39480

    39532,,39115

    39866,,40211

    for example i entered 2/2/2010 in excel and itz been interpreted as 40211. if i try to get the date in sql server using

    select datepart(dd,40211),datepart(mm,40211),datepart(yy,40211)

    i'm getting

    (No column name)(No column name)(No column name)

    422007

    which is not correct.

    how to get the correct dates?

    Thanks,

    Regards,

    Anamika

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply