Parsing Excel files to a table

  • I have a bunch of 2007 excel files, each with 3 worksheet, each sheet has 5 header lines that I need to parse into 1st 5 columns in a tableX. 6th line is the header name and data starts at 7th line.

    e.g.

    (1st 5 lines has 2 columns in excel)

    First_NameJohn

    Last_NameSmith

    Date8/9/2010

    Time14:03:03

    LabelStudent

    ***End_of_Header***

    (now the following lines has 3 columns)

    X_ValueY_ValueComment

    01test

    10.11testing

    20.23

    I'd like to have a output table like the following, (note: how can I get the excelfilename as well?)

    ExcelFileName,FirstName,LastName,DateTime,Label,X_Value,Y_Value,Comment

    When I use Data Flow Task with Excel Source, I couldn't get the value from the top 5 rows. Can anyone help me with this or point me to an example for excel file processing with format like this?

    Thanks a lot!

  • Any input?

    Thanks!

  • I think you are going to have to create a temp table to load the data into. Load the data into a table along the lines of

    CREATE TABLE temp (

    Column1 varchar(255),

    Column2 varchar(255),

    Column3 varchar(255) etc

    )

    Then you can parse it in a stored procedure (which you can also call from the SSIS package). Then it because somewhat trivial to handle the first 5 rows one way and the rest another.

    To get your execl file name put in a Derived Column object and create a new column to be loaded into your temp table. Then it depends on how you are getting the file name for your connection manager. Lets say you are using an expression, put that same formula into your derived column (more maintenance here), if you are first storing the file name into a variable then just put your variable as the new column etc.

    So now you have a table holding your data like this:

    Column1 Column2 Column3 Column4

    ----------------------------------------------

    FileName First_Name John NULL

    FileName Last_Name Smith NULL

    FileName Date 8/9/2010 NULL

    FileName Time 14:03:03 NULL

    FileName Label Student NULL

    FileName X_Value Y_Value Comment

    FileName 0 1 test

    FileName 1 0.11 testing

    FileName 2 0.23 NULL

    Now just parse it however you need to.

    Kenneth FisherI was once offered a wizards hat but it got in the way of my dunce cap.--------------------------------------------------------------------------------For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]Link to my Blog Post --> www.SQLStudies.com[/url]

  • Thanks!

    I figured that I could also use conditional split to get the values.

    However, I have different problems now:

    1) I am looping through the excel files and then sheets to get data. And I have two variables: FileName and SheetName.

    In the ADO Net Source specification, i was able to refer to my ExcelFile as ADO.NET connection manager. However, it's asking me to specify the Name of table or the view. Is there anyway to use the Sheet variable?

  • ktlady (8/12/2010)


    1) I am looping through the excel files and then sheets to get data. And I have two variables: FileName and SheetName.

    In the ADO Net Source specification, i was able to refer to my ExcelFile as ADO.NET connection manager. However, it's asking me to specify the Name of table or the view. Is there anyway to use the Sheet variable?

    I'm not really familiar with ADO.NET, but can't you use expressions (where you use your variable) to set the correct property?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply