Importing / Parsing text file with different formatted rows?

  • Hello,

    I have a file that I need to parse.

    I know how to do this in VBA and my Excel XP. However, then I have to create a way (not sure of at the moment), to update my SQL Server tables from Excel.

    I am used to writing stored procedures with set based T-SQL.

    I'm not sure if using SQL Server 2005 Express if I can create something (i.e. stored procedure) to go through the file row by row and parse the data.

    I was given this file to work on.

    There is a row type that is for the teacher (0x100).

    The second row is for the exam the teacher is giving (0x100).

    The third row is specifics about the type of test (0x200).

    The fourth row is the actual data pertaining to the test.

    Rows 2 - 4 keep getting repeating through the text file for various tests given.

    I'm trying to see if this better handled in Excel/VBA and then create a connection a SQL Server Table or try to do this in SQL Server. Again, I'm using the Express edition so no SSIS.

    Thanks.

    Here is a sample of the file:

    0x100,Doe,,Jane,Chicago DP,Female,5353

    0x100,1044,TeacherTest,First Page,Feb 17. 2008

    0x200,387,4,299,MathTest

    1,0.10000,11.850323,14.847638,0.000000,0.000000,41,82,41

    2,0.20000,11.711373,15.859981,8.555257,6.292381,41,82,41

    3,0.30000,11.830472,15.423284,8.872853,6.391631,41,82,41

    4,0.40000,11.790774,15.641631,9.031651,6.490880,41,82,41

    5,0.50000,11.433478,15.343884,9.210299,6.629828,41,82,41

    6,0.60000,11.016632,15.641629,9.508046,6.669528,41,82,41

    7,0.70000,10.619636,15.224785,9.607297,6.788627,41,83,42

    8,0.80000,10.778435,14.708690,9.607296,6.669528,41,83,42

    0x100,1044,TeacherTest,Second Page,Feb 17. 2008

    0x200,387,4,299,ScienceTest

    1,0.10000,11.890022,13.974250,22.013412,5.478541,42,7,35

    2,0.20000,12.207619,14.271996,22.331009,5.538090,42,5,37

    3,0.30000,11.909871,14.629292,22.946352,5.538090,42,5,37

    4,0.40000,11.373928,14.748389,22.033264,5.716738,41,5,36

    5,0.50000,10.877683,14.728539,21.060623,5.835837,41,5,36

    6,0.60000,10.818134,14.430794,21.795065,6.371781,41,5,36

    7,0.70000,10.937232,14.113197,21.953863,6.709227,41,5,36

    8,0.80000,11.016631,14.271998,21.934011,7.185622,41,5,36

    0x100,1044,TeacherTest,Third Page,Feb 17. 2008

    0x200,387,4,299,Geography Test

    1,0.10000,12.743564,14.331544,6.629828,5.835836,41,-4,37

    2,0.20000,12.723714,14.113197,6.669528,5.895386,41,-3,38

    3,0.30000,12.961911,13.994099,6.609978,6.133584,41,-3,38

    4,0.40000,12.822963,14.371245,6.351931,6.332081,41,-4,37

    5,0.50000,12.684013,14.113198,6.371781,6.590128,41,-4,37

    6,0.60000,12.604614,14.252147,6.788626,6.590128,41,-2,39

    7,0.70000,12.306867,14.768242,6.729077,6.649678,41,-3,38

    8,0.80000,12.406116,14.609443,6.411480,6.927575,41,-3,38

    0x100,1044,TeacherTest,Fourth Page,Feb 17. 2008

    0x200,387,4,299,Language Arts Test

    1,0.10000,12.584764,14.013949,12.822961,13.418454,45,-3,42

    2,0.20000,12.703862,14.728541,12.902362,13.636803,44,-3,41

    3,0.30000,13.398605,15.085837,12.207620,13.339057,44,-1,43

    4,0.40000,13.656651,14.986588,12.128221,13.160408,45,-1,44

    5,0.50000,14.232296,13.914701,13.160409,12.624463,44,-1,43

    6,0.60000,14.311696,14.053649,12.207619,12.763412,44,-4,40

    7,0.70000,14.331544,13.458156,11.830474,13.100857,45,-4,41

    8,0.80000,14.351395,13.239807,11.493027,13.259658,45,-4,41

    Things will work out.  Get back up, change some parameters and recode.

  • It would be helpful if you posted the CREATE statements for the tables you're trying to target... 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Also, have 0x100 as the row identifier for both the teacher and the test page does not seem logical... are you sure about that?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Trust me. I said the same thing. Why does the first and second row have to start with 0x100? That was the way the application was written that outputs the text file.

    I had thought about posting the schema for the tables. But then I would have post the schemas and show a parsed file with the corresponding fields. That is going to take some time. I will do that this afternoon after church.

    However, I guess my questions is how would you implement something like this? Is there a way to import this into a one column table and then go row by row and look at the data.

    I have always been told cursors were bad. In fact, I have never had to create a cursor. Would this be a time?

    Should I skip trying to do this in SQL Server and just use VBA in Excel?

    Again, I'll post the schemas in a little while.

    Thanks.

    Things will work out.  Get back up, change some parameters and recode.

  • I do this same thing with a lot of mixed record files... they're usually fixed width instead of delimited, though. Anyway, I bring the whole file in as a single wide column with an IDENTITY column, maybe have a couple of other columns that I can "smear" some data down from the headers with, and I use the same technique for doing that cursor-like processing using single updates as found in the following link...

    http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

    Or course, since you're splitting delimited columns, once you've identified which rows go with which header, you an either re-export/re-import just the detail rows just to split them using BCP or Bulk Insert, or you can use the methods for splitting the data found in the following link...

    http://www.sqlservercentral.com/articles/TSQL/62867/

    ... the section you would be most interested in would be the section with the title of "One Final "Split" Trick with the Tally Table"

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply