parsing data into SQL by dash - what are my bext options ?

  • I would like to parse the following data into a SQL DB.

    ": us-md-annapolis-21403 ("

    The preferred way would be to use a regex to parse by dash, then substring out the characters that i dont want.

    I cant get a regex to work, so i found that Excel do it quite easily, but has a limit of 65k rows, and i have 4 Million records.

    SQL Import does not have a parse feature, i dont know why, it would be useful!

    Any ideas?

    Thanks

  • This was removed by the editor as SPAM

  • I had forgotten about that, thanks 🙂

    Why can you not bulk insert with a delimiter from a table?

    Why does it have to be an external file?

  • You can't BULK INSERT from a table simply because BULK INSERT wasn't designed to do such a thing.

    What you need is a good splitter... please see the following...

    http://www.sqlservercentral.com/articles/Tally+Table/72993/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • you can use SSIS for this.

    Set up a connection to the source and if this is a flat file then you can change the delimitter on the source to a '-' to split out the data. And then use a derived column transform to remove the quotes..

    otherwise if this is a column in a much larger dataset, then you pass this column to the script command and do the split in there with .net code.

  • This was removed by the editor as SPAM

  • I was just offering another suggestion, I am aware of the power of the tally table split. As SSIS can deal with individual rows quite effiecntly i am not sure it is as bad as RBAR on a database.

  • steveb. (10/26/2011)


    I was just offering another suggestion, I am aware of the power of the tally table split. As SSIS can deal with individual rows quite effiecntly i am not sure it is as bad as RBAR on a database.

    Actually, I very much appreciate the suggestion because I don't know much about SSIS. Thanks, Steve.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • (?<Country>.{2})-(?<State>.{2})-(?<City>.+)-(?<ZipCode>.+)\s\(

    This regex worked great!

    I placed it into an SSIS script and worked like a charm!

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply