Openrowset & removing unwanted characters...

  • Folks I'm wondering can you help me here. I have inherited a system that populates a table from Excel using the OPENROWSET command. The table in question is called: tbl_raw_data and then the loaded data is moved on elsewhere...

    The excel speasheet can and does change column names and number of rows over time.

    The code is something like this:

    SET @varSQL = 'SELECT * INTO TBL_RAW_DATA FROM OPENROWSET('''+@driver+''',' + @varConnectionString + ', ''SELECT * FROM ' + @varSheetName + ''')'
    EXEC (@varSQL)

    I'm new to openrowset command and am wondering is there a "nice" way of altering the command above to parse out carriage returns or line-feed characters ( char(13) & char(10) respectively) as their presence messing things up later on. Lets assume they cannot be removed from the Excel before load.

    I'm open to suggestion here but ideally looking for something quick & easy. I know I can probably parse thru the table after population and remove the offending characters & re-write but that could be time consuming....hence looking for nice alternatives 🙂

    Thanks in advance for any suggestions...

    J.

  • nothing you can do really within sql other than cleaning up after loading.

    you may be able to get the data cleaned in excel itself (ask the owner of it to do it for you - and potentially to avoid other excel issues ask them to generate the file as a "proper" CSV file.

    and I do have to query why do the "CR+LF" mess up at a later stage - what type of use are you giving to this data?

  • Thanks for the speedy reply. The type of use of is in 2 ways

    Way 1): Extract at a later stage. The Cr+lf can & does muck up nice formatting later (I can of course parse at that time but this adds lengh of command to the process which can cause a known overflow issue)

    Way 2): Perhaps more important, record identification. Certain fields are used to identify the original record. If the search record is not an exact match then no identification will occur. Best not to allow these characters into the system in the first place IMHO...

    I guess I will have to investigate parsing the table after load. Could be expensive but I don't see an alternative at this stage....

    J.

  • If you are familiar with PowerQuery, another possibility would be to create another spreadsheet which consumes the data from the original spreadsheet and does some manipulation in the process. Your OPENROWSET would use this processed data as its source.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply