SSIS 2008 - Remove characters from a text file

  • Hello, I had a DTS job running in 2000 that worked great but now in 2005/2008 with ActiveX scripts gone, I've had trouble finding a way to remove the squares from the text data during the transformation.

    Sample data in file:

    "2SURG"∬㜀∀Ⰰ"0"∬  㘀∀Ⰰ"4"∬㐀∀Ⰰ"2"∬㈀∀Ⰰ"Take"∬圀䤀倀䔀 䌀䠀䰀伀刀䄀匀䌀刀唀䈀∀Ⰰ"200478"∬∀Ⰰ"FLOORSTOCK - 2SURG"

    When I view this in Notepad, all of those Asian-looking symbols show up as squares. I'm not sure why it copied this way here, but I thought I would leave it as it may be relevant.

    Any input you have is very much appreciated.

    Thank you! Jamie

  • The squares simply mean they're out of the standard character set for the font.

    There's no gentle way to do this: you need to loop through each character, evaluate its character code (ASCII() or UNICODE(), the inverse of CHAR()), and decide to keep or remove it...

    Or simply replace every CHAR(x) that's not in the set... which might actually be less painful, and require less looping against the data, since you could rowset part of it.

    For an idea of what I'm talking about:

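    DECLARE @C INT
    SET @C = 256
    -- Widen the upper bound as needed; the sample data contains code points above 5000.
    WHILE @C <= 5000
    BEGIN
        -- REPLACE takes (string, pattern, replacement).
        -- NCHAR is used because CHAR returns NULL for codes above 255.
        -- Assumes DataField is NVARCHAR; the binary collation matches each code point exactly.
        UPDATE tbl1 SET DataField = REPLACE(DataField COLLATE Latin1_General_BIN2, NCHAR(@C), N'')
        SET @C = @C + 1
    END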
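
    The character-by-character alternative mentioned above could be sketched roughly as follows. This is only an illustration, not a tested production routine: the variable names and the sample string are made up, and the "keep" set is assumed to be printable ASCII (codes 32 through 126).

    ```sql
    -- Hypothetical sample: two out-of-range characters between ASCII fields.
    DECLARE @In NVARCHAR(200), @Out NVARCHAR(200), @i INT, @ch NCHAR(1)
    SET @In = N'"2SURG"' + NCHAR(8748) + NCHAR(14080) + N',"0"'
    SET @Out = N''
    SET @i = 1

    -- Note: LEN ignores trailing spaces, so trailing spaces are dropped.
    WHILE @i <= LEN(@In)
    BEGIN
        SET @ch = SUBSTRING(@In, @i, 1)
        -- Keep only characters whose code point is printable ASCII.
        IF UNICODE(@ch) BETWEEN 32 AND 126
            SET @Out = @Out + @ch
        SET @i = @i + 1
    END

    SELECT @Out
    ```

    A loop like this is slow on large tables, which is why the REPLACE-per-code-point form can be less painful when most rows need the same characters stripped.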


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA

  • Some ideas on what can be done:

    DECLARE @Test VARCHAR(120)
    DECLARE @Ans VARCHAR(120)

    SET @Test = '2SURG"????"0"? ???"4"????"2"????"Take"????? ?????????????"200478"???"FLOORSTOCK]'

    -- Remove the question marks that stand in for the unprintable characters
    SET @Ans = REPLACE(@Test, '?', '')

    -- CHAR(34) on my code page is the quotation mark
    SET @Ans = REPLACE(@Ans, CHAR(34), '')

    SELECT @Ans

    Returns:

    2SURG0 42Take 200478FLOORSTOCK]

    Not replacing the quotation marks returns:

    2SURG""0" "4""2""Take" "200478""FLOORSTOCK]

    Replacing each quotation mark with a comma, then collapsing consecutive commas into one, returns:

    2SURG,0, ,4,2,Take, ,200478,FLOORSTOCK]
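
    One way to get that last result, as a rough sketch: it starts from the string with the question marks already removed, swaps each quotation mark for a comma, then repeats a REPLACE until no doubled commas remain. The variable name @Csv is only illustrative.

    ```sql
    DECLARE @Csv VARCHAR(120)
    SET @Csv = '2SURG""0" "4""2""Take" "200478""FLOORSTOCK]'

    -- Turn every quotation mark into a comma
    SET @Csv = REPLACE(@Csv, CHAR(34), ',')

    -- Collapse runs of commas; the loop handles runs longer than two
    WHILE CHARINDEX(',,', @Csv) > 0
        SET @Csv = REPLACE(@Csv, ',,', ',')

    SELECT @Csv
    ```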

    Another item to check: what code page was being used on the SQL Server 2000 instance?

    For the AutoTranslation of characters, you may also want to review these TechNet pages.

    SQL 2000

    http://technet.microsoft.com/en-us/library/aa216168(SQL.80).aspx

    SQL Server 2005

    http://technet.microsoft.com/en-us/library/ms131464(SQL.90).aspx

    If everything seems to be going well, you have obviously overlooked something.

    Ron

