How to remove special chars?

  • Hi everyone,

    So I have a xml file from which I am trying to import data into my sql tables. I noticed there are some special characters inside the file due to which I am unable to import data properly.

    Look at the below screenshot:

    Can anyone tell me what are there square boxes and how can I remove them?

    I am using script component in ssis package and i am able to remove other characters like commas, quotation marks etc but what is this character below?

    4444

     

    Complete tag with all xml nodes:

     

    <FinancialSanctionsTarget>
    <Name6>KIRILLOVA</Name6>
    <name1>Anastasiya</name1>
    <name2>Sergeyevna</name2>
    <name3/>
    <name4/>
    <name5/>
    <Title/>
    <NameNonLatinScript>КИРИЛЛОВА Анастасия Сергеевна</NameNonLatinScript>
    <NonLatinScriptType>Cyrillic</NonLatinScriptType>
    <NonLatinScriptLanguage>Russian</NonLatinScriptLanguage>
    <Address1/>
    <Address2/>
    <Address3/>
    <Address4/>
    <Address5/>
    <Address6/>
    <PostCode/>
    <Country> </Country>
    <OtherInformation/>
    <GroupTypeDescription>Individual</GroupTypeDescription>
    <AliasType>Primary name</AliasType>
    <AliasQuality/>
    <RegimeName>Russia</RegimeName>
    <DateListed>2022-03-15T00:00:00</DateListed>
    <DateDesignated>2022-03-15T00:00:00</DateDesignated>
    <UKSanctionsListRef>RUS0793</UKSanctionsListRef>
    <UKStatementOfReasons>Anastasiya Sergeyevna KIRILLOVA (hereafter KIRILLOVA) is a founder of InfoRos, an organisation linked to the Government of Russia which spreads disinformation. In her role with InfoRos KIRILLOVA has provided support for and promoted actions and policies which destabilise Ukraine or undermine or threaten the territorial integrity, sovereignty or independence of Ukraine.</UKStatementOfReasons>
    <PhoneNumber>+7 | 718-84-11 </PhoneNumber>
    <Website/>
    <EmailAddress>kirillova@inforos.ru</EmailAddress>
    <Individual_DateOfBirth>31/12/1986</Individual_DateOfBirth>
    <Individual_TownOfBirth/>
    <Individual_CountryOfBirth/>
    <Individual_Nationality>Russia</Individual_Nationality>
    <Individual_PassportNumber/>
    <Individual_PassportDetails/>
    <Individual_NINumber/>
    <Individual_NIDetails/>
    <Individual_Position/>
    <Individual_Gender>Female</Individual_Gender>
    <Entity_Type/>
    <Entity_Subsidiaries/>
    <Entity_ParentCompany/>
    <Entity_BusinessRegNumber/>
    <Ship_IMONumber/>
    <Ship_CurrentOwners/>
    <Ship_PreviousOwners/>
    <Ship_Flag/>
    <Ship_PreviousFlags/>
    <Ship_Type/>
    <Ship_Tonnage/>
    <Ship_Length/>
    <Ship_YearBuilt/>
    <Ship_HullIdentificationNumber/>
    <UNRef/>
    <LastUpdated>2022-05-09T00:00:00</LastUpdated>
    <GroupID>14744</GroupID>
    <GrpStatus>A</GrpStatus>
    <GroupStatus>Asset Freeze Targets</GroupStatus>
    <ListingType>UK</ListingType>
    </FinancialSanctionsTarget>

    • This topic was modified 1 year, 8 months ago by  Jobs90312.
  • open the file with a hex editor and see what the values are - likely they are (were) in a codepage that you aren't using and therefore it seems wrong.

    put the values here and we may be able to advise further - but once  you have the hex values you can use those on the replace as well.

  • I copied and pasted just the phone number in question from the XML you posted and put it in the following code.  The "special characters" are an ASCII 160... which is the ASCII code for a "hard space".

     SELECT  t.N
    ,Visible = SUBSTRING(v.String,t.N,1)
    ,AsciiValue = ASCII(SUBSTRING(v.String,t.N,1))
    FROM (VALUES('+7 | 718-84-11 '))v(String)
    CROSS APPLY dbo.fnTally(1,LEN(v.String))t
    ;

    Results...

    You can get the "fnTally" sequence generator from the similarly named link in my signature line below.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I would use datalength(v.String) so the training space at the end shows up too, so nobody checking your script actually has to doublecheck the behavior of trailing spaces.

    ( btw: this ascii hard space is not handled like a regular space when it is at the end of a string! )

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply