September 25, 2007 at 12:39 pm
Just like the subject says - we have data coming from other systems where people have copied/pasted data from various programs, emails, web sites, etc and those include characters that are not found on your standard US keyboard layout. While that isn't always a problem, some of those characters cause exceptions in our code because they are not handled properly. While I realize that we can update the code at some point to handle these characters more gracefully, is there any relatively easy way to scrub all char/varchar columns in the database to replace items such as Vertical Tabs or similar entries with a space?
Anyone done this before and willing to share the T-SQL? We have a little luxury here in being able to scrub incoming data if the process is efficient enough to fit in our pipeline. We just need to somehow do it prior to import into the production tables or otherwise "fix" the data in those tables.
Thanks in advance for the help.
-Peter
September 25, 2007 at 12:51 pm
yes there is an easy efficient way;
look at this post:
http://www.sqlservercentral.com/Forums/Topic398428-338-1.aspx
there, someone wanted to replace extended ascii with the html encode; ie '‘Special Digital Data Service Obligation’ got some characters stripped out, but left other items in place; your problem is exactly the same, except you'd substitute an empty string for the value instead.
Lowell
September 26, 2007 at 9:54 am
Possibly the "most straightforward" might be to build a CLR UDF implementation of Regex.replace.
Imports System
Imports System.Data
Imports System.Data.SqlClient
Imports System.Data.SqlTypes
Imports Microsoft.SqlServer.Server
Imports System.Text.RegularExpressions
Partial Public Class UserDefinedFunctions
<Microsoft.SqlServer.Server.SqlFunction()>_
Public Shared Function RegexReplace(ByVal input As SqlChars, ByVal pattern As SqlString, ByVal rep As SqlString) As SqlString
' Add your code here
Dim rex As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern.Value, optionS)
Return New SqlString(rex.Replace(New String(input.Value), rep.Value))
End Function
End Class
Publish that to your server - and you can then use syntax like:
select dbo.regexreplace('a cat jumped over the fence','\x0B',' ')
where (if my ASCII isn't too rusty) \x0B= ASCII 11 = vertical tab
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply