Filtering bad data on NVARCHAR

  • I am running a query on a field that is a 'nvarchar' and converting it to a number so that I can summarise it.

    Using - select CAST(dbo.TASMANOTIME.OTHrsWrked AS float) from dbo.TASMANOTIME

    Unfortunately due to a poor design at the front end, some of the fields have alphabetical characters or '/ 'symbols etc

    How can I write a script that will ignore all fields that have anything except numeric characters or  a '.' for the decimal point.

    Thanks

    Craig

    ****************

    Thanks for the help, problem sorted with IsNumeric function

     

  • hi

    try this

    select CAST(dbo.TASMANOTIME.OTHrsWrked AS float) from dbo.TASMANOTIME where    ISNUMERIC(OTHrsWrked) = 1

    "Keep Trying"

  • IsNumeric will work, but it will allow through some character data

    SELECT

    IsNumeric ('125D256') -- returns 1

    SELECT IsNumeric ('14545E0') -- returns 1

    SELECT IsNumeric ('345,09.09') -- returns 1

    Using LIKE is a lot more accurate, and a fair bit more messy. Try this.

    select

    cast(dbo.TASMANOTIME.OTHrsWrked AS float) from dbo.TASMANOTIME

    where TASMANOTIME NOT LIKE '%[a-zA-Z,/]%'

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Using LIKE in this manner is going to require putting ever possible character other than 0-9 in the pattern. Fortunately, there's an easier way. Specify a range of 0-9 and look for patterns that contain something other than that. So your range would be [^0-9]. That means not 0-9.

    Read the article I wrote on handling this with PatIndex(): http://www.sqlservercentral.com/columnists/rdavis/practicalusesofpatindex.asp

    For example:

    declare

    @test-2 table (test varchar(10) null)

    insert

    into @test-2 select '7676767776'

    insert

    into @test-2 select '125D256'

    insert

    into @test-2 select '14545E0'

    insert

    into @test-2 select '345,09.09'

     

    Select

    *

    from

    @test-2

    where

    PatIndex('%[^0-9]%', test) = 0

     

     


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • Though that still doesn't anticipate the stupidity that comes up in real life. There could still be situations where the string contains multiple periods, out of place commas, wandering negative signs or currency symbols...

    You may want to make a UDF to handle the errors correctly when converting strings to floats (perhaps have a second parameter with the substitution value on error, in the fashion of the isnull function).

    I did a quick search (convert varchar to float) and it seems that this issue has been raised several times on this board.

  • Good points. Not to mention that there is always the possibility that an all numerical value could be out of range. For example, a string of 100 1's would pass the IsNumeric() test as well as my test posted above.

    declare

    @test-2 table (test varchar(100) null)

    insert

    into @test-2 select '7676767776'

    insert into @test-2 select '125D256'

    insert into @test-2 select '14545E0'

    insert into @test-2 select '345,09.09'

    insert into @test-2 select Replicate('1', 100)

    Select

    *, isnumeric(test)

    from @test-2

    where PatIndex('%[^0-9]%', test) = 0

    Select

    *, isnumeric(test), Cast(test as int)

    from @test-2

    where PatIndex('%[^0-9]%', test) = 0


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • YES!!!

    This has got to be the most inefficient way of doing this EVER!!

    (Because I could... for "sanity" I ignored the out of range testing)

    CREATE FUNCTION dbo.udfIsNumeric(@Source varchar(8000))  
    RETURNS tinyint AS  
    BEGIN
       declare @pos int
       select @Source = ltrim(rtrim(isnull(@Source,'')))
       select @pos = PATINDEX('%[0-9E]%', @Source)
       if (@Source='') or (@pos0)
       begin
    
          if replace(left(@Source,@pos-1), ' ', '') like '%[^-$[ ]]%'
             or replace(left(@Source,@pos-1), ' ', '') like '%[-]%[-]%'  
             or replace(left(@Source,@pos-1), ' ', '') like '%[$]%[$]%'
             return 0
    
          select @Source = substring(@Source, @pos, len(@Source)-@pos+1)  
       end 
    
       declare @newvalue tinyint
       if not @Source like '%[^-0-9.,E]%'
          and not @Source like '%.%.%'
          and not @Source like 'E%'
          and not @Source like '%E'
          and not @Source like '%E%[.]%'
          and not @Source like '%[-]'
          and not @Source like '%[-]%E%'
          and not @Source like '%E_%[-]%'
          and not @Source like '%[^0-9],%'
          and not @Source like '%,_'
          and not @Source like '%,__'
          and not @Source like '%,[^0-9]%'
          and not @Source like '%,[0-9][^0-9]%'
          and not @Source like '%,[0-9][0-9][^0-9]%'
          and not @Source like '%,[0-9][0-9][0-9][0-9]%'
          select @newvalue = 1
       else
          select @newvalue = 0
    
       return @newvalue
    END
    
    CREATE FUNCTION dbo.udfToNumber(@Source varchar(8000), @WhenNotNumeric float)  
    RETURNS float AS  
    BEGIN
       declare @IsNumeric tinyint
       select @IsNumeric = dbo.udfIsNumeric(@Source)
       if (@IsNumeric=0)
          return @WhenNotNumeric
       return convert(float, replace(replace(replace(@Source,' ',''),',',''),'$','')
    END
    
  • My, Grandma!! What a big user defined function you have!!

     


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • Thanks for the help.

  • Many thanks for your help

  • Yeah, that will work better. Thanks. I overlooked the negating of conditions when I read up on LIKE

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • This falls over:

    select dbo.udfToNumber('1,000',-1)

  • Thanks.

    I added a comma (and dollar sign. I know... I'm being provincial. ) remover to that function.

    Getting rid of the unintuitiveness (is that a word?) was the whole point of the exersize.

  • OK. Threw lots of crap it at and it seems to be holding up now.

    Intuitiveness is a word. As for the 'un' prefix, dictionary says it has limitless applications.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply