Selecting numbers up to the first non-numeric char from a string

  • How do i select only numbers up the first occurence of a non-numeric string?

    example;

    200k returns 200

    25K-30K returns 25

    2k returns 2

    etc.... Thanks

  • if it is always (K).

    you can simply

    select left(col1,charindex('k',col1))

  • Thanks for the reply. The non-numeric can be anything not just 'K'.

  • here is a user function then

    create FUNCTION RTRIMNONNUMERICS

    (

    @inStr nvarchar(200)

    )

    RETURNS nvarchar(200)

    AS

    BEGIN

    DECLARE @len int;

    DECLARE @pos int;

    DECLARE @asc int;

    set @pos = 1;

    set @len = len(@inStr);

    WHILE (@pos <= @len)

    BEGIN

    set @asc = ascii(substring(@inStr,@pos,1));

    IF (@asc < 48 or @asc > 57)

    BREAK

    ELSE set @pos = @pos + 1

    END

    RETURN left(@inStr,@pos-1)

    END

    GO

  • The loop will tend to incur a rather severe penalty on larger sets. You could use something like this (note that this syntax could simply be used inside of a single SQL query, instead of looping through).

    declare @STR varchar(250)

    set @STR='0102030405069p9987754613146)'

    select left(@str,patindex('%[^0-9]%',@str)-1)

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • very nice

  • This function worked perfectly for what I needed to do. Thank you very much for your help.

  • mmiruru (12/10/2008)


    This function worked perfectly for what I needed to do. Thank you very much for your help.

    First, it's not nice to double post.

    Second, the function may work perfectly for what you needed to do, but it also works very slowly compared to the method that Matt Miller showed. Save yourself a headache later and do it right the first time... don't use the function with the WHILE loop in it..

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I loved the Matt's solution too and learned how to use patindex. But I dont agree about your reasoning. patindex itself is a function and I can not imagine they did not use while or a for loop in it.

    The reason should be using an existing function over a user defined function anytime. I would like to know how you measured the response on these two approaches as well.

    Thanks

  • engintoksoz (12/11/2008)


    I loved the Matt's solution too and learned how to use patindex. But I dont agree about your reasoning. patindex itself is a function and I can not imagine they did not use while or a for loop in it.

    The reason should be using an existing function over a user defined function anytime. I would like to know how you measured the response on these two approaches as well.

    Thanks

    Of course it has a loop in it... but it's at a machine language level which makes it very fast. Explicit WHILE loops are not at a machine language level... and that makes them slow.

    I'll give you some test code soon, so you can see for yourself...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok... here it is... first, we need to make a UDF from Matt's code to drive a point home later...

    CREATE FUNCTION dbo.MillerCode

    (

    @STR VARCHAR(8000)

    )

    RETURNS VARCHAR(8000)

    AS

    BEGIN

    RETURN (LEFT(@STR,PATINDEX('%[^0-9]%',@STR)-1))

    END

    ... and, we need some test data followed by the measured responses of Matt's code, Matt's code in a UDF, and your looping UDF...

    --===== Create a 10,000 row test table with a PK

    PRINT '===== Test table creation ====='

    SET STATISTICS TIME ON

    SELECT TOP 10000

    IDENTITY(INT,1,1) AS RowNum,

    '0102030405069p9987754613146)' AS FindNumInString

    INTO #jbmTest

    FROM Master.dbo.SysColumns sc1

    CROSS JOIN Master.dbo.SysColumns sc2

    ALTER TABLE #jbmTest

    ADD PRIMARY KEY CLUSTERED (RowNum)

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',100)

    --===== Test Matt Miller's code

    PRINT '===== Matt Miller''s code ====='

    SET STATISTICS TIME ON

    SELECT LEFT(FindNumInString,PATINDEX('%[^0-9]%',FindNumInString)-1)

    FROM #jbmTest

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',100)

    PRINT '===== Matt Miller''s code in a UDF ====='

    SET STATISTICS TIME ON

    SELECT dbo.MillerCode(FindNumInString)

    FROM #jbmTest

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',100)

    --===== Test engintoksoz's function

    PRINT '===== engintoksoz''s function ====='

    SET STATISTICS TIME ON

    SELECT dbo.RTRIMNONNUMERICS(FindNumInString)

    FROM #jbmTest

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',100)

    --===== Cleanup for reruns

    DROP TABLE #jbmTest

    ... and that gives us the following results...

    ===== Test table creation =====

    SQL Server Execution Times:

    CPU time = 32 ms, elapsed time = 45 ms.

    (10000 row(s) affected)

    SQL Server parse and compile time:

    CPU time = 4 ms, elapsed time = 4 ms.

    SQL Server Execution Times:

    CPU time = 32 ms, elapsed time = 208 ms.

    SQL Server Execution Times:

    CPU time = 47 ms, elapsed time = 213 ms.

    ====================================================================================================

    ===== Matt Miller's code =====

    (10000 row(s) affected)

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 363 ms.

    ====================================================================================================

    ===== Matt Miller's code in a UDF =====

    (10000 row(s) affected)

    SQL Server Execution Times:

    CPU time = 1000 ms, elapsed time = 2365 ms.

    ====================================================================================================

    ===== engintoksoz's function =====

    (10000 row(s) affected)

    SQL Server Execution Times:

    CPU time = 23000 ms, elapsed time = 51545 ms.

    ====================================================================================================

    So let's do a little analysis... first, you said that it was likely the fact that you used a UDF and that would make it slow. True, using a UDF does have a little overhead... but that's not why I said your code would be slow. Take a look at the times between "Matt Miller's code in a UDF" and "engintoksoz's function" in the results above. Both are UDF's, but the PatIndex code blows the WHILE loop away. And, if you accept the fact that Matt's code doesn't even need a UDF as shown as "Matt Miller's code" in the results above, you have two reasons not to use a WHILE loop... Matt's code doesn't need a While loop which also helps avoid the need for a UDF.

    As you can see, it's not the UDF that makes such a big difference and that, despite it's internal loop, PatIndex blows a While loop away. Any other questions about why While loops are bad or how I "measured the response on these two approaches"? 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • engintoksoz (12/11/2008)


    ... and learned how to use patindex.

    As a sidebar, what's the first thing you do when you buy a new car? You find out how to operate the controls like turning on the head lights, how to operate the wipers, where the emergency brake is and how to operate it, and how to operate the heater/AC, etc, etc.

    The very first thing you should do when learning any computer language be it SQL or whatever, is to learn to operate the controls. That means that one of the very first things you should do is take a couple of hours to find out what all the functions of the language are, how they operate, and particularlly in SQL, how fast they are. It's the only way to avoid things like the difference between 0 CPU ms and 23 whole seconds for a mere 10,000 row run. Imagine if it were a million rows...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Well done Jeff. Thanks.

  • engintoksoz (12/12/2008)


    Well done Jeff. Thanks.

    Thanks for the feedback, engintoksoz. 🙂 If you have any doubts about any of this, please don't hesitate to ask... I'll see if I can whip up another example if there's something else.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply