Testing that a varchar could be converted to a uniqueidentifier

  • David Betteridge (7/4/2012)


    just tried another clr version which multiple functions which return different data types (as below). and used Jeff's test harness (attached).

    there was no clear winner!

    I suspect that the C# code runs very fast for this and that the reason why they came out so even is because they are so fast compared to the cost of invoking them.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Gullimeel (7/4/2012)


    In most all cases (index or not), the simple "Jeff" function held it's own against the CLR (even after removing the "bit" penalty where it finally passed about half the time) and the "Gullimeel" function didn't catch up and then pass until more than 50% of the data was non-GUID data.

    I already mentioned that my method wil be better when most of the data is non guid..

    I used the same number of rows that Gullimeel used (121,317) for each test. I'm not sure where the "47ms" time came from in his tests unless he had some massive parallelism going on. I'm limited to only 2 - 2.8GHz processors on my laptop for testing. It's also why I like to see complete test harnesses so that folks don't have to try to recreate what actually happened.

    Your machine is much powerful than mine for 100% GUID data I was getting 2059ms for your method and some 3400ms for mine wheras in your case it is taking much less.

    One reason why you did not see 47ms is because you are using the dbcc freeproccache and I mentioned that I ran the test two times and used the values for 2nd run.

    Also, these timings sometime are not perfect because there are other things running on machines. But this is just to get an idea on when to use which ,off course after doing through testing.

    I brought up the fact that yours works better if the non-GUIDs are predominate just because it was a result of the testing and I was confirming what you said.

    I also (this morning) used the original test you built and did manage to get the 47ms (mine came out at 491 for the same thing). It was pretty late and didn't take out the DBCC FREPROCCACHE command. I should have given that a try last night.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (7/4/2012)

    I suspect that the C# code runs very fast for this and that the reason why they came out so even is because they are so fast compared to the cost of invoking them.

    I think you might be right, I've created my clr code as a user defined aggregate and it reduced the time of my method from 700ms to 400ms (ish)

    CREATE AGGREGATE CountMatchingGUIDS(@input nvarchar(4000))

    RETURNS int

    EXTERNAL NAME IsGUID.GUIDsAgg;

    GO

    of course it doesn't meet the original requirements of this post. 🙂

  • Instead of using a function to test the ability to convert to a uniqueidentifier before actually converting it, wouldn't it be faster to do this inline with a case statement and either return the string converted to a uniqueidentifier or null as the output?

    select

    a.*,

    MyUID =

    case

    when a.MyStrGuid like

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]-'+

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F]-'+

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F]-'+

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F]-'+

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]'+

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F]'

    then convert(uniqueidentifier,a.MyStrGuid)

    else null end

    from

    ( -- Test Data

    select MyStrGuid = '72E28A38-8A55-4F54-AE61-F870BA58E9FD' union all

    select MyStrGuid = 'F7E9803C-75E6-4ED7-915C-564710B8EA16' union all

    select MyStrGuid = 'AFAE2032-8677-4E8F-9921-B783BEA8C433' union all

    select MyStrGuid = '72E28A38-8A554F54-AE61-F870BA58E9FD' union all

    select MyStrGuid = 'F7E9803C-75E6-4ED7-915C=564710B8EA16' union all

    select MyStrGuid = 'AFAE2032-8677-4E8F-9921-G783BEA8C433' union all

    select MyStrGuid = 'AFAE2032-8677-4E8-9921-B783BEA8C433' union all

    select MyStrGuid = ''union all

    select MyStrGuid = null

    ) a

    Results:

    MyStrGuid MyUID

    ------------------------------------ ------------------------------------

    72E28A38-8A55-4F54-AE61-F870BA58E9FD 72E28A38-8A55-4F54-AE61-F870BA58E9FD

    F7E9803C-75E6-4ED7-915C-564710B8EA16 F7E9803C-75E6-4ED7-915C-564710B8EA16

    AFAE2032-8677-4E8F-9921-B783BEA8C433 AFAE2032-8677-4E8F-9921-B783BEA8C433

    72E28A38-8A554F54-AE61-F870BA58E9FD NULL

    F7E9803C-75E6-4ED7-915C=564710B8EA16 NULL

    AFAE2032-8677-4E8F-9921-G783BEA8C433 NULL

    AFAE2032-8677-4E8-9921-B783BEA8C433 NULL

    NULL

    NULL NULL

    (9 row(s) affected)

  • Michael Valentine Jones (7/4/2012)


    Instead of using a function to test the ability to convert to a uniqueidentifier before actually converting it, wouldn't it be faster to do this inline with a case statement and either return the string converted to a uniqueidentifier or null as the output?

    Use the test harness I created to find out. I suspect there won't be much (if any) time difference. You also lose the idea of easily reusable code by not having a function to do this but, like you, would easily give that up for such a simple thing if a change in performance made it worth it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (7/5/2012)


    Michael Valentine Jones (7/4/2012)


    Instead of using a function to test the ability to convert to a uniqueidentifier before actually converting it, wouldn't it be faster to do this inline with a case statement and either return the string converted to a uniqueidentifier or null as the output?

    Use the test harness I created to find out. I suspect there won't be much (if any) time difference. You also lose the idea of easily reusable code by not having a function to do this but, like you, would easily give that up for such a simple thing if a change in performance made it worth it.

    I'm not sure it is really an apples to apples test, since all the functions are meant to test if a string can be converted to a uniqueidentifier, while my code actually does the conversion directly, and is meant to be incorporated directly into the select statement in the the stored procedure posted by the OP.

    Note that I would also "fix" the stored proc to convert the input parameter to a uniqueidentifier instead of converting the primary key of the table to a varchar. That would probably have far more positive impact on performance than the code of the various functions, since the current version of the proc is almost certainly causing a table scan instead of a primary key index lookup.

    SELECT

    a.*

    FROM

    T1 a

    WHERE

    a.u_id =

    case

    when @id like

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]-'+

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F]-'+

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F]-'+

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F]-'+

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]'+

    '[0-9A-F][0-9A-F][0-9A-F][0-9A-F]'

    then convert(uniqueidentifier,@id )

    else null end

Viewing 6 posts - 31 through 35 (of 35 total)

You must be logged in to reply to this topic. Login to reply