Interesting Query

  • Hand-coding the SQLCLR doesn't help much either; it's nowhere near Dwain's 'REVERSE'

    ================================================================================

    DWAIN

    ================================================================================

    SQL Server Execution Times:

    CPU time = 998 ms, elapsed time = 998 ms.

    ================================================================================

    MARK'S SQLCLR

    ================================================================================

    SQL Server Execution Times:

    CPU time = 2169 ms, elapsed time = 2184 ms.

    ================================================================================

    MARK'S IMPROVED SQLCLR

    ================================================================================

    SQL Server Execution Times:

    CPU time = 1825 ms, elapsed time = 1825 ms.

    using System;
    using System.Data.SqlTypes;
    using System.Text;

    public partial class UserDefinedFunctions
    {
        [Microsoft.SqlServer.Server.SqlFunction]
        public static SqlString StripZeros(SqlString s)
        {
            if (s.IsNull)
            {
                return SqlString.Null;
            }

            String val = s.Value;

            // Walk forward past any leading zeros...
            int startPos = 0;
            while (startPos < val.Length && val[startPos] == '0') startPos++;

            // ...then backward past any trailing zeros.
            int endPos = val.Length - 1;
            while (endPos >= startPos && val[endPos] == '0') endPos--;

            return (SqlString)(val.Substring(startPos, endPos - startPos + 1));
        }
    }

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • I guess that to answer the OP's question, any of the T-SQL solutions would do; I also think the interviewer would be looking for an answer in SQL. I wouldn't go with CLR for this in an interview. However, I wouldn't try to discuss performance aspects there either. Based on the given 6 records? Who would care?

    If you want to come up with more real-life cases... it will all depend.

    Yes, if all of the strings can be converted to integer or bigint, then T-SQL would probably do well enough.

    But in the most general case, this sort of task can be handled by CLR with better performance and flexibility.

    Let's have a look at my example.

    Let's say that the strings are longer than 20 characters and they have more than one trailing zero...

    In T-SQL, only the REPLACE method (Mark's) would work, as trying to convert to a number will fail as soon as the significant portion of the string is longer than 38 characters (the maximum NUMERIC precision).
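    (A quick side illustration of that ceiling, not part of the timing tests below; NUMERIC(38,0) is just the widest exact numeric type available to target:)

    -- 38 significant digits is the maximum NUMERIC precision, so this converts fine
    -- (the leading zeros don't count towards precision):
    SELECT CAST('0000' + REPLICATE('9', 38) AS NUMERIC(38,0));

    -- One more significant digit and the conversion fails with an arithmetic overflow error:
    -- SELECT CAST('0000' + REPLICATE('9', 39) AS NUMERIC(38,0));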

    Here is my CLR (please note the proper Regex expression for removing leading and trailing zeros):

    using System;
    using System.Data.SqlTypes;
    using System.Text.RegularExpressions;

    public partial class UserDefinedFunctions
    {
        // Compiled pattern: one or more zeros anchored at the start OR at the end of the string.
        static readonly Regex _regex = new Regex(@"^[0]+|[0]+$", RegexOptions.Compiled);

        [Microsoft.SqlServer.Server.SqlFunction]
        public static SqlString TrimLTZr(SqlString val)
        {
            if (val.IsNull) return SqlString.Null;
            return _regex.Replace(val.ToString(), String.Empty);
        }
    }

    The test table setup (let's add more trailing zeros :-)):

    IF OBJECT_ID('tempdb..#sampleData') IS NOT NULL
    BEGIN
        DROP TABLE #sampleData;
    END;

    --1,000,000 random rows of data
    SELECT TOP 1000000
           IDENTITY(INT,1,1) AS [Sno],
           '0000' + REPLICATE(CAST((ABS(CHECKSUM(NEWID())) + 1) AS VARCHAR(500)), RAND() * 10) + '00000000000000' AS [Column]
    INTO #sampleData
    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3;

    Only Mark's method and the CLR are tested, as the ones that convert to numeric will not work:

    DECLARE @HOLDER VARCHAR(500);

    PRINT REPLICATE('=',80);
    PRINT 'MARK';
    PRINT REPLICATE('=',80);
    SET STATISTICS TIME ON;

    SELECT @HOLDER = REPLACE(LTRIM(RTRIM(REPLACE([Column] COLLATE Latin1_General_BIN2,'0',' '))),' ','0')
    FROM #sampleData;

    SET STATISTICS TIME OFF;

    PRINT REPLICATE('=',80);
    PRINT 'CLR my';
    PRINT REPLICATE('=',80);
    SET STATISTICS TIME ON;

    SELECT @HOLDER = [dbo].[TrimLTZr]([Column])
    FROM #sampleData;

    SET STATISTICS TIME OFF;

    And this is what I got as results:

    ================================================================================

    MARK

    ================================================================================

    SQL Server Execution Times:

    CPU time = 3984 ms, elapsed time = 3997 ms.

    ================================================================================

    CLR my

    ================================================================================

    SQL Server Execution Times:

    CPU time = 3313 ms, elapsed time = 3455 ms.

    Is it always like that? No, it's not. It depends on what kind of values are generated in the sample tables. But across executions on 10 different sample sets, only once did Mark's version perform around 100 ms faster.

    Actually, I generated 1,000,000 identical records of '00000000010000000000', and in that case both methods showed very similar times.

    Please note: when declaring the CLR function in SQL, you can vary the length of the string input parameter and the string return value. They can be set to NVARCHAR(MAX), but that will perform much more slowly than limiting them to the length you really need, e.g. NVARCHAR(20) or NVARCHAR(550). The registration might look like the sketch below.
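    (This is just a sketch of the binding; MyClrAssembly is a made-up assembly name, so substitute whatever yours is actually called:)

    -- Registration with a sensible length limit; declaring both the parameter and the
    -- return value as NVARCHAR(MAX) instead is what slows things down.
    CREATE FUNCTION dbo.TrimLTZr (@val NVARCHAR(550))
    RETURNS NVARCHAR(550)
    AS EXTERNAL NAME MyClrAssembly.UserDefinedFunctions.TrimLTZr;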

    And sorry for typing in capitals; I didn't intend to shout, I just wanted to make the sentence appear bold, as it's quite an important tip.

    And for Dwain,

    I would do SELECT 0+REPLACE... instead of SELECT 1*REPLACE...

    it will take fewer CPU cycles to calculate...

    Actually, I'm not sure; doesn't an explicit cast to INT perform a bit faster than the implicit one caused by the calculation?

    :hehe:

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help

  • Eugene Elutin (8/10/2012)


    And sorry for typing in capitals; I didn't intend to shout, I just wanted to make the sentence appear bold, as it's quite an important tip.

    Apology accepted; I may have become offended too easily.

    Eugene Elutin (8/10/2012)


    And for Dwain,

    I would do SELECT 0+REPLACE... instead of SELECT 1*REPLACE...

    it will take fewer CPU cycles to calculate...

    Actually, I'm not sure; doesn't an explicit cast to INT perform a bit faster than the implicit one caused by the calculation?

    :hehe:

    I believe that the explicit cast and the implicit cast are in this case equivalent.



    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Cadavre (8/10/2012)


    Eugene Elutin (8/10/2012)


    And sorry for typing in capitals; I didn't intend to shout, I just wanted to make the sentence appear bold, as it's quite an important tip.

    Apology accepted; I may have become offended too easily.

    Eugene Elutin (8/10/2012)


    And for Dwain,

    I would do SELECT 0+REPLACE... instead of SELECT 1*REPLACE...

    it will take fewer CPU cycles to calculate...

    Actually, I'm not sure; doesn't an explicit cast to INT perform a bit faster than the implicit one caused by the calculation?

    :hehe:

    I believe that the explicit cast and the implicit cast are in this case equivalent.

    Try it.... you're in for a surprise. It's not a whole lot in this case (about 2%) but it's still an improvement if you use the integer conversion route.

    PRINT REPLICATE('=',80);
    PRINT 'Implicit Cast';
    PRINT REPLICATE('=',80);
    SET STATISTICS TIME ON;

    SELECT @HOLDER = REVERSE(REVERSE([Column]+0)+0)
    FROM #sampleData;

    SET STATISTICS TIME OFF;
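    For reference, an explicit-cast counterpart might look something like the sketch below (not necessarily the exact code behind the numbers above, and it assumes the values fit in INT, just like the implicit [Column]+0 version does):

    PRINT REPLICATE('=',80);
    PRINT 'Explicit Cast';
    PRINT REPLICATE('=',80);
    SET STATISTICS TIME ON;

    -- Same double-REVERSE trick, with the string <-> integer conversions spelled out explicitly.
    SELECT @HOLDER = REVERSE(CONVERT(VARCHAR(20), CONVERT(INT, REVERSE(CONVERT(VARCHAR(20), CONVERT(INT, [Column]))))))
    FROM #sampleData;

    SET STATISTICS TIME OFF;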

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Eugene Elutin (8/10/2012)


    However, I wouldn't try to discuss performance aspects there as well. Based on given 6 records? Who would care?

    Funny you should ask. πŸ˜€

    If someone wrote the same code as Mark in an interview with me, just imagine my joy if someone responded to the next question of "Why did you write it this way" with "Although a doubly reversed implicit conversion to INT would be slightly faster, it's not as scalable whereas this way we never have to worry about what we throw at it and it still gives great performance rivaling even a CLR over a much larger row set".

    Right after I found my jaw, my next question would be "When can you start"? πŸ˜‰

    Instead, I was trying to find DBAs and Developers that even know the basics (we did find a good one, though). My favorite question is becoming "Discuss the differences between a Clustered and Non-Clustered Index". 95% of the time I'm told "The Clustered index is always the PK and..."
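    (For the record, a tiny counter-example to that answer; the table and index names below are made up purely for illustration:)

    -- A primary key does NOT have to be the clustered index.
    -- Here the PK is declared NONCLUSTERED and the clustered index lives on a different column.
    CREATE TABLE dbo.OrderDemo
    (
        OrderID   INT IDENTITY(1,1) NOT NULL
            CONSTRAINT PK_OrderDemo PRIMARY KEY NONCLUSTERED,
        OrderDate DATETIME NOT NULL,
        Amount    MONEY    NOT NULL
    );

    -- One clustered index per table; it stores the table's data in key order.
    CREATE CLUSTERED INDEX CIX_OrderDemo_OrderDate
        ON dbo.OrderDemo (OrderDate);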

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Wow! I'm glad I took the weekend off. Gave you guys a chance to get it all sorted out.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
    Since random numbers are too important to be left to chance, let's generate some!
    Learn to understand recursive CTEs by example.

  • Jeff Moden (8/10/2012)


    ...My favorite question is becoming "Discuss the differences between a Clustered and Non-Clustered Index". 95% of the time I'm told "The Clustered index is always the PK and..."

    Heh - we used this a couple of gigs ago and it works a treat. Also "Discuss what might happen if a column from a left-joined table is referenced in the WHERE clause".
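    (A quick sketch of what that question is fishing for; #t1 and #t2 are just made-up scratch tables:)

    -- Scratch tables just for the demo
    CREATE TABLE #t1 (ID INT PRIMARY KEY);
    CREATE TABLE #t2 (ID INT, SomeCol VARCHAR(10));
    INSERT #t1 VALUES (1), (2);
    INSERT #t2 VALUES (1, 'X');

    -- Filtering on the outer table in the WHERE clause quietly turns the LEFT JOIN
    -- into an INNER JOIN, because NULLs from #t2 can never satisfy the predicate.
    -- Returns only row 1.
    SELECT t1.ID, t2.SomeCol
    FROM #t1 t1
    LEFT JOIN #t2 t2 ON t2.ID = t1.ID
    WHERE t2.SomeCol = 'X';

    -- The same filter moved into the ON clause keeps the outer join semantics.
    -- Returns rows 1 and 2 (SomeCol is NULL for row 2).
    SELECT t1.ID, t2.SomeCol
    FROM #t1 t1
    LEFT JOIN #t2 t2 ON t2.ID = t1.ID
                    AND t2.SomeCol = 'X';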

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
