O(n) , O(n log n), O(n^2)

  • karthikeyan (6/9/2009)


    1) SQL Server apparently (based on my tests) already does the "buffer-extension" trick available to mutable strings. Unfortunately, the Extension trick does NOT solve the O(n^2) problem,

    If it supports the "buffer-extension" trick,then why it doesn't solve O(n^2) problem. again , i need to know exactly about O(n^2).

    Because, as I already explained, it can only be used part of the time.

    can you explain it with example? so that i can remember it through my career.

    I already did this above, here it is again:

    The "buffer-extension trick" is referring to a low-level (Assembler or C) trick that takes advantage of that fact that in most OS's strings often have large gaps of unallocated memory after them. Thus if you want to add the string B$ onto the end of A$, then normally you would have to allocate a new buffer whose length is >= Len(A$) + Len(B$), then copy all of A$ inot it, then copy B$ in after A$ and then re-point A$ to the new buff (and deallocate the old). Besides all of the memory allocation overhead, this is an O(n) operation where n = Len(A$)+Len(B$).

    However, if A$ just happens to have an unused portion of memory after it whose size is >= Len(B$) then, instead you can just copy B$ into that unused memory after A$ and extend A$'s length value to inlcude the appended characters.

    Please read fully what we have taken the time to write and stop making us repeat ourselves.

    Does this issue solved in sql2008?

    The "buffer-extension trick" can never solve this problem. It is a trick that can only be used if you are not already using the memory that is immediately above your output string. There is no way that an OS or DB environment can ever guarantee this all of the time. The only true solution is to use a different algorithm.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • So we should avoid pseudo cursor. Right?

    I never said anything like this.

    1) It will be treated as internal cursor. i.e it will perform looping internally.

    AFAIK all set processing in SQL Server uses internal cursors. And as long as CPU-cores execute single streams of instructions, all servers, programs and OS's will have to loop internally.

    But comparing to external cursor, it will be very fasy. Right?

    Yes.

    2) is it a good habit to use pseudo cursor in sql programming? if not, what is the workaround for this?

    It is a technique. It is not the most desirable technique, but neither is it entirely undesirable. And sometimes it is the best technique available for a problem.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Declare @STR varchar(5)

    select @STR = ''

    select @STR = @STR + no from mystring

    step 1: '' + '1'

    step 2: '1' + '2'

    step 3: '12' + '3'

    step 4: '123' + '4'

    step 5: '1234' + '5'

    so if i apply ((N^2) + N)/2 formula

    N = 5

    ...

    5^2 = 5*5 = 25

    25+5 = 30/2 = 15. Right?

    O(n) operation where n = Len(A$)+Len(B$).

    so '1' + '2' is an O(n) operation.

    O(n^2) problem is

    1) Time to concatenate the string

    2) Space claimed by the variables

    3) Resource taken to do the above two operation

    karthik

  • Karthik, '1'+'2' is still O(n^2). What's 1 squared?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • RBarryYoung (6/8/2009)


    karthikeyan (6/8/2009)


    4) What LHS (Left-hand side) function denotes exactly?

    Left-hand Functions appear left of the assignment operator ("="):

    Set @STR = UPPER('Some text.')

    These are normal functions, and AFAIK, in T-SQL, all functions are LHS.

    3) What RHS (right-hand side) function denotes exactly?

    Right-hand Side functions appear on the right-hand side and they usually do "special" things having to do with addressing the output property or variable. AFAIK, T-SQL does not have any, but in some languages, STUFF is a RHS:

    STUFF(@str, offset, len) = 'foo'

    This example would overwrite the output string (@str) with the input string starting at 'offset' for 'len' characters. The difference between this and th LHS STUFF() function in T-SQL is that the RHS version does not return anything, it actually does write over the characters of the @STR variable.

    3) The "pre-allocate and Stuff" trick popular with mutable strings is not workable in T-SQL because the STUFF() function in T_SQL is NOT like the function of the same name in some general purpose languages: the T-SQL STUFF() is an RHS (right-hand side) function and NOT an LHS (left-hand side) function. AFAIK, there is no function in SQL that can (physically) write into a pre-existing string.

    RBarryyoung,

    The above two points are confusing me...

    you mean to say TSQL STUFF() function can't write into a pre-existing string.

    karthik

  • He's talking about memory allocation. I don't know how T-SQL handles variable buffers and such, but in C, you would actually rewrite the memory addresses (or at least, you can). That's different from what Stuff does, per his post. (Can't say for sure. Don't know enough about that level of the database engine.)

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • karthikeyan (6/9/2009)


    RBarryYoung (6/8/2009)


    3) The "pre-allocate and Stuff" trick popular with mutable strings is not workable in T-SQL because the STUFF() function in T_SQL is NOT like the function of the same name in some general purpose languages: the T-SQL STUFF() is an RHS (right-hand side) function and NOT an LHS (left-hand side) function. AFAIK, there is no function in SQL that can (physically) write into a pre-existing string.

    RBarryyoung,

    The above two points are confusing me...

    you mean to say TSQL STUFF() function can't write into a pre-existing string.

    Correct. Except for the TEXT-mod commands (UPDATETEXT, etc.) SQL Server does not have mutable strings for in-memory values, which simply means that technically, you can never "modify" a string in-place, you always have to make a new copy of it. (note: do not try to apply this to on-disk values, like columns in tables, because the terms don't mean the same thing there).

    Note that this is in apparent contradiction to the BOL entry for STUFF() which says: "Deletes a specified length of characters and inserts another set of characters at a specified starting point." However, STUFF() does nothing of the sort as demonstrated by this script:

    Declare @a varchar(20), @b-2 varchar(20), @C varchar(20)

    Select @a='', @b-2='123456789', @C='xyz'

    Select @a as [@a], @b-2 as [@b], @C as [@c]

    Set @a = STUFF(@b, 5, 3, @C)

    Select @a as [@a], @b-2 as [@b], @C as [@c]

    You will note that the @b-2 string is unchanged by STUFF().

    Now if BOL had prefaced that statement so that it read "Returns a string by copying the source string and then Deletes a specified length of characters and inserts another set of characters at a specified starting point." it would be correct, but as it currently stands, BOL is wrong.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • karthikeyan (6/9/2009)


    RBarryYoung (6/8/2009)


    3) The "pre-allocate and Stuff" trick popular with mutable strings is not workable in T-SQL because the STUFF() function in T_SQL is NOT like the function of the same name in some general purpose languages: the T-SQL STUFF() is an RHS (right-hand side) function and NOT an LHS (left-hand side) function. AFAIK, there is no function in SQL that can (physically) write into a pre-existing string.

    Oops, I just realized that I have a typo here. I reversed LHS and RHS in this statement. I will go back and fix it...

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Oh wait, thats not from here is it? That's from my comment on Phil Factor's blog some months ago, I must have mis-copied the headers when using the QUOTE button here.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Now I understood your point #2 and #3. But still #1 is not clear for me...

    {(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)}

    Again, if you could explain your formula's for my below example, i think it will be very useful for me as well as for others too.

    Declare @STR varchar(5)

    select @STR = ''

    select @STR = @STR + no from mystring

    step 1: '' + '1'

    step 2: '1' + '2'

    step 3: '12' + '3'

    step 4: '123' + '4'

    step 5: '1234' + '5'

    output of O(n) ?

    output of O(n^2). It is 15.

    output of {(n)*(n+1)/2}. Again 5*(5+1)/2 = 5*3 = 15

    output of {n +(n)*(n-1)/(2*k)}.

    If K= 1,

    =5+(5)*(5-1)/(2*1)

    =5+5*2

    =5+10

    =15

    {(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)}

    how?

    {(n)*(n+1)/2} = 15

    {n +(n)*(n-1)/(2*k)} = 15

    There is no difference.

    If K= 0,

    =5+(5)*(5-1)/(2*0)

    "Divide By Zero error occured".

    How should i determine 'k' value? on what basis i have to determine the value?

    but I could get it down to O(n*Log(n)) time which is still a huge improvement

    Is it denotes FOR XML option?

    or

    Is it denotes Pseudo cursor method?

    Can you explain it?

    karthik

  • Karthik,

    At this point I'd suggest stopping. Unless you are involved in theoretical research, you really don't need to figure out how to solve Big O mathematical formulas, just understand the O(n) algorithms are better than O(n2) algorithms. Know when to recognisize them, for instance a nested loop using cursors versus a set-based cross join for example.

    Plus, you still haven't answered Gail's original questions.

  • karthikeyan (6/9/2009)


    output of O(n) ?

    output of O(n^2). It is 15.

    They aren't formulas to have values plugged into and results calculated. They are descriptions of the time-complexity of algorithms.

    To say that O(n^2) is 15 is a meaningless statement (plus 5^2 is not 15. It's 25)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • karthikeyan (6/9/2009)


    {(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)}

    how?

    {(n)*(n+1)/2} = 15

    {n +(n)*(n-1)/(2*k)} = 15

    There is no difference.

    If K= 0,

    =5+(5)*(5-1)/(2*0)

    "Divide By Zero error occured".

    Right. the problem is that you are quoting my comments on Phil Factor's blog and I could't edit my typos there. Here is a more correct formula:

    k*n + (1-k)*(n2 - n)/2

    Note that this is still only an approximation because we are waving away the difference between the number of strings and the possibly different lengths of those strings.

    How should i determine 'k' value? on what basis i have to determine the value?

    I doubt that you could. AFAIK, I am the only person who has ever claimed that K was anything other than 0 (though if I am correct, then there must be folks inside Microsoft who know it also).

    but I could get it down to O(n*Log(n)) time which is still a huge improvement

    Is it denotes FOR XML option?

    or

    Is it denotes Pseudo cursor method?

    Neither. The naive pseudocursor method is O(n2). The FOR XML method is O(n). I was referring to the very complex modified pseudocursor method that I posted at Phil Factor's blog. It is O(n*log(n)).

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • O(wtf2) 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (6/9/2009)


    O(wtf2) 😉

    Heh. Hey, I'm not the one who asked the question, I'm just trying to answer them.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

Viewing 15 posts - 16 through 30 (of 41 total)

You must be logged in to reply to this topic. Login to reply