O(n) , O(n log n), O(n^2)

  • RBarryYoung (6/9/2009)


    Jeff Moden (6/9/2009)


    O(wtf2) 😉

    Heh. Hey, I'm not the one who asked the question, I'm just trying to answer them.

    Oh no... sorry, Barry. Wasn't directed at anyone in particular.

    Shifting gears,

    If someone asked me to define it so that a layman could understand it, I'd tell them the "O" is an abbreviation for "Order of Magnitude" and the the formula it contains is nothing more than the part of another formula that's going to have the biggest effect on things... that's it's just shorthand to make comparing some more complex formulas easier to compare as a swag (order of magnitude).

    For example, the forumula for the number of rows that a triangular join will touch in SQL Server is ((N2+N)/2)+N. If you strip away all the small stuff that won't impact the result much when the numbers get really big, you end up with O(N2/2). That's much easier to compare to a cartesian join which is both N2 and O(n2) to say that a triangular join is approximately half as bad as the square join.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (6/10/2009)


    If you strip away all the small stuff that won't impact the result much when the numbers get really big, you end up with O(N2/2). That's much easier to compare to a cartesian join which is both N2 and O(n2) to say that a triangular join is approximately half as bad as the square join.

    Actually, you strip away the constant factors (like 1/2) too, so that the complexity order is also O(n2). Thus the complexity of all square and all triangular algorithms have the same order.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (6/10/2009)


    Jeff Moden (6/10/2009)


    If you strip away all the small stuff that won't impact the result much when the numbers get really big, you end up with O(N2/2). That's much easier to compare to a cartesian join which is both N2 and O(n2) to say that a triangular join is approximately half as bad as the square join.

    Actually, you strip away the constant factors (like 1/2) too, so that the complexity order is also O(n2). Thus the complexity of all square and all triangular algorithms have the same order.

    Understood and I agree. The point I was really trying to make is that for all values this side of infinity that will still fit inside SQL Server, the /2 is important. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (6/12/2009)


    RBarryYoung (6/10/2009)


    Jeff Moden (6/10/2009)


    If you strip away all the small stuff that won't impact the result much when the numbers get really big, you end up with O(N2/2). That's much easier to compare to a cartesian join which is both N2 and O(n2) to say that a triangular join is approximately half as bad as the square join.

    Actually, you strip away the constant factors (like 1/2) too, so that the complexity order is also O(n2). Thus the complexity of all square and all triangular algorithms have the same order.

    Understood and I agree. The point I was really trying to make is that for all values this side of infinity that will still fit inside SQL Server, the /2 is important. 😛

    Yes, but the point is that you do not use constant factors like the /2 in big-O notation or the "complexity order" of something. This is because the purpose of complexity order analysis is to separate algorithmic efficiency (the converse of complexity), from implementation efficiency which is dependent on HW performance, SW efficiency, and the developer's code efficiency.

    What this means is that an efficiently implemented Square algorithm can beat an inefficient Triangular algorithim, all the way up to infinity. But no matter how efficient the implementation of an O(n2) algorithm like simple string concatenation with pseudocursors is, it will always eventually lose out to an algorithm of lower complexity order no matter how inefficient it is, like FOR XML string concatenation, which is a very inefficient O(n).

    So what big-O notation tells us is that while I might be able to tune a Square algorithim to beat a Triangular one because they are both O(n2), I will never be able to tune a Triangular algorithm (O(n2) )to beat a Linear one (O(n)). Except at the low end where it's less important anyway.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (6/13/2009)


    Jeff Moden (6/12/2009)


    RBarryYoung (6/10/2009)


    Jeff Moden (6/10/2009)


    If you strip away all the small stuff that won't impact the result much when the numbers get really big, you end up with O(N2/2). That's much easier to compare to a cartesian join which is both N2 and O(n2) to say that a triangular join is approximately half as bad as the square join.

    Actually, you strip away the constant factors (like 1/2) too, so that the complexity order is also O(n2). Thus the complexity of all square and all triangular algorithms have the same order.

    Understood and I agree. The point I was really trying to make is that for all values this side of infinity that will still fit inside SQL Server, the /2 is important. 😛

    Yes, but the point is that you do not use constant factors like the /2 in big-O notation or the "complexity order" of something. This is because the purpose of complexity order analysis is to separate algorithmic efficiency (the converse of complexity), from implementation efficiency which is dependent on HW performance, SW efficiency, and the developer's code efficiency.

    What this means is that an efficiently implemented Square algorithm can beat an inefficient Triangular algorithim, all the way up to infinity. But no matter how efficient the implementation of an O(n2) algorithm like simple string concatenation with pseudocursors is, it will always eventually lose out to an algorithm of lower complexity order no matter how inefficient it is, like FOR XML string concatenation, which is a very inefficient O(n).

    So what big-O notation tells us is that while I might be able to tune a Square algorithim to beat a Triangular one because they are both O(n2), I will never be able to tune a Triangular algorithm (O(n2) )to beat a Linear one (O(n)). Except at the low end where it's less important anyway.

    You're absolutely correct and preaching to the choir. I should have just said that it was O(n2) and left it at that because that would have been the correct thing to do.

    Still, a triangular join will beat a square join by a factor of two every time. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (6/13/2009)


    You're absolutely correct and preaching to the choir. I should have just said that it was O(n2) and left it at that because that would have been the correct thing to do.

    Still, a triangular join will beat a square join by a factor of two every time. 😛

    Heh. Thanks, now I have to spend the rest of the day coming up with a count-example? 😛

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Experts,

    is there any solution available for "Mass concatenate" in SQL 2008 OR SQL 2012?

    karthik

  • GilaMonster (6/8/2009)


    karthikeyan (6/8/2009)


    1) O(n) means ?

    2) O(n^2) means ?

    3) O(n log n) means ?

    Surely you covered the Big O notation for algorithmic (time) complexity at university? It's a fundamental of Comp Sci theory.

    Try these

    http://en.wikipedia.org/wiki/Computational_complexity_theory

    http://en.wikipedia.org/wiki/Big_O_notation

    It's not often I disagree with Gail, but here I have to say that those references probably are not useful.

    The first is off into esoteric stuff way beyond the level of a simple primer, going so far as to talk about open problems, and surely anything that far beyond primer level is unlikely to be helpful to someone asking this question. The second gives a formal definition which makes it correct to say that the constant function C1(x) = 1 is O(exp(exp(exp(x)))), and the identity function I(x) = x is O(x^x), which certainly is not going to give someone who wants to know what we mean by it anything like the right idea.

    So I'm going to offer a description for laymen:

    suppose we have an algorithm A which given n rows of input (or count rows of output instead, if that is useful) will on average have to do W(n) units of work and use S(n) amount of store. Then we say W(n) is O(n) if (for big enough n) when n doubles then W(n) roughly doubles. We say W(n) is O(n^2) if (for big enough n) when n doubles W(n) goes up by roughly a factor of 4 (2 = 2^2) and when n goes up by a factor of 10 W(n) goes up roughly by a factor of 100. Work complexity W(n) is O(n) means the amount of work is roughly proportional to n, work complexity W(n) is O(n^2) means the amount of work is roughly proportional to n^2. The same for any other function in the brackets after the big O: if the work complexity is O(n log(n)) then for big enough the amount of work is roughly proportional to n*log(n), if the work complexity is O(2^n) each extra row roughly doubles the amount of work, and so on. The idea of Space complexity S(n) being O(g(n)) for some function g is much the same.

    Obviously a formal mathematical definition is a bit different, but for practical purposes when working with database stuff the description above is what is actually needed. Of course you won't solve any of the currently unresolved mathematical issues in complexity theory with that description, but that's not what (most) database people want to do.

    edit: woops, just noticed that this thread started a very long time ago, and my comments are on a very early post, not on today's post, so may no longer be of any interest.

    Tom

  • karthikeyan-444867 (9/28/2012)


    Experts,

    is there any solution available for "Mass concatenate" in SQL 2008 OR SQL 2012?

    People have already referred to teh "for xml" method, which is O(n) so for larg enough n withh be much better than the other methods discussed (which are all O(n^2)).

    Tom

  • Hello All,

    I just refreshed all my posts.

    This is a very old post (asked on 2009) 🙂

    How to identify whether a code fall under O(n) / O (n2) / O(n log n) by seeing the code ?

    karthik

  • karthik M (3/16/2013)


    Hello All,

    I just refreshed all my posts.

    This is a very old post (asked on 2009) 🙂

    How to identify whether a code fall under O(n) / O (n2) / O(n log n) by seeing the code ?

    Most of us don't try and analyze our code to determine if it is O(n), O(n^2), or O(n log n). We test our code against expect data loads, and then unexpected loads (the million row test) and tune our code appropriately to achieve a scalable solution.

  • karthik M (3/16/2013)


    How to identify whether a code fall under O(n) / O (n2) / O(n log n) by seeing the code ?

    http://en.wikipedia.org/wiki/Analysis_of_algorithms

    Should get you started. The list of references and further reading too.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 12 posts - 31 through 41 (of 41 total)

You must be logged in to reply to this topic. Login to reply