Funny calculation and grouping problem

  • Hi Ninja,

    I wonder if the conversion is outside of the aggregate, would perf improves?

    'avg'=sum(Value)/cast(COUNT(*) as float)

    ...ignoring sum() int overflow scenario.

  • Run a quick test on my dev-server (1proc 4cores 8GB, SSD)

    a+0.0 : 1700ms

    a*1.0 : 1606ms

    and

    declare @r decimal(38,20)

    select @r = SUM(a)*1.0 -- cast(COUNT(*) as float)

    from remi

    : 733ms

    --overflow coverage

    declare @r decimal(38,20)

    select @r = SUM(cast(a as float))

    from remi

    : 982ms

  • Flexdog (8/2/2011)


    Hi Ninja,

    I wonder if the conversion is outside of the aggregate, would perf improves?

    'avg'=sum(Value)/cast(COUNT(*) as float)

    ...ignoring sum() int overflow scenario.

    On my machine, that runs about 3 times faster on Remi's test data.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ninja's_RGR'us (8/2/2011)


    Jeff Moden (8/2/2011)[hr

    I haven't actually done a performance test but I know enough about machine language at the cycle level to say that it will always take fewer cycles to do an ADD than it will to do a MULTIPLY. I suspect that it'll take millions of rows to really make a difference, though.

    Wrong on my machine. I gave this theory all the chance in the world but it just fails every test. Multiply is the winner on my prod server (under no load ATM).

    Wrong on my machine, as well. I should have tested it before opening my yapper. And, no... I can't explain it unless it knows to "short circuit" when multiplying by 1.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (8/3/2011)


    Flexdog (8/2/2011)


    Hi Ninja,

    I wonder if the conversion is outside of the aggregate, would perf improves?

    'avg'=sum(Value)/cast(COUNT(*) as float)

    ...ignoring sum() int overflow scenario.

    On my machine, that runs about 3 times faster on Remi's test data.

    Of course it'll run faster if you convert only once VS 10 000 000 times. That theory needs little proof to be accepted as obvious truth ;-).

  • Jeff Moden (8/3/2011)


    Ninja's_RGR'us (8/2/2011)


    Jeff Moden (8/2/2011)[hr

    I haven't actually done a performance test but I know enough about machine language at the cycle level to say that it will always take fewer cycles to do an ADD than it will to do a MULTIPLY. I suspect that it'll take millions of rows to really make a difference, though.

    Wrong on my machine. I gave this theory all the chance in the world but it just fails every test. Multiply is the winner on my prod server (under no load ATM).

    Wrong on my machine, as well. I should have tested it before opening my yapper. And, no... I can't explain it unless it knows to "short circuit" when multiplying by 1.

    Yap away... always fun to test new stuff! 😀

  • I just had a very quick glance at the xml plan for both operations and I can't see any difference.

    Will leave that here for now. Other things to test ATM ;-).

  • Ninja's_RGR'us (8/3/2011)


    Jeff Moden (8/3/2011)


    Flexdog (8/2/2011)


    Hi Ninja,

    I wonder if the conversion is outside of the aggregate, would perf improves?

    'avg'=sum(Value)/cast(COUNT(*) as float)

    ...ignoring sum() int overflow scenario.

    On my machine, that runs about 3 times faster on Remi's test data.

    Of course it'll run faster if you convert only once VS 10 000 000 times. That theory needs little proof to be accepted as obvious truth ;-).

    Understood... I was just confirming the order of magnitude because such considerations are important to folks that might not know such things... especially since I didn't remember to do it. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • And I was lazily replying to 2 comments at the same time. 😀

    I knew YOU know that Jeff, but that truth is easy to miss like the forest for the trees!

  • This was removed by the editor as SPAM

Viewing 10 posts - 16 through 24 (of 24 total)

You must be logged in to reply to this topic. Login to reply