There Must Be 15 Ways To Lose Your Cursors… Part 2: Just Put It in a S

  • "Business Logic"?!? That's a highly, um, "elastic" concept. I'm with Alex Papadimoulis on the topic:

    http://thedailywtf.com/Articles/The-Mythical-Business-Layer.aspx

  • kjc-1057161 (7/20/2010)


    Btw, the bits I've trimmed out from the first time around are to do with when the interest calculated is < 1 - in this case, a bit more precision is kept, rather than no interest at all being added...

    I'm sure this method has saved literally pounds over time.. and after all, everyone knows that developer time comes for free! 😉

    It's very odd that you calculate compound interest in a way that gives customers a nice little break on interest. Must be one of the nicer financial institutions around. I'd have to play around with this one a bit to convert it because of the nonstandard calculation involved.

  • kjc-1057161 (7/20/2010)


    Thanks for both responses! The problem isn't the calculation of compound interest, but the rounding issue, hence all the floor() functions. Each day, the interest is calculated only in whole pence (rounded down). I need not add that this is inherited code!

    I've trimmed some of the extraneous bits to leave just the basics - I've also added an @AmountSimple variable to calculate normal compound interest alongside, so you can see the difference it makes. If this was what was required, I'm sure I could rewrite with a mathematical calculation (thanks for the Wikipedia link!), but I'm pretty sure the rounding throws this out.

    declare
        @DailyRate float
        ,@Amount money
        ,@AmountSimple money
        ,@Period int
        ,@Slice bit
        ,@DailyInterest float
        ,@SimpleInterest float

    select @Amount = 1000 * 100 --give amount in pence, so that floor can be used
    select @AmountSimple = @Amount
    select @Slice = 0
    select @Period = 500
    select @DailyRate = 0.003 / 100 --fairly sure this is converting % to decimal

    while (@Period > 0)
    begin
        select
            @Period = @Period - 1
            ,@DailyInterest = @Amount * @DailyRate
        --for simple interest, use daily interest
        --for our calculation, round the interest DOWN to the nearest penny
        select @SimpleInterest = @DailyInterest
        select @DailyInterest = floor(@DailyInterest)
        --increment both amounts by the relevant interest
        select @Amount = @Amount + @DailyInterest
        select @AmountSimple = @AmountSimple + @SimpleInterest
    end

    --round both amounts down and present in pounds
    select @Amount = floor(@Amount) / 100
    select @AmountSimple = floor(@AmountSimple) / 100
    select @Amount, @AmountSimple
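    [Editor's note] As a cross-check on the figures the posted T-SQL should produce, here is the same day-by-day logic re-run outside SQL Server. This is a Python sketch, not part of the original thread; exact Fraction arithmetic stands in for FLOAT so the floor comparisons are deterministic. Note that, as in the posted code, @SimpleInterest is computed on the floored balance, not on @AmountSimple itself.

    ```python
    from fractions import Fraction
    from math import floor

    # Mirror the posted T-SQL: 0.003% daily rate, 500 days, amounts held in
    # pence so the interest can be floored to whole pence each day.
    rate = Fraction(3, 100000)        # 0.003 / 100
    amount = Fraction(1000 * 100)     # GBP 1000 expressed in pence
    amount_simple = amount

    for _ in range(500):
        interest = amount * rate      # interest on the floored balance, as posted
        amount_simple += interest     # added unfloored ("simple" comparison amount)
        amount += floor(interest)     # rounded DOWN to the nearest penny

    print(floor(amount))         # 101500 pence with the daily floor
    print(floor(amount_simple))  # 101511 pence without it
    ```

    With these inputs the daily floor leaves the balance at 101500 pence after 500 days, versus 101511 pence unfloored - about 11p per GBP 1000, which rather supports the "saved literally pounds" quip.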

    No loops are needed for such a calculation... just a Tally table...

    DECLARE @DailyRate FLOAT,
            @Amount MONEY,
            @AmountSimple MONEY,
            @Period INT,
            @slice BIT,
            @DailyInterest FLOAT,
            @SimpleInterest FLOAT

    SELECT @Amount = 1000 * 100, --give amount in pence, so that floor can be used
           @AmountSimple = @Amount,
           @slice = 0,
           @Period = 500,
           @DailyRate = 0.003 / 100 --fairly sure this is converting % to decimal

    SELECT @Period = @Period - 1,
           @DailyInterest = @Amount * @DailyRate,
           -- for simple interest, use daily interest
           -- for our calculation, round the interest DOWN to the nearest penny
           @SimpleInterest = @DailyInterest,
           @DailyInterest = floor(@DailyInterest),
           -- increment both amounts by the relevant interest
           @Amount = @Amount + @DailyInterest,
           @AmountSimple = @AmountSimple + @SimpleInterest
      FROM dbo.Tally t
     WHERE t.N BETWEEN 1 AND @Period

    --round both amounts down and present in pounds
    SELECT @Amount = floor(@Amount) / 100,
           @AmountSimple = floor(@AmountSimple) / 100

    SELECT @Amount, @AmountSimple

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'll have to look a little closer at that one tomorrow. Is multirow variable aggregation as unsafe as multirow string concatenation?

    Thanks Jeff! I've read your Tally tables articles before, but was having a complete blind spot as to how it applied here to this problem. I'll run some checks on the outputs there, then I'll have to do some work on how it fits into the rest of the procedure (looping through accounts, and looping through periods), but this has been an enormous help!

  • Mike C (7/20/2010)


    I'll have to look a little closer at that one tomorrow. Is multirow variable aggregation as unsafe as multirow string concatenation?

    Gosh, Mike, I'm not sure why people think that. I understand the "slowness" of multirow string concatenation but "unsafe"? Same goes for multirow variable aggregation. I'm just not sure why people feel that way. I've seen people intentionally break such things in the "quirky update" (for example) but not for straight forward calculations not hell bent on intentional breaks.

    --Jeff Moden



  • kjc-1057161 (7/21/2010)


    Thanks Jeff! I've read your Tally tables articles before, but was having a complete blind spot as to how it applied here to this problem. I'll run some checks on the outputs there, then I'll have to do some work on how it fits into the rest of the procedure (looping through accounts, and looping through periods), but this has been an enormous help!

    Thanks kjc... you might want to read the article on "Running Totals", as well. It may help you avoid more loops in a very high speed fashion. It's a long article for a short subject with lots of proofs. The important part is: if you don't follow the rules exactly, you WILL get incorrect answers. If you can't follow the rules for some reason, then you'll probably need to stick to some form of loop. Here's the link...

    http://www.sqlservercentral.com/articles/T-SQL/68467/

    --Jeff Moden



  • Jeff Moden (7/21/2010)


    Mike C (7/20/2010)


    I'll have to look a little closer at that one tomorrow. Is multirow variable aggregation as unsafe as multirow string concatenation?

    Gosh, Mike, I'm not sure why people think that. I understand the "slowness" of multirow string concatenation but "unsafe"? Same goes for multirow variable aggregation. I'm just not sure why people feel that way. I've seen people intentionally break such things in the "quirky update" (for example) but not for straight forward calculations not hell bent on intentional breaks.

    Well... I've seen multirow string concatenations return completely incorrect results before based on simple things (like table indexes) at the whim of the optimizer. No need to intentionally try to break it, just an honest index change on a table to try to optimize for a particular query and suddenly the results are completely wrong. I believe Itzik posted a very simple example a while ago.

    The main problem I have with the quirky update and multirow variable aggs is when the calculation depends on specific ordering, since ordering just isn't guaranteed... especially (but not limited to) when we throw parallel processing into the mix. That's why I wanted to look at your example a little deeper, to see if it depends on ordering of inputs to generate the correct output. It may not, in which case it may be perfectly safe without throwing itself on the mercy of the optimizer 🙂

    In all fairness, in your article on Running Totals you list several rules just to keep the quirky update from breaking; it seems you have to be hell bent on avoiding unintentional breaks to make it work correctly 🙂 Check out Itzik's article (I'll see if I can find a link), you may think another new rule is in order 🙂

  • Jeff Moden (7/21/2010)


    Mike C (7/20/2010)


    I'll have to look a little closer at that one tomorrow. Is multirow variable aggregation as unsafe as multirow string concatenation?

    Gosh, Mike, I'm not sure why people think that. I understand the "slowness" of multirow string concatenation but "unsafe"? Same goes for multirow variable aggregation. I'm just not sure why people feel that way. I've seen people intentionally break such things in the "quirky update" (for example) but not for straight forward calculations not hell bent on intentional breaks.

    I took a peek at your code, and it doesn't seem to rely on a specific ordering so I think you're definitely good in that regard 🙂

    Here's Itzik's article on string concatenation using multirow variable assignment: http://www.sqlmag.com/article/sql-server/multi-row-variable-assignment-and-order-by.aspx. I actually ran into this issue at a recent client site before running across Itzik's article and finding the root cause.
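    [Editor's note] The distinction Mike draws here can be illustrated outside SQL Server. The sketch below is a Python analogy, not T-SQL: an additive aggregate over exact integers gives the same result whatever order the rows arrive in, while string concatenation does not - which is why the ordering guarantees matter for one technique and not the other.

    ```python
    # Rows as (string, value) pairs, then the same rows in a different order,
    # standing in for the optimizer feeding rows to the query in any order it likes.
    rows = [("a", 1), ("b", 2), ("c", 3)]
    reordered = [rows[2], rows[0], rows[1]]

    sum_fwd = sum(v for _, v in rows)
    sum_re = sum(v for _, v in reordered)

    concat_fwd = "".join(s for s, _ in rows)
    concat_re = "".join(s for s, _ in reordered)

    print(sum_fwd == sum_re)        # True: addition is order-insensitive
    print(concat_fwd == concat_re)  # False: concatenation is not
    ```

    One caveat: floating-point addition is only approximately order-insensitive, so the interest totals above could drift in the last digits under reordering even though the aggregate is logically commutative.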

  • Mike C (7/21/2010)


    Jeff Moden (7/21/2010)


    Mike C (7/20/2010)


    I'll have to look a little closer at that one tomorrow. Is multirow variable aggregation as unsafe as multirow string concatenation?

    Gosh, Mike, I'm not sure why people think that. I understand the "slowness" of multirow string concatenation but "unsafe"? Same goes for multirow variable aggregation. I'm just not sure why people feel that way. I've seen people intentionally break such things in the "quirky update" (for example) but not for straight forward calculations not hell bent on intentional breaks.

    I took a peek at your code, and it doesn't seem to rely on a specific ordering so I think you're definitely good in that regard 🙂

    Here's Itzik's article on string concatenation using multirow variable assignment: http://www.sqlmag.com/article/sql-server/multi-row-variable-assignment-and-order-by.aspx. I actually ran into this issue at a recent client site before running across Itzik's article and finding the root cause.

    That article you linked does not make a good case, as it is clearly not applying the technique properly. If I modify the test code to force no parallelism and force the use of the clustered index, his code works as "expected" again.

    DECLARE @s AS VARCHAR(MAX);
    SET @s = '';

    SELECT @s = @s + col2 + ';'
    FROM dbo.T1 WITH ( INDEX(1) )
    ORDER BY col1
    OPTION ( MAXDOP 1 );

    SELECT @s;
    GO

    I do share, however, some of the misgivings about relying on undocumented behavior. I make an exception for proven behavior where the code is explicit enough about the way processing needs to take place. Table hints are an example of this: they take optimizer randomness out of the equation, which in some cases is desired.

    And while forcing an index does not necessarily mandate the order in which it is traversed, I have never spotted it being traversed in reverse. Even if this were possible, the order by clause would surely persuade the optimizer to traverse the forced index in the intended way.

    This does not mean, however, that I would be using such techniques wherever I can for the sake of using them. I would use them merely in special cases where the gain warrants the "risk" of them breaking in a later incarnation of SQL Server. And to some extent, even documented behaviour carries this same risk! The difference being that you might spot a change in a documented feature sooner, as it will be advertised.

  • Mike C (7/21/2010)


    Jeff Moden (7/21/2010)


    Mike C (7/20/2010)


    I'll have to look a little closer at that one tomorrow. Is multirow variable aggregation as unsafe as multirow string concatenation?

    Gosh, Mike, I'm not sure why people think that. I understand the "slowness" of multirow string concatenation but "unsafe"? Same goes for multirow variable aggregation. I'm just not sure why people feel that way. I've seen people intentionally break such things in the "quirky update" (for example) but not for straight forward calculations not hell bent on intentional breaks.

    Well... I've seen multirow string concatenations return completely incorrect results before based on simple things (like table indexes) at the whim of the optimizer. No need to intentionally try to break it, just an honest index change on a table to try to optimize for a particular query and suddenly the results are completely wrong. I believe Itzik posted a very simple example a while ago.

    The main problem I have with the quirky update and multirow variable aggs is when the calculation depends on specific ordering, since ordering just isn't guaranteed... especially (but not limited to) when we throw parallel processing into the mix. That's why I wanted to look at your example a little deeper, to see if it depends on ordering of inputs to generate the correct output. It may not, in which case it may be perfectly safe without throwing itself on the mercy of the optimizer 🙂

    In all fairness, in your article on Running Totals you list several rules just to keep the quirky update from breaking; it seems you have to be hell bent on avoiding unintentional breaks to make it work correctly 🙂 Check out Itzik's article (I'll see if I can find a link), you may think another new rule is in order 🙂

    Yep.... kinda like a cursor or any other code. If you write it wrong, you get incorrect results. 😉 The rules are pretty simple to follow, and I did include verification code, the combination of which is still faster than a cursor. The key is: if you don't trust it, don't use it. I've been using it for many years and it's never gone haywire. If you follow the rules, it won't.

    --Jeff Moden



  • peter-757102 (7/22/2010)


    Mike C (7/21/2010)


    Jeff Moden (7/21/2010)


    Mike C (7/20/2010)


    I'll have to look a little closer at that one tomorrow. Is multirow variable aggregation as unsafe as multirow string concatenation?

    Gosh, Mike, I'm not sure why people think that. I understand the "slowness" of multirow string concatenation but "unsafe"? Same goes for multirow variable aggregation. I'm just not sure why people feel that way. I've seen people intentionally break such things in the "quirky update" (for example) but not for straight forward calculations not hell bent on intentional breaks.

    I took a peek at your code, and it doesn't seem to rely on a specific ordering so I think you're definitely good in that regard 🙂

    Here's Itzik's article on string concatenation using multirow variable assignment: http://www.sqlmag.com/article/sql-server/multi-row-variable-assignment-and-order-by.aspx. I actually ran into this issue at a recent client site before running across Itzik's article and finding the root cause.

    That article you linked does not make a good case, as it is clearly not applying the technique properly. If I modify the test code to force no parallelism and force the use of the clustered index, his code works as "expected" again.

    DECLARE @s AS VARCHAR(MAX);
    SET @s = '';

    SELECT @s = @s + col2 + ';'
    FROM dbo.T1 WITH ( INDEX(1) )
    ORDER BY col1
    OPTION ( MAXDOP 1 );

    SELECT @s;
    GO

    I do share, however, some of the misgivings about relying on undocumented behavior. I make an exception for proven behavior where the code is explicit enough about the way processing needs to take place. Table hints are an example of this: they take optimizer randomness out of the equation, which in some cases is desired.

    And while forcing an index does not necessarily mandate the order in which it is traversed, I have never spotted it being traversed in reverse. Even if this were possible, the order by clause would surely persuade the optimizer to traverse the forced index in the intended way.

    This does not mean, however, that I would be using such techniques wherever I can for the sake of using them. I would use them merely in special cases where the gain warrants the "risk" of them breaking in a later incarnation of SQL Server. And to some extent, even documented behaviour carries this same risk! The difference being that you might spot a change in a documented feature sooner, as it will be advertised.

    Hi Peter,

    I've not tried this particular example but I'm pretty sure that you don't actually need the index hint when you use an ORDER BY to get the correct order. The MAXDOP 1 is the important part.

    --Jeff Moden



  • peter-757102 (7/22/2010)


    Mike C (7/21/2010)


    Jeff Moden (7/21/2010)


    Mike C (7/20/2010)


    I'll have to look a little closer at that one tomorrow. Is multirow variable aggregation as unsafe as multirow string concatenation?

    Gosh, Mike, I'm not sure why people think that. I understand the "slowness" of multirow string concatenation but "unsafe"? Same goes for multirow variable aggregation. I'm just not sure why people feel that way. I've seen people intentionally break such things in the "quirky update" (for example) but not for straight forward calculations not hell bent on intentional breaks.

    I took a peek at your code, and it doesn't seem to rely on a specific ordering so I think you're definitely good in that regard 🙂

    Here's Itzik's article on string concatenation using multirow variable assignment: http://www.sqlmag.com/article/sql-server/multi-row-variable-assignment-and-order-by.aspx. I actually ran into this issue at a recent client site before running across Itzik's article and finding the root cause.

    That article you linked does not make a good case, as it is clearly not applying the technique properly. If I modify the test code to force no parallelism and force the use of the clustered index, his code works as "expected" again.

    DECLARE @s AS VARCHAR(MAX);
    SET @s = '';

    SELECT @s = @s + col2 + ';'
    FROM dbo.T1 WITH ( INDEX(1) )
    ORDER BY col1
    OPTION ( MAXDOP 1 );

    SELECT @s;
    GO

    The technique is still relying on the mercy of the optimizer. Of course you could throw maxdop 1 on these quirky updates every time, or you could just set up a dedicated single processor server to handle this one task. You're still relying on the optimizer to choose the clustered index which may not be optimal--and may not even exist! Correct results should not rely on the mercy of the optimizer, or on forcing the optimizer to do sub-optimal work. That's actually the point of not using cursors and loops, so why replace one method of forcing suboptimal performance with another?

    I do share, however, some of the misgivings about relying on undocumented behavior. I make an exception for proven behavior where the code is explicit enough about the way processing needs to take place. Table hints are an example of this: they take optimizer randomness out of the equation, which in some cases is desired.

    And while forcing an index does not necessarily mandate the order in which it is traversed, I have never spotted it being traversed in reverse. Even if this were possible, the order by clause would surely persuade the optimizer to traverse the forced index in the intended way.

    Persuade is the key word here. Conditions today may cause the expected result to be returned; however, a change in conditions tomorrow could dramatically change the plan costing and affect the result. Plan cost should only affect the efficiency of retrieving the results, it shouldn't change the result itself.

    This does not mean, however, that I would be using such techniques wherever I can for the sake of using them. I would use them merely in special cases where the gain warrants the "risk" of them breaking in a later incarnation of SQL Server. And to some extent, even documented behaviour carries this same risk! The difference being that you might spot a change in a documented feature sooner, as it will be advertised.

    In risk management there is always some level of risk. Your risk of getting eaten by sharks is much lower if you don't swim in the shark tank.

  • Jeff Moden (7/22/2010)


    Peter,

    I've not tried this particular example but I'm pretty sure that you don't actually need the index hint when you use an ORDER BY to get the correct order. The MAXDOP 1 is the important part.

    Check the link, Jeff; you'll see that it is required to make it function as expected, as the data is retrieved from a non-clustered covering index instead of the clustered key. In itself not a problem, were it not that the concatenation is processed BEFORE the ORDER BY takes effect, resulting in just one tiny string instead of a concatenation of all the strings.

  • Mike C,

    I understand your point, and I hope you can see mine too. In fact I have never used the quirky update anywhere, as I never needed it. But more demanding cases might, and as such I don't write it off; it just does not have my preference.

    Read the following and you will realize that the quirky update is the last thing you should worry about!

    Recently I "discovered" some pretty insane behavior in what seems like perfectly fine code. This ** might ** not be new to you, but I was really surprised/annoyed. It was a case where expressions in the select clause were executed before all conditions in a matching where clause were met.

    I always used to assume there were stages of processing, where filtering was performed first to reduce the size of the working set to a minimum, as this makes perfect sense. It turns out this is not always the case, and operations in the select clause can error on data that functionally should never get processed, due to filtering in the where clause. In such cases you need to write protection code (case when ... else ... end) in the select clause, duplicating the relevant conditions from the where clause!!! Derived tables and other fancy constructions will not work, and you never know beforehand when such a case will turn up!

    Something as simple as:

    select left( 'abc', T1.v )
    from T1
    where T1.v >= 0;

    Might go wrong when T1.v can be a negative number, so you should instead write the query as:

    select case when T1.v < 0 then null else left( 'abc', T1.v ) end
    from T1
    where T1.v >= 0;
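    [Editor's note] The failure mode Peter describes can be mimicked outside SQL. The sketch below is a Python analogy, not T-SQL: a hypothetical left() helper stands in for T-SQL's LEFT(), which errors on a negative length. Evaluating the projection before the filter blows up, and the CASE-style guard protects the expression itself.

    ```python
    def left(s, n):
        # Stand-in for T-SQL LEFT(): errors on a negative length argument.
        if n < 0:
            raise ValueError("invalid length parameter")
        return s[:n]

    rows = [3, 1, -2]  # -2 should be excluded by the WHERE clause

    # Filter-then-project: what the query appears to say, and it works fine.
    safe = [left("abc", v) for v in rows if v >= 0]

    # Project-over-all-rows: what the engine may effectively do; it fails on -2.
    try:
        [left("abc", v) for v in rows]
        blew_up = False
    except ValueError:
        blew_up = True

    # Peter's guard: the CASE WHEN protection written into the projection itself.
    guarded = [None if v < 0 else left("abc", v) for v in rows]
    filtered = [g for v, g in zip(rows, guarded) if v >= 0]

    print(safe)      # ['abc', 'a']
    print(blew_up)   # True
    print(filtered)  # ['abc', 'a']
    ```

    The point of the analogy: the guarded projection is safe regardless of whether filtering happens before or after the expression is evaluated.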

    I find this sort of thing many times more disturbing/dangerous than using a quirky update deliberately, with full knowledge and documentation. Why? Because I suspect hardly anyone knows they have to protect their expressions to such a silly extent, and the world therefore must be full of non-working code. In fact I expect ** nearly all ** string processing code in use today to be sensitive to this problem. I knew expression order in the where clause sometimes must be enforced, but never that this problem extended beyond the where clause and into the select clause. Nor have I ever read of this problem anywhere!

    I find this a total mess and hope it is just an SQL Server 2008 implementation bug that can be fixed and not standard behavior. Because if it is standard, I think the people responsible for setting the standard are in dire need of some practical re-education. And I express myself quite mildly here.

Viewing 15 posts - 271 through 285 (of 316 total)
