November 13, 2009 at 1:16 am
Jeff Moden (11/12/2009)
Ok, Martin... I'm confused... I see that all of your code examples update the "RunningBalance" column... none of the code updates the "RunningTotal" column, yet your complaint is that the "RunningTotal" column goes astray. What is it that updates the "RunningTotal" column? And, by the way, it's the "RunningTotal" column that went astray... not the "RunningBalance" column that (according to your code) the "Quirky Update" correctly calculated. 😉
Jeff, I'm not sure what you mean here.
This is the code I posted and am complaining about. Nowhere does it mention RunningBalance.
The running balance column already existed in my table, and if it's wrong there are going to be some very unhappy customers out there.
SET NOCOUNT ON
--===== Declare the working variables
DECLARE @PrevAccountID INT
DECLARE @AccountRunningTotal DECIMAL(14, 3)
--
--===== Update the running total for this row using the "Quirky Update"
--      and a "Pseudo-cursor"
UPDATE dbo.Ledger
   SET @AccountRunningTotal = RunningTotal =
           CASE
               WHEN AccountID = @PrevAccountID
               THEN CASE
                        WHEN @AccountRunningTotal > 0
                        THEN @AccountRunningTotal + (Credit - Debit)
                        ELSE @AccountRunningTotal - (Debit - Credit)
                    END
               ELSE (Credit - Debit)
           END,
       @PrevAccountID = AccountID
  FROM dbo.Ledger WITH (TABLOCKX)
 WHERE LedgerType = 'Account'
OPTION (MAXDOP 1)
November 13, 2009 at 5:58 am
Dang it... you're right, Martin. Dyslexia due to lack of coffee must have kicked in. My apologies. I'll check it out.
Or maybe it's the edit you did yesterday? I clearly remember it being as I stated. Either way, though, I'll check it out.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 13, 2009 at 7:12 am
Jeff Moden (11/13/2009)
Dang it... you're right, Martin. Dyslexia due to lack of coffee must have kicked in. My apologies. I'll check it out. Or maybe it's the edit you did yesterday? I clearly remember it being as I stated. Either way, though, I'll check it out.
Sorry Jeff, the edit was just spelling! Too many late nights, methinks 🙂
November 13, 2009 at 11:17 am
Jeff,
Terrific article. I haven't used Quirky Update for a few years, and that's a good thing because I didn't know all of the rules.
I don't want to stir up a controversy, so I'll word this carefully. I have never noticed a temp table or table variable that didn't return its data in the same order it was added. Do I just need to pay more attention, or is this another Undocumented Feature? ;-)
Thanks for 22 pages of useful information.
November 13, 2009 at 12:36 pm
Tom Garth (11/13/2009)
Jeff, I don't want to stir up a controversy, so I'll word this carefully. I have never noticed a temp table or table variable that didn't return its data in the same order it was added.
You can modify the following example to use a temporary table and see for yourself:
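A minimal sketch of the idea (the table and data here are illustrative): with a clustered primary key, a bare SELECT typically follows the index rather than the insertion order, and even then the order is not guaranteed without an ORDER BY.
-- Illustrative sketch: rows are added as 3, 1, 2
DECLARE @T TABLE (ID INT PRIMARY KEY, Val CHAR(1));

INSERT INTO @T (ID, Val) VALUES (3, 'c');
INSERT INTO @T (ID, Val) VALUES (1, 'a');
INSERT INTO @T (ID, Val) VALUES (2, 'b');

-- This will typically come back as 1, 2, 3 (following the clustered
-- index), not the insertion order -- and without an ORDER BY the
-- order is undefined either way:
SELECT ID, Val FROM @T;

-- Only an explicit ORDER BY guarantees the order:
SELECT ID, Val FROM @T ORDER BY ID;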
November 13, 2009 at 1:12 pm
Tom Garth (11/13/2009)
Jeff, Terrific article. I haven't used Quirky Update for a few years, and that's a good thing because I didn't know all of the rules.
I don't want to stir up a controversy, so I'll word this carefully. I have never noticed a temp table or table variable that didn't return its data in the same order it was added. Do I just need to pay more attention, or is this another Undocumented Feature? ;-)
Thanks for 22 pages of useful information.
Thanks, Tom. And Alexander is correct. When it comes to SELECTs, you really need to use ORDER BY to guarantee the order.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 13, 2009 at 11:31 pm
Make your voice heard
Vote on Connect here
http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=254388
N 56°04'39.16"
E 12°55'05.25"
November 14, 2009 at 5:07 am
Joe Celko (11/13/2009)
I have been on the road speaking at PASS and SQL Connections for two weeks. One of the topics getting a lot of play in both places was finishing the OVER() clause to include the RANGE sub-clause. I am offering to help MS with this as an outside consultant as part of my follow-up. The syntax is a bit verbose, but readable: [ROWS | RANGE] <range expression> says how many rows before and after the current row the aggregate function applies to. The keywords are:
BETWEEN
PRECEDING
FOLLOWING
CURRENT
UNBOUNDED | number of rows
so the classic running total is
SUM (something_amt)
OVER (PARTITION BY floob_grps
ORDER BY something_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
Since it is one optimized statement, it will be faster than cursors for sure, and probably beat even "diseased mutant SQL" proprietary extensions.
It looks a bit complex, but I like the idea 🙂
Similarly, I would like to see something like this possible in constraints as well (vertical constraints, if you will). Setting check constraints relative to a preceding row is something quite often wanted, and the knowledge encapsulated in them can be used by the optimizer to do neat optimizations during selects. It would also remove the need for multi-statement trigger logic in quite a few situations and make the modeling more declarative (the rules are certainly more transparent than with triggers).
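For concreteness, a running total in the proposed syntax against the dbo.TransactionDetail table used elsewhere in this thread might read as follows (hypothetical: ROWS BETWEEN is the proposed syntax above and is not supported by SQL Server at the time of writing):
-- Hypothetical sketch of the proposed windowed running total:
SELECT AccountID, Date, TransactionDetailID, Amount,
       SUM(Amount) OVER (PARTITION BY AccountID
                         ORDER BY Date, TransactionDetailID
                         ROWS BETWEEN UNBOUNDED PRECEDING
                                  AND CURRENT ROW) AS RunningTotal
FROM dbo.TransactionDetail;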
November 14, 2009 at 6:22 am
Really interesting discussion guys. I do have one complaint about Hugo's alternative method, though.
Here is the relevant bit from Hugo's code:
INSERT INTO @Results(AccountID, Date, TransactionDetailID, Amount,
RunningTotal, Rnk)
SELECT AccountID, Date, TransactionDetailID, Amount, Amount,
RANK() OVER (PARTITION BY AccountID
ORDER BY Date,
TransactionDetailID)
FROM dbo.TransactionDetail;
The insert into the temp table uses the RANK() function, which the while loop then compares against an incrementing counter. Assuming we can guarantee a unique combination of Date and TransactionDetailID within each AccountID partition, this works as expected. Indeed, the data in this proof-of-concept example is such that there are no problems.
When this is translated to a real-world situation, though, I would strongly encourage ROW_NUMBER() over RANK() for the following reason (from the Books Online entry for RANK):
Remarks
If two or more rows tie for a rank, each tied row receives the same rank. For example, if the top two salespeople have the same SalesYTD value, they are both ranked one. The salesperson with the next highest SalesYTD is ranked number three, because there are two rows that are ranked higher. Therefore, the RANK function does not always return consecutive integers.
Hugo's method depends on unique, consecutive integers, and it will fail if an implementation uses a non-unique ORDER BY clause. ROW_NUMBER() solves this problem with a minimum of fuss.
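A sketch of the same insert with ROW_NUMBER() swapped in for RANK(), everything else unchanged from Hugo's code above:
INSERT INTO @Results (AccountID, Date, TransactionDetailID, Amount,
                      RunningTotal, Rnk)
SELECT AccountID, Date, TransactionDetailID, Amount, Amount,
       -- ROW_NUMBER() never repeats or skips a value within a partition,
       -- so the while loop's counter always matches exactly one row
       ROW_NUMBER() OVER (PARTITION BY AccountID
                          ORDER BY Date, TransactionDetailID)
FROM dbo.TransactionDetail;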
November 14, 2009 at 7:28 am
sqlservercentralitgeekry (11/14/2009)
Hugo's method depends on unique, consecutive integers, and it will fail if an implementation uses a non-unique ORDER BY clause. ROW_NUMBER() solves this problem with a minimum of fuss.
The order in which numbers are assigned to rows by ROW_NUMBER is not deterministic if there are ties, so it's hard to see how that helps at all.
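A small illustration of the point (data is illustrative): when two rows tie on the ORDER BY key, ROW_NUMBER() must break the tie, but which row gets which number is up to the engine and can differ between plans or runs.
DECLARE @Tie TABLE (AccountID INT, Date DATETIME, Amount MONEY);

INSERT INTO @Tie VALUES (1, '20091101', 10.00);
INSERT INTO @Tie VALUES (1, '20091101', 20.00); -- same Date: a tie

-- Both rows tie on Date; ROW_NUMBER() assigns 1 and 2, but the engine
-- is free to give either row the 1 -- the numbering is not deterministic.
SELECT AccountID, Date, Amount,
       ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY Date) AS rn
FROM @Tie;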
Paul
November 14, 2009 at 12:16 pm
Joe Celko (11/14/2009)
I have never noticed a temp table or table variable that didn't return its data in the same order it was added. Do I just need to pay more attention, or is this another Undocumented Feature? ;-)
Actually, you need to read the standards. Were you around when GROUP BY stopped doing a sort and ORDER BY became mandatory? Or when BIT became a numeric data type and was suddenly NULL-able? Suddenly semi-colons have become "best practices" and not "Celko's way" for SQL Server, etc.
A professional always writes to Standards so the code will port to other SQLs and to future releases of the same SQL product. I make a lot of my money from "Cowboy Coders" who boxed their clients in with proprietary stuff. When I got home, I re-connected with Dave Winters, an old friend from Georgia Tech decades ago. He wrote a column on SQL Server after we left GT and now consults. Same story -- Cowboy Coders and dialect = regular employment fixing their messes afterward.
Please, get off the "Code only to SQL Standards Soap Box."
I work in an MS SQL Server shop, and I will write the best, most scalable, and most "performant" code I can. If that means using proprietary extensions, then that is what I'll do, and that is what my employer expects of me.
If we move to another DBMS, we'll cross that bridge when we come to it and rewrite code as needed. I don't see that happening any time soon, however, as we are experiencing budgetary issues and ORACLE is just not going to be in our future. I also don't see us going to an open-source DBMS.
November 14, 2009 at 1:22 pm
Joe Celko (11/13/2009)
I have been on the road speaking at PASS and SQL Connections for two weeks. One of the topics getting a lot of play in both places was finishing the OVER() clause to include the RANGE sub-clause. I am offering to help MS with this as an outside consultant as part of my follow-up. The syntax is a bit verbose, but readable: [ROWS | RANGE] <range expression> says how many rows before and after the current row the aggregate function applies to. The keywords are:
BETWEEN
PRECEDING
FOLLOWING
CURRENT
UNBOUNDED | number of rows
so the classic running total is
SUM (something_amt)
OVER (PARTITION BY floob_grps
ORDER BY something_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
Since it is one optimized statement, it will be faster than cursors for sure, and probably beat even "diseased mutant SQL" proprietary extensions.
Interesting. On which RDBMS did you benchmark and conclude that one optimized statement "will be faster than cursors for sure"?
In general this is simply not true - cursors can be, and sometimes are, implemented faster than OLAP functions. Some time ago I spent considerable time optimizing such problems on Oracle 9, where cursors were several times faster than OLAP functions. We need to be very careful making blanket statements...
November 14, 2009 at 1:26 pm
Hi Joe
Joe Celko (11/14/2009)
A professional always writes to Standards so the code will port to other SQLs and to the future releases of the same SQL product.
In my opinion, that's not correct. I'd say: "A professional always gets things working."
I know many people who tried to use only the standard SQL features which are (hopefully) supported in every DBMS. They hit the wall as soon as the load increases. A wise man once told me, "Buying a DBMS like SQL Server (or Oracle or PostgreSQL or ...) and using only the standard features is like buying a scientific calculator and using just plus and minus" ;-).
Greets
Flo
November 14, 2009 at 2:06 pm
Similarly, I would like to see something like this possible in constraints as well (vertical constraints, if you will). Setting check constraints relative to a preceding row is something quite often wanted, and the knowledge encapsulated in them can be used by the optimizer to do neat optimizations during selects. It would also remove the need for multi-statement trigger logic in quite a few situations and make the modeling more declarative (the rules are certainly more transparent than with triggers).
Same here - sometimes there are business rules such as "do not change running totals by more than 20% at a time". For that, I use denormalization and CHECK constraints:
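A minimal sketch of that approach (the table, names, and the 20% rule here are illustrative): the previous row's running total is denormalized into each row, so a plain CHECK constraint can enforce the rule declaratively.
-- Illustrative sketch: PrevRunningTotal is maintained by the
-- insert/update code, so a plain CHECK constraint can enforce
-- the "no more than 20% change" rule.
CREATE TABLE dbo.AccountBalance
(
    AccountID        INT           NOT NULL,
    EntryDate        DATETIME      NOT NULL,
    PrevRunningTotal DECIMAL(14,3) NOT NULL,
    RunningTotal     DECIMAL(14,3) NOT NULL,
    CONSTRAINT PK_AccountBalance PRIMARY KEY (AccountID, EntryDate),
    -- Reject any change of more than 20% relative to the previous total
    -- (a zero previous total would need special handling in practice)
    CONSTRAINT CK_Max20PercentChange
        CHECK (ABS(RunningTotal - PrevRunningTotal)
               <= 0.20 * ABS(PrevRunningTotal))
);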