November 14, 2009 at 2:07 pm
"Buying a DBMS like SQL Server (or Oracle or PostgreSQL or ...) to use only the standard features is like buying a scientific calculator and just use plus and minus" ;-).
Greets
Flo
That's a good one!
November 14, 2009 at 8:31 pm
I have to agree with Lynn and Flo on this one. As much as it might be nice to code generically so the code will perform on all DBMS's, it's not feasible. There are many enhancements unique to each that work better than the generic "Standards" code.
Why buy a product and not use the features it includes? Once all products have all the same features, with the exact same standards, then we can revisit. It's not just SQL that has it's own feature sets, Oracle has plenty of them too.
If I were to buy a Porsche with a turbo package (which when turned on, drops the body closer to the ground for better performance and lower center of gravity), but never use it - I just wasted 50K. If I am not using the feature - I am wasting money.
And really, don't we all have to fix crud code anyway - no matter what RDBMS?
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
November 15, 2009 at 10:45 am
CirquedeSQLeil (11/14/2009)
I have to agree with Lynn and Flo on this one. As much as it might be nice to code generically so the code will perform on all DBMS's, it's not feasible. There are many enhancements unique to each that work better than the generic "Standards" code.Why buy a product and not use the features it includes? Once all products have all the same features, with the exact same standards, then we can revisit. It's not just SQL that has it's own feature sets, Oracle has plenty of them too.
If I were to buy a Porsche with a turbo package (which when turned on, drops the body closer to the ground for better performance and lower center of gravity), but never use it - I just wasted 50K. If I am not using the feature - I am wasting money.
And really, don't we all have to fix crud code anyway - no matter what RDBMS?
Well said.
For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]
November 15, 2009 at 11:13 pm
Heh... write a stored procedure that takes just one silly little parameter and you suddenly have code that isn't portable between SQL Server and Oracle. True code portability is a myth.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 16, 2009 at 12:33 am
+1 for Florian's calculator analogy.
N 56°04'39.16"
E 12°55'05.25"
November 16, 2009 at 1:22 am
As I said, "a wise man told me". First person who told me this analogy was Jeff 😉
November 16, 2009 at 1:29 am
A schema-level CREATE ASSERTION would be extremely useful. I'd probably place that higher up the list than any of the new running-total-type ideas presented (including LAG/LEAD). Failing that, extending table-level constraints to work with aggregates, and reference foreign-key-related tables would be nice.
I'm a T-SQL developer, not a lowest-common-denominator SQL developer. 😛
Temporary tables may return data in an order other than in which they were inserted (this works the same with #temporary tables):
DECLARE @T TABLE (row_id BIGINT NULL, padding NCHAR(3000) NULL);
INSERT @T (row_id, padding)
SELECT TOP (10)
row_id = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
padding = REPLICATE(N'X', 3000)
FROM master.sys.allocation_units A1
OPTION (MAXDOP 1);
DELETE @T
WHERE row_id <= 5;
INSERT @T (row_id, padding)
SELECT TOP (5)
row_id = 10 + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
padding = NULL
FROM master.sys.allocation_units A1
OPTION (MAXDOP 1);
-- Rows do *not* come back in insertion order
SELECT *
FROM @T;
November 16, 2009 at 1:30 am
Peso-284236 (11/16/2009)
+1 for Florian's calculator analogy.
Ouch. That's a dreadful pun, even if unintentional!
November 16, 2009 at 2:07 am
At least it is "lowest common" 😀
N 56°04'39.16"
E 12°55'05.25"
November 16, 2009 at 6:08 am
Regarding my temp table without ORDER BY observation:
It was just that. An obversation.
I didn't say that I don't use ORDER BY.
I didn't recommend not using ORDER BY.
I was hoping to see SOME discussion, but in spite of genuine effort, I obvously didn't word it carefully enough.
The scientific calculator analogy was out of the park.
I love this forum!
November 16, 2009 at 3:13 pm
Tom Garth (11/16/2009)
I was hoping to see SOME discussion, but in spite of genuine effort, I obvously didn't word it carefully enough.
Well I tried 😀
November 16, 2009 at 4:34 pm
Paul White (11/14/2009)
sqlservercentralitgeekry (11/14/2009)
Hugo's method depends on unique, consecutive integers, and it will fail in the event that an implementation uses a non-unique order by clause. ROW_NUMBER() solves this problem with a minimum of fuss.The order in which numbers are assigned to rows by ROW_NUMBER is not deterministic if there are ties, so it's hard to see how that helps at all.
In the event of a tie, RANK will produce a non-unique, non-contiguous series, which breaks the running total values from the point of the duplication forward. While it's true that ROW_NUMBER will arbitrarily break a tie (using the physical order of rows in the table I believe), the running total calculations will remain intact across the entire series, including the rows on which the ordering criteria are non-unique. Granted, the non-unique rows may have incorrect running total values, but ROW_NUMBER preserves the integrity of the series as a whole.
One application I can think of immediately is in the algorithms used by [edit: SOME] banks to post transactions. Nightly transactions are posted by Date (without time), then by Amount DESC to maximize overdraft fees. RANK will error in the event of matching amounts, ROW_NUMBER will not. For that application, an arbitrary tie-break is acceptable, as long as an accurate running total value is preserved.
November 16, 2009 at 5:04 pm
sqlservercentralitgeekry (11/16/2009)
Granted, the non-unique rows may have incorrect running total values, but ROW_NUMBER preserves the integrity of the series as a whole.
For that application, an arbitrary tie-break is acceptable, as long as an accurate running total value is preserved.
Those two statements seem contradictory...?
The order in which row numbers are assigned (in the event of a tie) is arbitrary and non-deterministic, and since a table (set) has no concept of inherent order, 'physical order of rows in the table' is meaningless. Parallelism or a hash operator earlier in the plan are examples where the idea falls apart even if we imagine that tables do have some inherent order.
The original statement was "Hugo's method depends on unique, consecutive integers, and it will fail in the event that an implementation uses a non-unique order by clause. ROW_NUMBER() solves this problem with a minimum of fuss.". Hugo's implementation does use a deterministic order by - anything else would be daft. Why introduce this question at all?
Any robust implementation using the ranking functions should normally ensure a deterministic order, and there's really no excuse for not doing that where required. In the hypothetical banking example, one might, for example, break the tie on transaction_id.
Paul
November 16, 2009 at 7:24 pm
Paul White (11/16/2009)
sqlservercentralitgeekry (11/16/2009)
Granted, the non-unique rows may have incorrect running total values, but ROW_NUMBER preserves the integrity of the series as a whole.
For that application, an arbitrary tie-break is acceptable, as long as an accurate running total value is preserved.Those two statements seem contradictory...?
The order in which row numbers are assigned (in the event of a tie) is arbitrary and non-deterministic, and since a table (set) has no concept of inherent order, 'physical order of rows in the table' is meaningless. Parallelism or a hash operator earlier in the plan are examples where the idea falls apart even if we imagine that tables do have some inherent order.
The original statement was "Hugo's method depends on unique, consecutive integers, and it will fail in the event that an implementation uses a non-unique order by clause. ROW_NUMBER() solves this problem with a minimum of fuss.". Hugo's implementation does use a deterministic order by - anything else would be daft. Why introduce this question at all?
Any robust implementation using the ranking functions should normally ensure a deterministic order, and there's really no excuse for not doing that where required. In the hypothetical banking example, one might, for example, break the tie on transaction_id.
Paul
Not that I care to get in the middle of a good brawl, but where is the presumption that a running total MUST be deterministic come from? You can have a running total based on a partial order if the data doesn't lend itself to having a perfectly unique order. Ultimately - other than the grouping levels, any additional order by is arbitrary, just based on what might best fit the data being reported. I'd hate to be the one telling the business unit that the LOGICAL (but not perfect) order they demand isn't the one they're going to get, "just because".
As long as the running totals break according to the levels you determine, and the detail CAN be ordered in such a way that the running total can make mathematical sense - where's the problem?
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
November 16, 2009 at 7:30 pm
Matt Miller (#4) (11/16/2009)[
Not that I care to get in the middle of a good brawl, but where is the presumption that a running total MUST be deterministic come from?
I am not sure that a running total MUST be deterministic, but I have simply never seen otherwise. I worked quite a lot with cases like inventory (how much stuff do we have), or insurance (how much money counts towards the deductible), etc. In every case I have ever worked with, running totals were always deterministic.
Can you give us a real life example when we care about running totals, and they can be non-deterministic?
Viewing 15 posts - 76 through 90 (of 307 total)
You must be logged in to reply to this topic. Login to reply