October 2, 2009 at 9:38 am
peter-757102 (10/2/2009)
Jeff Moden (10/2/2009)
peter-757102 (10/2/2009)
Jeff Moden (10/2/2009)
That's why I shifted to All_Columns for future code. Rumor has it that syscolumns is going away and Master.sys.All_Columns always has at least 4k rows on a full install. Also, what is "system_columns"? I get an invalid object error when I try to do a simple count of that from Master.
For the same reason tally stuff should be encapsulated within a function whenever possible. Modifying the function to use another source of records is then all that is needed if the existing source becomes unavailable.
Or a passthrough view. Or, better yet, a real Tally table that you can cross-join if you need something bigger.
Not sure what you are pointing out here Jeff. My point was that when you wrap whatever record source in an inline function you can do all that as the function shell can be optimised away in most code. Even with a dedicated numbers table I think a function should be used to warp access to that table.
Warp being the ultimate word there. If you have a static Tally table, why on earth would you wrap a function round it?
Would you want to wrap a function around every single table in your DB?
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
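For readers following along, the kind of wrapper peter is describing might look like this. This is only a sketch: it assumes a dedicated dbo.Tally table (single column N) already exists, and both object names are invented for illustration.

```sql
-- Sketch only: dbo.Tally and the function name are assumed, not from the thread.
-- Because this is an INLINE table-valued function, the optimizer can expand
-- it into the calling query, so the wrapper shell itself costs nothing.
CREATE FUNCTION dbo.GetNumbers (@MaxN BIGINT)
RETURNS TABLE
AS
RETURN
    SELECT TOP (@MaxN) t.N
      FROM dbo.Tally t
     ORDER BY t.N;
```

If the underlying row source ever changes (say, from a real table to a catalog view), only the body of this one function has to change rather than every query that consumes numbers.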
October 2, 2009 at 9:54 am
Rob Fisk (10/2/2009)
peter-757102 (10/2/2009)
Jeff Moden (10/2/2009)
peter-757102 (10/2/2009)
Jeff Moden (10/2/2009)
That's why I shifted to All_Columns for future code. Rumor has it that syscolumns is going away and Master.sys.All_Columns always has at least 4k rows on a full install. Also, what is "system_columns"? I get an invalid object error when I try to do a simple count of that from Master.
For the same reason tally stuff should be encapsulated within a function whenever possible. Modifying the function to use another source of records is then all that is needed if the existing source becomes unavailable.
Or a passthrough view. Or, better yet, a real Tally table that you can cross-join if you need something bigger.
Not sure what you are pointing out here Jeff. My point was that when you wrap whatever record source in an inline function you can do all that as the function shell can be optimised away in most code. Even with a dedicated numbers table I think a function should be used to warp access to that table.
Warp being the ultimate word there. If you have a static Tally table, why on earth would you wrap a function round it?
Would you want to wrap a function around every single table in your DB?
Not at all, it has to do with code reuse and environment changes.
Generating a number range is a generic function. Many situations, and more specifically other reusable generic functions, will depend on it. Sometimes you can use a non-dedicated table for this, or need to generate numbers on the fly due to space constraints. It all depends on your situation, and it's no fun to recode all your queries and reusable functions because they have hardcoded access to a certain number source.
October 2, 2009 at 10:19 am
peter-757102 (10/2/2009)
Jeff Moden (10/2/2009)
That's why I shifted to All_Columns for future code. Rumor has it that syscolumns is going away and Master.sys.All_Columns always has at least 4k rows on a full install. Also, what is "system_columns"? I get an invalid object error when I try to do a simple count of that from Master.
For the same reason tally stuff should be encapsulated within a function whenever possible. Modifying the function to use another source of records is then all that is needed if the existing source becomes unavailable.
I am not sure that this is good advice, as most functions would kill the performance, performance being the big reason that we use Tally tables in the first place. I have found that this is OK, but only if you design it as an In-line table-valued function. If that was what you meant, Peter, that's fine, but it should be explicit as most people who could benefit from this advice will not realize that automatically.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
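To make Barry's distinction concrete: a multi-statement TVF materializes its rows into a table variable before the caller sees them, while an inline TVF is just a named query the optimizer can fold into the calling statement. A rough sketch of an inline, on-the-fly number generator using the cascaded cross-join approach (the function name is invented; this is not claimed to be the article's code):

```sql
-- Sketch of an INLINE TVF number generator; name and row counts are illustrative.
CREATE FUNCTION dbo.fnNumbersInline (@MaxN BIGINT)
RETURNS TABLE
AS
RETURN
    WITH E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                   SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                   SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                   SELECT 1),                           -- 10 rows
         E2(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 100 rows
         E4(N) AS (SELECT 1 FROM E2 a CROSS JOIN E2 b)  -- 10,000 rows
    SELECT TOP (@MaxN)
           N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
      FROM E4 a CROSS JOIN E4 b;                        -- up to 100,000,000 rows
```

Because it is inline, the `TOP (@MaxN)` limits how far the cross joins are actually evaluated; a multi-statement version would pay the full materialization cost up front.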
October 2, 2009 at 10:44 am
RBarryYoung (10/2/2009)
peter-757102 (10/2/2009)
Jeff Moden (10/2/2009)
That's why I shifted to All_Columns for future code. Rumor has it that syscolumns is going away and Master.sys.All_Columns always has at least 4k rows on a full install. Also, what is "system_columns"? I get an invalid object error when I try to do a simple count of that from Master.
For the same reason tally stuff should be encapsulated within a function whenever possible. Modifying the function to use another source of records is then all that is needed if the existing source becomes unavailable.
I am not sure that this is good advice, as most functions would kill the performance, performance being the big reason that we use Tally tables in the first place. I have found that this is OK, but only if you design it as an In-line table-valued function. If that was what you meant, Peter, that's fine, but it should be explicit as most people who could benefit from this advice will not realize that automatically.
Actually, I am going to back off on this for the moment. I just remembered a whole bunch of performance weirdness that I saw last year wrt Tally tables and TVF's. I don't have time to look it up right now though...
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
October 2, 2009 at 10:49 am
And that is why I designed my Dynamic Tally Table as an In-line TVF versus the multi-statement TVF in Frank Solomons' article in SQL Server Magazine.
October 2, 2009 at 11:06 am
RBarryYoung (10/2/2009)
peter-757102 (10/2/2009)
Jeff Moden (10/2/2009)
That's why I shifted to All_Columns for future code. Rumor has it that syscolumns is going away and Master.sys.All_Columns always has at least 4k rows on a full install. Also, what is "system_columns"? I get an invalid object error when I try to do a simple count of that from Master.
For the same reason tally stuff should be encapsulated within a function whenever possible. Modifying the function to use another source of records is then all that is needed if the existing source becomes unavailable.
I am not sure that this is good advice, as most functions would kill the performance, performance being the big reason that we use Tally tables in the first place. I have found that this is OK, but only if you design it as an In-line table-valued function. If that was what you meant, Peter, that's fine, but it should be explicit as most people who could benefit from this advice will not realize that automatically.
You are right about that; is there anything else than inline table-valued these days? 😉
RBarryYoung (10/2/2009)
RBarryYoung (10/2/2009)
Actually, I am going to back off on this for the moment. I just remembered a whole bunch of performance weirdness that I saw last year wrt Tally tables and TVF's. I don't have time to look it up right now though...
I think I remember the article that you refer to. Was it the one about string splitting, where everyone ended up abandoning the initial TVF constraints and went in the direction of .NET and "naked" SQL statements in order to get the fastest solution? If so, that was an immense thread indeed and had plenty of SQL weirdness, not per se specific to inline TVFs. I admit that because of some of those weird things, not all solutions could be encapsulated in an inline TVF. But there were good-performing solutions that could have been.
With inline TVFs, as with any other SQL, you can run into the issue of the optimizer not finishing its optimization, as it sets a limit beforehand on how much effort it will spend before beginning execution. Thus you can end up with a half-optimized statement that does not perform as you would like. It is more of a generic optimizer problem than an inline TVF problem as I see it. We currently lack the control to force the optimizer to do a full optimization when we feel it is needed, and even then the optimizer is not perfect to begin with.
BTW, that particular thread left me with a bad feeling, as we were all reacting to SQL Server and its sometimes odd optimizing decisions. At least I ended up trying loads of weird things that I would never have conceived without roadblocks at every turn I made. Common-sense good code sometimes performed far worse than hard-to-follow and very restricted complex code. Personally, I would rather have an all-round good solution that is reusable and saves me time optimizing specific queries than make every query a turbo version that is perfectly tuned to a specific machine and SQL version but can suddenly turn awful.
People have to read my code in the future too, you know 😉 and inline table-valued functions can certainly help someone understand what happens in my code without full understanding of whatever magic is being used inside the function itself. For similar reasons I love common table expressions, as they make code much more readable and better organised.
October 2, 2009 at 11:09 am
peter-757102 (10/2/2009)
Jeff Moden (10/2/2009)
peter-757102 (10/2/2009)
Jeff Moden (10/2/2009)
That's why I shifted to All_Columns for future code. Rumor has it that syscolumns is going away and Master.sys.All_Columns always has at least 4k rows on a full install. Also, what is "system_columns"? I get an invalid object error when I try to do a simple count of that from Master.
For the same reason tally stuff should be encapsulated within a function whenever possible. Modifying the function to use another source of records is then all that is needed if the existing source becomes unavailable.
Or a passthrough view. Or, better yet, a real Tally table that you can cross-join if you need something bigger.
Not sure what you are pointing out here Jeff. My point was that when you wrap whatever record source in an inline function you can do all that as the function shell can be optimised away in most code. Even with a dedicated numbers table I think a function should be used to warp access to that table.
It didn't come out that way, but I was actually agreeing with you. Depending on what the company and resident DBA allow, an inline Tally function would certainly be a viable single source that would survive migrations from version to version by allowing a single change in the system instead of having to track down everywhere a certain table was used to provide a row source. If the resident DBA or company specs didn't allow for such a function to be created (right or wrong, many frown on any function usage), a pass through view called "RowSource" (or somesuch) could be created so that all inline code would point to a common row source for such things as on-the-fly Tally generation. That, too, would allow a single change to fix all code during a migration to a future version.
Of course, creating a Tally table called "Tally" would also allow for such a thing. Once created, no changes to any code anywhere for any reason would need to be made. If a row count larger than the Tally table were needed, a simple cross join similar to those already posted would solve the problem. The same would hold true for a function that uses a method similar to the one portrayed in this article.
--Jeff Moden
Change is inevitable... Change for the better is not.
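A rough sketch of the pass-through view Jeff describes. The view name "RowSource" comes from his post; the catalog view underneath is just one possible choice, and the cross-join fallback is shown against an assumed dbo.Tally table.

```sql
-- All inline code selects from dbo.RowSource; only this one view needs to
-- change if the underlying row source goes away in a future version.
CREATE VIEW dbo.RowSource
AS
SELECT N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
  FROM master.sys.all_columns;
```

And if a dedicated Tally table ever proves too small, the cross-join fix looks like:

```sql
-- Assumes dbo.Tally(N) exists; yields COUNT(Tally) squared rows.
SELECT N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
  FROM dbo.Tally t1
 CROSS JOIN dbo.Tally t2;
```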
October 2, 2009 at 11:37 am
Jeff Moden (10/2/2009)
It didn't come out that way, but I was actually agreeing with you. Depending on what the company and resident DBA allow, an inline Tally function would certainly be a viable single source that would survive migrations from version to version by allowing a single change in the system instead of having to track down everywhere a certain table was used to provide a row source. If the resident DBA or company specs didn't allow for such a function to be created (right or wrong, many frown on any function usage), a pass through view called "RowSource" (or somesuch) could be created so that all inline code would point to a common row source for such things as on-the-fly Tally generation. That, too, would allow a single change to fix all code during a migration to a future version.
Of course, creating a Tally table called "Tally" would also allow for such a thing. Once created, no changes to any code anywhere for any reason would need to be made. If a row count larger than the Tally table were needed, a simple cross join similar to those already posted would solve the problem. The same would hold true for a function that uses a method similar to the one portrayed in this article.
I misunderstood you then. Indeed, a view would go a long way toward accomplishing the same goals (a changeable row source) and is more conventional and trusted. It is only when parameterization is required that a function becomes more useful.
October 2, 2009 at 11:57 am
I would go for an inline table function - especially if dealing with small ranges much higher up the range (for example, checking 12-digit account numbers for sequence breaks). Also, I would say the best source of guaranteed rows on 2005+ is sys.all_parameters - that has all the params for master SPs whichever db context you're using it in. I think it's about 8000 rows...
Atlantis Interactive - SQL Server Tools
My blog
Why I wrote a sql query analyzer clone
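The "small range much higher up" case Matt mentions can be handled by offsetting a short generated sequence rather than counting all the way up from 1. A sketch, using his suggested row source; the base value is invented for illustration:

```sql
-- Generate 1000 consecutive 12-digit numbers starting at an assumed base.
DECLARE @Base  BIGINT = 100000000000,  -- first 12-digit number (illustrative)
        @Count INT    = 1000;

SELECT TOP (@Count)
       AccountNo = @Base + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
  FROM sys.all_parameters;             -- Matt's suggested guaranteed row source
```

The sequence could then be left-joined to the account table to spot the missing values in that window.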
October 2, 2009 at 12:28 pm
On my dev system, master.sys.all_parameters returns 6776 rows. In the context of the particular database where I tested it, sys.all_parameters returned 6841.
October 2, 2009 at 12:47 pm
Matt Whitfield (10/2/2009)
I would go for an inline table function - especially if dealing with small ranges much higher up the range (for example checking 12 digit account numbers for sequence breaks). Also I would say the best source of guaranteed rows on 2005+ is sys.all_parameters - that has all the params for master sp's whichever db context you're using it in. I think it's about 8000 rows...
There are much more effective methods than using a join to a number source to find gaps in a sequence. I never use a Tally table or function for that type of stuff... it's way too slow.
Thanks for the tip on the sys.all_parameters table... I'll check it out.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 2, 2009 at 12:52 pm
Jeff Moden (10/2/2009)
There are much more effective methods than using a join to a number source to find gaps in a sequence. I never use a Tally table or function for that type of stuff... it's way too slow.
Could you enlighten me please? 😀
Atlantis Interactive - SQL Server Tools
My blog
Why I wrote a sql query analyzer clone
October 2, 2009 at 12:57 pm
Lynn Pettis (10/2/2009)
On my dev system master.sys.all_parameters returns 6776 rows. In the context of the particular database I tested it, sys.all_parameters, returned 6841.
Yeah, I think that's about on the money actually. The 8000 was from one of my test DBs, which had its fair share of test objects... 😀 I didn't check it out because I was in the garden at the time...
Atlantis Interactive - SQL Server Tools
My blog
Why I wrote a sql query analyzer clone
October 2, 2009 at 1:02 pm
Sure... most of this code is just to set up the test. The last SQL paragraph is the GAP checker.
IF OBJECT_ID('tempdb..#MyTest') IS NOT NULL
DROP TABLE #MyTest
GO
--===== Create and populate a 1,000,000 row test table.
-- Column "MyID" has a range of 1 to 1,000,000 unique numbers starting at 82011000000001
-- Jeff Moden
SELECT TOP 1000000
MyID = IDENTITY(BIGINT,82011000000001,1)
INTO #MyTest
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--===== A table is not properly formed unless a Primary Key has been assigned
-- Takes about 3 seconds to execute.
ALTER TABLE #MyTest
ADD PRIMARY KEY CLUSTERED (MyID)
--===== Now, let's add a calculated column to display leading zeros in a "field" of 15 characters
-- like the original problem.
ALTER TABLE #MyTest
ADD DisplayMyID AS RIGHT('000000000000000'+CAST(MyID AS VARCHAR(15)),15)
--===== Delete some known rows to demo the gap detection code
-- This deletes 50 rows spaced 2000 apart in the given range
-- to demo small gaps
DELETE #MyTest
WHERE MyID BETWEEN 82011000500001 AND 82011000600000
AND MyID %2000 = 0
-- This deletes 100,000 rows in a given range to demo large gaps
DELETE #MyTest
WHERE MyID BETWEEN 82011000600001 AND 82011000700000
--===== Find the "gap ranges"
-- This assumes that gaps include any whole number greater than 0
SELECT GapStart = (SELECT ISNULL(MAX(b.MyID),0)+1
FROM #MyTest b
WHERE b.MyID < a.MyID),
GapEnd = MyID - 1
FROM #MyTest a
WHERE a.MyID NOT IN (SELECT MyID + 1 FROM #MyTest)
It also doesn't need a mega-row tally table.
--Jeff Moden
Change is inevitable... Change for the better is not.
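For anyone skimming, the gap logic in Jeff's last paragraph is easier to see on a toy table (the values here are invented for the demo):

```sql
-- Tiny worked example of the same gap-range logic.
CREATE TABLE #Seq (MyID BIGINT PRIMARY KEY);
INSERT INTO #Seq (MyID)
SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 7 UNION ALL
SELECT 8 UNION ALL SELECT 11;

-- A row qualifies when its predecessor (MyID - 1) is missing; the gap then
-- runs from just past the largest smaller value up to MyID - 1.
SELECT GapStart = (SELECT ISNULL(MAX(b.MyID),0)+1
                     FROM #Seq b
                    WHERE b.MyID < a.MyID),
       GapEnd   = a.MyID - 1
  FROM #Seq a
 WHERE a.MyID NOT IN (SELECT MyID + 1 FROM #Seq);
-- Returns (1,2), (5,6), (9,10): every missing whole number above zero,
-- matching the "gaps include any whole number greater than 0" assumption.

DROP TABLE #Seq;
```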
October 2, 2009 at 1:11 pm
When I was at school there used to be a game called 'Betrayal at Krondor' - where you would meet people as you travelled around. They might have a certain skill, like lockpicking, and you would pay them x amount of whatever-currency-it-was-in-the-game, then you would see a very jerky video, after which a message box would pop up saying 'your lock picking skill has just increased 12 points' - and you'd be all like WTF - but somehow you were better at lock picking.
I feel like you've just said 'your SQL skill has just increased 12 points' - and now I need to go and study that!
Thank you 😀
Atlantis Interactive - SQL Server Tools
My blog
Why I wrote a sql query analyzer clone
Viewing 15 posts - 106 through 120 (of 159 total)