February 2, 2006 at 6:16 pm
Hear, hear! I second the thought!
--Jeff Moden
Change is inevitable... Change for the better is not.
August 2, 2006 at 8:18 am
I've been using this technique to allow multiple values to be passed to a stored procedure for use within an IN search condition, and it works really nicely.
However, I'm a bit curious about the performance. I've been tuning a few queries and viewing the trace in Query Analyzer, and the step needed to get the values into a temp table seems to take up quite a high proportion of the query cost (nearly 20% of the whole procedure). Considering that this only involves a bit of string manipulation and putting up to 10 values into a temporary table, that seems high.
This can be recreated using the code below. Is this cost just part of the process of creating the temporary table?
I'm mainly surprised by the number of reads that are shown in the trace window. Are these disk reads or cache reads?
1. Create UDF used to convert delimited string to table
CREATE FUNCTION dbo.SplitInt
(
    @List varchar(8000),
    @SplitOn nvarchar(5)
)
RETURNS @RtnValue TABLE
(
    Id int IDENTITY(1,1),
    Value int
)
AS
-- Summary: returns a table with a single "Value" column containing the integer values from a delimited string
BEGIN
    -- Peel off one value at a time while a delimiter remains
    WHILE CHARINDEX(@SplitOn, @List) > 0
    BEGIN
        INSERT INTO @RtnValue (Value)
        SELECT Value = LTRIM(RTRIM(SUBSTRING(@List, 1, CHARINDEX(@SplitOn, @List) - 1)))

        SET @List = SUBSTRING(@List, CHARINDEX(@SplitOn, @List) + LEN(@SplitOn), LEN(@List))
    END

    -- Whatever is left after the last delimiter is the final value
    INSERT INTO @RtnValue (Value)
    SELECT Value = LTRIM(RTRIM(@List))

    RETURN
END
2. Run the following script:
-- Adding delimited values to a table variable with a primary key
DECLARE @tblValues TABLE
(
Value int NOT NULL PRIMARY KEY CLUSTERED
)
INSERT INTO @tblValues (Value)
SELECT DISTINCT Value FROM dbo.SplitInt('1,2,3,4,5,6,7',',')
-- Selecting delimited values directly from the function
SELECT Value FROM dbo.SplitInt('1,2,3,4,5,6,7',',')
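3. Optionally, STATISTICS IO can show whether the reads for these statements are physical or served from cache (the trace Reads column reports logical reads); a quick sketch:
SET STATISTICS IO ON
SELECT Value FROM dbo.SplitInt('1,2,3,4,5,6,7',',')
SET STATISTICS IO OFF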
August 2, 2006 at 8:29 am
Repost :
Set-based is faster than procedural:
IF Object_id('fnSplit_Set') > 0
DROP FUNCTION dbo.fnSplit_Set
GO
IF Object_id('Numbers') > 0
DROP TABLE dbo.Numbers
GO
CREATE TABLE dbo.Numbers (PkNumber int identity(1,1), dude bit null, CONSTRAINT Pk_Number PRIMARY KEY CLUSTERED (PkNumber))
GO
INSERT INTO dbo.Numbers (dude)
SELECT NULL FROM (SELECT TOP 100 NULL AS A FROM master.dbo.spt_values) dt100 cross join (SELECT TOP 80 null AS A FROM master.dbo.spt_values) dt80
GO
ALTER TABLE dbo.Numbers
DROP COLUMN dude
GO
--Select min(PkNumber) as MinA, max(PkNumber) as MaxA from dbo.Numbers
--1, 8000
GO
CREATE FUNCTION [dbo].[fnSplit_Set] (@IDS as varchar(8000), @vcDelimiter varchar(3))
RETURNS TABLE
WITH SCHEMABINDING
AS
Return
Select dtSplitted.EachID, dtSplitted.Rank from (
SELECT SUBSTRING(@vcDelimiter + @IDs + @vcDelimiter, N.PkNumber + len(@vcDelimiter),
CHARINDEX(@vcDelimiter, @vcDelimiter + @IDs + @vcDelimiter, N.PkNumber + len(@vcDelimiter)) - N.PkNumber - len(@vcDelimiter)) as EachID
, (LEN(SUBSTRING(@IDs, 1, N.PkNumber)) - LEN(REPLACE(SUBSTRING(@IDs, 1, N.PkNumber), @vcDelimiter, ''))) / LEN(@vcDelimiter) + 1 AS Rank
FROM dbo.Numbers N
WHERE SUBSTRING(@vcDelimiter + @IDs + @vcDelimiter, N.PkNumber, len(@vcDelimiter)) = @vcDelimiter
AND PkNumber < LEN(@vcDelimiter + @IDs + @vcDelimiter)
) dtSplitted where len(dtSplitted.EachID) > 0
GO
DECLARE @Ids as varchar(8000)
SET @IDS = ''
Select @IDS = @IDS + CAST(id as varchar(10)) + ',' from (Select TOP 10 id from dbo.SysObjects order by NEWID()) dtIds
SET @IDS = left(@IDS, ABS(len(@IDS) - 1))
PRINT @IDS
This is what your queries will look like using the set split function:
Select O.id, dtIDS.EachId, O.name, dtIDS.Rank from dbo.SysObjects O inner join (Select CAST(EachID as INT) AS EachID, Rank from dbo.fnSplit_Set (@IDS, ',')) dtIDS on O.id = dtIDS.EachID order by O.Name
Select O.id, dtIDS.EachId, O.name, dtIDS.Rank from dbo.SysObjects O inner join (Select CAST(EachID as INT) AS EachID, Rank from dbo.fnSplit_Set (@IDS, ',')) dtIDS on O.id = dtIDS.EachID order by dtIDS.Rank
-- Don't ever forget to cast the EachID column to the same datatype as the join column to avoid table scans.
January 28, 2008 at 10:53 am
Hi
I would like to know how I can read a record from a table with 12 columns (one for each month of the year) and put the value for each month into another table using this table variable.
With arrays I would:
select
arr_value[1] = vt.jan_value,
arr_value[2] = vt.feb_value,
arr_value[3] = vt.mar_value, ...
arr_value[12] = vt.dec_value
from value_table vt
And then I would use a WHILE loop to put each array position into a record of that other table.
I can't see a flexible solution with table variables like I would have with arrays.
Someone?
Thanks,
Bonk
January 28, 2008 at 1:59 pm
I would like to know how I can read a record from a table with 12 columns (one for each month of the year) and put the value for each month into another table using this table variable.
This won't be quite the answer you expected, but typically, you WOULDN'T/SHOULDN'T make a table that has values for each month in different columns... kinda defies the whole idea of a relational database.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 28, 2008 at 2:07 pm
If it helps you at all, bonk - a table IS a persistent two-dimensional array. One that happens to have the ability to perform parallel operations on its members at a rate of, oh, several million at a time. One you don't have to keep populating and tearing down, etc....
Using a loop on a SQL table is like buying a Mercedes for its ashtray: an awful waste of resources, and it isn't good for your health....
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
January 29, 2008 at 10:30 am
Hi Jeff!
I agree with you! But there's nothing I can do in this case. Let me explain: the table with one column for each month already exists and I can't modify its structure.
I'm working on a BI (Business Intelligence) project and I have to import data from different databases into my star-schema modelling database (better for BI projects), and in this kind of project you can find all sorts of stuff. 🙁
I'm a system victim... 🙂
See you,
Bonk
January 30, 2008 at 5:59 am
bonk (1/29/2008)
Hi Jeff! I agree with you! But there's nothing I can do in this case. Let me explain: the table with one column for each month already exists and I can't modify its structure.
I'm working on a BI (Business Intelligence) project and I have to import data from different databases into my star-schema modelling database (better for BI projects), and in this kind of project you can find all sorts of stuff. 🙁
I'm a system victim... 🙂
See you,
Bonk
In that case, we'd need a wee bit more information... you said you wanted to read a 12 month record from one table and put it in another... you'd need to post the CREATE statement from each table and possibly some data that we could play with to make sure we get it right. Take a look at the following URL before you start that to get the best results...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
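In the meantime, here's a rough sketch of the kind of set-based unpivot that could replace a WHILE loop, reusing the column names from your pseudocode (value_table with jan_value through dec_value); the target table name and its columns are just placeholders:
-- Unpivot the 12 month columns into rows (SQL Server 2005+); target table is illustrative
INSERT INTO dbo.month_values (month_name, month_value)
SELECT u.month_name, u.month_value
FROM dbo.value_table
UNPIVOT (month_value FOR month_name IN
    (jan_value, feb_value, mar_value, apr_value, may_value, jun_value,
     jul_value, aug_value, sep_value, oct_value, nov_value, dec_value)) AS u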
--Jeff Moden
Change is inevitable... Change for the better is not.
February 14, 2008 at 11:42 pm
Regarding the overall article's topic:
We DO NOT NEED ARRAYS!
To use XML and Table Variables to try to "emulate" arrays is like grabbing your neighbor's Ferrari to try and emulate a Toyota. Very WRONG thinking!
(Note: I use XML or delimited lists myself in similar manners: viz. to WORK AROUND SQL Server's pathetic lack of support for true table variables.)
Please go back and read *anything* about the Relational Model and why it came about. Relations (tables) are vastly superior to arrays and are meant to be a *logical* structure. (That is, independent of how they might be implemented under the hood for performance reasons.) The model was meant to give the minimum complexity necessary while providing all the necessary expressive power. ARRAYs do not add any power, only complexity.
The only possible reason to want an ARRAY type is because of assumptions of how the data will be *physically* organized. But we should have simply TABLES (including in-memory tables, table variables, etc.) and have some control over physical-level implementation concerns like whether it's -- under the hood -- sorted in a particular way. Those physical details do not impact the logical ways we interact with the table--which is in fact one of the points!
What WE DO NEED is better support for actual relation values, and relation variables. (That is "table values" and "table variables.") We should be able to define arbitrarily complex tables, pass them between client and server, state literal table values (as whole things) inline in code, etc. etc. etc. Just like we can with an INT or a FLOAT. Then we have more POWER than arrays can give, and we have a consistent model across the board instead of introducing yet another type with its own structures--that is more LIMITED.
Please explain this to any of the ignorant who keep clamoring for array types!
(And please push MS and others to implement better table-variable support. Even SQL 2008's new table variable type is woefully short of the goal.)
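For those who haven't seen it, this is roughly what that SQL 2008 feature looks like; the type, procedure, and table names here are only illustrative:
-- A user-defined table type can be declared once...
CREATE TYPE dbo.IntList AS TABLE (Value int NOT NULL PRIMARY KEY)
GO
-- ...and then passed into a procedure as a read-only table-valued parameter
CREATE PROCEDURE dbo.GetThings @Ids dbo.IntList READONLY
AS
SELECT t.*
FROM dbo.Things t
JOIN @Ids i ON t.ThingId = i.Value
GO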
February 15, 2008 at 2:56 am
I've taken the two procedure examples you supplied and the stored procedure with the cursor runs faster than your example using the table variable. I had to change the tables to HumanResources.Employee in AdventureWorks. To process just over 250 rows it takes twice as long (406 ms) using the table variable method.
In fact, if you increase the number of rows to 10,000 it takes 15 seconds using the table variable method as opposed to 2 seconds using the cursor. That's a no-brainer to me; stick with the cursor (in this particular case anyway).
I'd carefully think twice about using table variables in this manner.
February 15, 2008 at 3:51 am
Very nice article
Now I know how to solve my problem: "SELECT * FROM table WHERE column IN @var". Otherwise I would have tested a lot of things without finding the solution 😀
thanks for the idea
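For anyone with the same problem, this is roughly how the split function from earlier in the thread stands in for the "IN @var" idea; the table and column names here are just placeholders:
-- "WHERE column IN @var" rewritten as a join against the split function's output
DECLARE @var varchar(8000)
SET @var = '1,4,7'

SELECT t.*
FROM dbo.MyTable t
JOIN dbo.SplitInt(@var, ',') s ON t.SomeColumn = s.Value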
February 15, 2008 at 4:28 am
The example code in the SQL cursor SP is incorrect, as it displays the last record twice. This is really bad practice: the fetch status should be checked immediately after each fetch in order to exit the loop:
DECLARE @OK bit
-- @CompName, @ContName and the MyCursor declaration are assumed to exist as in the article's example

OPEN MyCursor
SET @OK = 1

WHILE @OK = 1
BEGIN
    -- Get the next row
    FETCH NEXT FROM MyCursor INTO @CompName, @ContName

    -- Check the fetch status immediately so the last row isn't processed twice
    IF @@FETCH_STATUS <> 0
        SET @OK = 0
    ELSE
    BEGIN
        /* perform operations with the single row */
        PRINT 'T_SQL cursor row | ' + @ContName + ' | ' + @CompName
    END
END

CLOSE MyCursor
DEALLOCATE MyCursor
February 15, 2008 at 12:18 pm
This article is over 2 1/2 years old, and it generated some controversy the first time around. So why is it a "featured article" in today's newsletter?
Steve, do you guys need some writers?
Signature is NULL
February 15, 2008 at 12:28 pm
I tried the same comparisons with a real-world example, a data table with over 200,000 rows. In the first iteration the cursor method took 15 seconds, while I killed the 'array' method after 15 minutes, when only a small fraction of the rows had been processed. The next time around I added a PRIMARY KEY constraint to the identity field and this did the trick: the time for the array method dropped to 8 seconds. Kind of a key omission, I'd say...
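In other words, the declaration that made the difference looks roughly like this (column names are illustrative):
-- Same table variable, but declared with a clustered primary key on the identity column
DECLARE @Array TABLE
(
    Idx   int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    Value int NOT NULL
)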
February 15, 2008 at 12:56 pm
By the way, regarding an earlier comment that you cannot Index table variables...
1) True, technically, and another example of the failure to separate logical and physical issues. (Tables should behave as Tables, regardless...)
2) False, in practice. You can't "create index" on one. But you *can* create additional UNIQUE constraints on the table variable at declaration time. These are implemented as indexes, and queries (in my tests) will actually use them just like on a real table.
And any UNIQUE constraint that *also* includes the PK is guaranteed to be unique. Just put the "thing you want to index" first in the list, so it's more selective. (The extra PK part ends up being wasteful, but oh well.)
I was able to drop a significantly complex scenario from about 10-seconds to about 3-seconds by placing two additional UQ constraints on the table-variable.
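A minimal sketch of that declaration-time trick (names are illustrative):
-- CREATE INDEX isn't allowed on a table variable, but an inline UNIQUE constraint acts as one
DECLARE @Orders TABLE
(
    OrderId    int NOT NULL PRIMARY KEY CLUSTERED,
    CustomerId int NOT NULL,
    -- the "index": the selective column first, the PK column tacked on to guarantee uniqueness
    UNIQUE (CustomerId, OrderId)
)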