Split function input be column of a table

Question

Split function input be column of a table

Viewing 11 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply

Robert Davis One Orange Chip Points: 28027 More actions · Answer 1

That script doesn't even have 29 lines. If you double click the error message, which line does it jump to in the query pane?

My blog: SQL Soldier[/url]
SQL Server Best Practices: SQL Server Best Practices
Twitter: @SQLSoldier
My book: Pro SQL Server 2008 Mirroring[/url]
Microsoft Certified Master: SQL Server, Data Platform MVP
Database Engineer at BlueMountain Capital Management[/url]

Sql Junkie-204042 Hall of Fame Points: 3874 More actions · Answer 2

There were blank lines between each statement. Removed them

Create Table TableA (ID varchar(2), Nodes varchar(5), LT int)

Create Table TableB (ID int, [Desc] varchar(20), Qty int)

Insert Into TableA Values ('M1', '1;2;3', 4)

Insert Into TableA Values ('M2', '4;5;6', 5)

Insert Into TableB Values (1, 'Desc of 1', 2)

Insert Into TableB Values (2, 'Desc of 2', 13)

Insert Into TableB Values (3, 'Desc of 3', 43)

Insert Into TableB Values (4, 'Desc of 4', 52)

Insert Into TableB Values (5, 'Desc of 5', 66)

Insert Into TableB Values (6, 'Desc of 6', 77)

Select A.ID As TableAID, A.LT, B.ID As TableBID, B.[Desc], B.Qty

From TableA A

Inner Join TableB B On B.ID IN

(Select Element From

dbo.SplitToInt(A.Nodes, ';'))

Now the error is on line 15 which is the last line. If I remove the "dbo." in the function call I still get the error. So I guess the error is in the dot in the red text.

Robert Davis One Orange Chip Points: 28027 More actions · Answer 3

That makes it seem like the error is in the function. Do you get an error if you run this:

Select Element From

dbo.SplitToInt('1;2;3', ';'))

My blog: SQL Soldier[/url]
SQL Server Best Practices: SQL Server Best Practices
Twitter: @SQLSoldier
My book: Pro SQL Server 2008 Mirroring[/url]
Microsoft Certified Master: SQL Server, Data Platform MVP
Database Engineer at BlueMountain Capital Management[/url]

Sql Junkie-204042 Hall of Fame Points: 3874 More actions · Answer 4

There was an extra ) at the end. Removed it and the function runs fine with no error. This function is in production and is called from multiple places.

Jeff Moden SSC Guru Points: 1004392 More actions · Answer 5

... 'course, if you have a Tally table, you don't even need a function (although they are convenient)... included the original data for convenience...

Create Table TableA (ID varchar(2), Nodes varchar(5), LT int)

Create Table TableB (ID int PRIMARY KEY, [Desc] varchar(20), Qty int)

Insert Into TableA Values ('M1', '1;2;3', 4)

Insert Into TableA Values ('M2', '4;5;6', 5)

Insert Into TableB Values (1, 'Desc of 1', 2)

Insert Into TableB Values (2, 'Desc of 2', 13)

Insert Into TableB Values (3, 'Desc of 3', 43)

Insert Into TableB Values (4, 'Desc of 4', 52)

Insert Into TableB Values (5, 'Desc of 5', 66)

Insert Into TableB Values (6, 'Desc of 6', 77)

SELECT a.ID AS TableAID,a.LT,b.ID AS TableBID,b.[Desc],b.QTY FROM dbo.TableB b, ( SELECT ID, LTRIM(SUBSTRING(ta.Nodes, t.N+1, CHARINDEX(';', ta.Nodes, t.N + 1) - t.N - 1)) AS ParsedNode, LT FROM dbo.Tally t, (SELECT ID,';'+Nodes+';' AS Nodes, LT FROM dbo.TableA) ta WHERE SUBSTRING(ta.Nodes, t.N, 1) = ';' AND t.N < LEN(ta.Nodes) ) a WHERE a.ParsedNode = b.ID ORDER BY TableAID,a.LT,TableBID

If you still want to use a split function, suggest you convert it to use the Tally table instead of a while-loop... you can extract the necessary code from above to make the function...

And, if you still don't have a Tally table, now's probably a good time to make one...

--===== Create and populate the Tally table on the fly

SELECT TOP 11000 --equates to more than 30 years of dates

IDENTITY(INT,1,1) AS N

INTO dbo.Tally

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2

--===== Add a Primary Key to maximize performance

ALTER TABLE dbo.Tally

ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N)

--===== Allow the general public to use it

GRANT SELECT ON dbo.Tally TO PUBLIC

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Robert Davis One Orange Chip Points: 28027 More actions · Answer 6

Seems like a huge waste of time and effort. Not to mention what happens when the ID's reach 11001.

My blog: SQL Soldier[/url]
SQL Server Best Practices: SQL Server Best Practices
Twitter: @SQLSoldier
My book: Pro SQL Server 2008 Mirroring[/url]
Microsoft Certified Master: SQL Server, Data Platform MVP
Database Engineer at BlueMountain Capital Management[/url]

Sql Junkie-204042 Hall of Fame Points: 3874 More actions · Answer 7

Thanks Jeff. Works great. Didn't have a Tally table.

Too late in the day to look at this query in detail now.

Thanks again.

Robert Davis One Orange Chip Points: 28027 More actions · Answer 8

>> why would you need the ID's to go above 11,000 for splitting varchars?

I stand corrected. I thought the Tally value would need to be able to join to the ID. My mistake.

However, if I compare the actual and estimated execution plans of the query I posted and the query you posted, your version has a 94% total cost vs. a 6% total cost for mine.

If you view the query plans, you will see that SQL Server still performs nested loops to get the data. Just because you don't use the WHILE keyword doesn't mean you're not doing a loop.

When analyzing the cost of the individual steps in the query plan of your query, what do you think is the largest performance hit?

If you guessed the Clustered Index Seek on the Tally table ..... you are correct!!!! It accounted for 33% of the total cost of your query.

When analyzing the cost of the individul steps in the query plan of my query, what do you think is the smallest performance hit?

If you guessed the Split UDF, you are correct again. The UDF accounted for less than 1% of the total cost of my query. The execution plan actually shows 0%, but it's rounding off.

My blog: SQL Soldier[/url]
SQL Server Best Practices: SQL Server Best Practices
Twitter: @SQLSoldier
My book: Pro SQL Server 2008 Mirroring[/url]
Microsoft Certified Master: SQL Server, Data Platform MVP
Database Engineer at BlueMountain Capital Management[/url]

Jeff Moden SSC Guru Points: 1004392 More actions · Answer 9

(Not sure what happened to the post you responded to)

Yep... I could be wrong... please post your function and your test code some I might do some testing... I will, of course, post all the test code.

For starters, here're the test tables I intend to use for performance and resource testing... I've always been curious as to whether a good While loop function will beat a Tally table... Who knows? Could be incentive for me not to use one anymore...

SET NOCOUNT ON

--===== Create and populate a small test table as TableA.

SELECT TOP 10000

ID = 'M'+LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3)),

Nodes = LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

+ LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

+ LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3)),

LT = CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS INT)

INTO dbo.TableA

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2

--===== Create single position CSVs

UPDATE dbo.TableA

SET Nodes = LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))

WHERE RIGHT(ID,1) = '3'

--===== Create 2 position CSVs

UPDATE dbo.TableA

SET Nodes = LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

+ LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))

WHERE RIGHT(ID,1) = '5'

--===== Create places where the position is blank for CSVs

UPDATE dbo.TableA

SET Nodes = REPLACE(REPLACE(REPLACE(Nodes,';0;',';;'),'0;',';'),';0',';')

--===== Create places where there are nulls for all columns

UPDATE dbo.TableA

SET Nodes = NULL

WHERE RIGHT(ID,1) = '7'

--===== Create and populate TableB

SELECT TOP 101

ID = IDENTITY(INT,0,1),

[Desc] = CAST(NULL AS VARCHAR(20)),

Qty = CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS INT)

INTO TableB

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2

UPDATE TableB

SET [Desc] = 'Desc of ' + CAST(ID AS VARCHAR(10))

--===== Display partial content of both tables just for grins

-- SELECT TOP 10 * FROM TableA

-- SELECT TOP 10 * FROM TableB

DECLARE @StartTime DATETIME

SET @StartTime = GETDATE()

---------------------------------------------------------------------------

... insert code to be tested here...

---------------------------------------------------------------------------

PRINT DATEDIFF(ms,@StartTime,GETDATE())

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004392 More actions · Answer 10

But I still get a Incorrect Syntax near '.' error when I run the query by Robert Davis.

Select A.ID As TableAID, A.LT, B.ID As TableBID, B.[Desc], B.Qty

From TableA A

Inner Join TableB B On B.ID IN

(Select Element From dbo.SplitToInt(A.Nodes, ';'))

SQL Junkie...

I tested your function and it works fine like this...

SELECT *

FROM dbo.SplitToInt('1;2;3;4;5;6;7',';')

...But I don't believe you can use it in a correlated subquery where one of the operands is a column in the external query as in Robert's example... hence, the consistent error at the ".". In fact, I don't believe you can ever use a column name in association with a function that returns a TABLE data type.

Could be wrong but I can't get his example to work... still waiting for him to post the function so we can see what he's talking about and how he got it to work. It would be great if it worked 'cause I can think of lots of uses for it... I just don't think it can be made to work with a column name as an operand and a TABLE data type as a return.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004392 More actions · Answer 11

Hey Robert!

Man, I just tried your fnStripTags function and I converted it to use a Tally table and tested both... I'm very well impressed! In this case, the WHILE loop version (your version) runs twice as fast as the Tally table version (wrote it 2 different ways{one with STUFF and 1 with concatenation}, to boot!).

Now I really would like to see what you came up with for a split function! It's gotta be better than the Tally table version...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)