Until SQL Server 2016, we had to write our own functions to split a CSV string into a table. This was typically accomplished with a table-valued user-defined function.
These functions are sprinkled all over the web, and for the most part two examples are given: one casts the string as XML and then parses the nodes to return a table; the other uses a very inefficient looping operation with CHARINDEX and SUBSTRING. For performance reasons I've always used the XML function; but with SQL Server 2016 I don't have to worry any more. Or do I?
What’s new in 2016?
In 2016 there's a new function named STRING_SPLIT. It's a table-valued function, used much like sys.dm_exec_sql_text: to join it to a table you need CROSS APPLY.
This new function is very simple. It takes a string (or a column) and a delimiter as input, and returns the split pieces as a table with a single column named value.
STRING_SPLIT([STRING_OR_COLUMN], [DELIMITER])
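As a quick sanity check, the function can also be run against a string literal; the output column is always named value:

```sql
-- Split a literal string; the result is a one-column table named "value"
SELECT value
FROM STRING_SPLIT('Turkey,Cheese,Bread', ',');
-- Returns three rows: Turkey, Cheese, Bread
```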
Comparing new and old
I'm all about performance. That being said, the first thing I wanted to see is how this new function performs. I set up the following example, and before you ask: yes, I love food! If you ever get a chance to visit Austin, Texas, be sure to check out the "Hey! You gonna eat or what?" food truck. They have an amazing Shiner Bock beer-battered Monte Cristo that is over the top!!
CREATE TABLE #CSVList
(
    Name varchar(20),
    Ingredients varchar(500)
);

INSERT INTO #CSVList (Name, Ingredients)
VALUES ('Turkey Cheese', 'Turkey,Cheese,Bread,Tomato,Salt'),
       ('Roast Beef', 'Roast Beef,Bread,Horse Radish'),
       ('Holiday Delight', 'Turkey,Cranberry Chutney,Bread,Stuffing'),
       ('Healthy Breakfast', 'Cream Cheese,Salmon,Bagel,Purple Onion'),
       ('BLT', 'Bacon,Lettuce,Tomato,Bread'),
       ('French Toast', 'French Toast,Bacon,Confectioners Sugar'),
       ('Monte Cristo', 'Bread,Ham,Turkey,Cheese,Jam,Confectioners Sugar'),
       ('Peanut Butter Jelly', 'Bread,Peanut Butter,Raspberry Jam,Whipped Cream');
GO
Here’s a simple example of how we would use the new function:
SELECT * FROM #CSVList L
CROSS APPLY STRING_SPLIT(L.Ingredients, ',') i
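One pattern I find handy (not shown in the original example, so treat this as a sketch) is filtering on the split values, for instance to find every sandwich that contains a given ingredient:

```sql
-- Hypothetical extension: which sandwiches contain Bacon?
SELECT L.Name
FROM #CSVList L
CROSS APPLY STRING_SPLIT(L.Ingredients, ',') i
WHERE i.value = 'Bacon';
```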
Now let’s look at the old example:
SELECT f1.Name, O.splitdata
FROM (
    SELECT *,
           CAST('<X>' + REPLACE(F.Ingredients, ',', '</X><X>') + '</X>' AS XML) AS xmlfilter
    FROM #CSVList F
) f1
CROSS APPLY
(
    SELECT fdata.D.value('.', 'varchar(50)') AS splitdata
    FROM f1.xmlfilter.nodes('X') AS fdata(D)
) O
GO
Comparing the two plans, you'd think the new function was exponentially faster: it has an estimated cost of only 0.0004002 vs. 2971.31 for the subtree of the older example, making the new query 0% of the overall batch cost.
Looking a little deeper
When in doubt, use the tools you were given. I once saw a quote saying, "A fact without supporting data is just another opinion."
The STATISTICS IO output for both queries is the same, so I looked at STATISTICS TIME instead. Here's what I found:
STRING_SPLIT results
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 4 ms.
(33 row(s) affected)
Custom function results
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 34 ms.
(33 row(s) affected)
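If you want to reproduce these timings yourself, the output above is what SSMS prints on the Messages tab once the timing statistics are switched on (the post doesn't show the exact commands, so this is my assumption about the setup):

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- run the STRING_SPLIT query or the XML-based query here,
-- then read the timings from the Messages tab

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```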
As you can see, the new split function is much faster and the clear winner. I’ll be posting about other string functions soon.
Be sure to check out my other blog posts on SQL Server Central here: http://www.sqlservercentral.com/blogs/confessions-of-a-microsoft-addict/
Follow me on twitter (https://twitter.com/SQLSME)
And on LinkedIn (https://www.linkedin.com/in/daniel-janik-60615b12)
Find resources I like at my blog (http://www.sqltechblog.com)
Get a free SQL Server health check at my site (http://www.sqldatapros.com)