SQLServerCentral Article

An Overview of STRING_SPLIT()

,

The STRING_SPLIT() function is a great way to manage parameters in T-SQL. This was added in SQL Server 2016 and is designed to quickly break apart a string that has a single character delimeter. While not perfect, STRING_SPLIT() performs well and is a good function to add to your T-SQL toolbelt.

In this article, we will examine the basics of how STRING_SPLIT() works and show a few cases where this function is useful. We will also examine some of the limitations of the function and give alternative solutions for splitting strings.

The Basics

The STRING_SPLIT function is a table valued function, which means this function can be used in place of a table in your code. Anyplace that you use a table, in FROM and JOIN clauses, is where you'd use this function. The function takes two parameters. The first is the string to be split and the second is a separator. The syntax is as follows:

STRING_SPLIT(string, separator)

We can see the function work in a short example:

DECLARE @params VARCHAR(100) = '8,6,7,5,3,0,9';
SELECT *
 FROM STRING_SPLIT(@params,',') AS ss;

This returns:

value
-----------------
8
6
7
5
3
0
9
(7 rows affected)

There is a single column, value, returned. I can also use value in the SELECT column list, like this:

SELECT ss.value
 FROM STRING_SPLIT(@params,',') AS ss;

Simple Splits

The string can be any character string. This includes char, varchar, nchar, and nvarchar. If you pass in a Unicode type, the Unicode type is returned. If you pass in a non-Unicode type, you'll get non-Unicode results. We can see this in a few ways.

The separator is a single character value that you can choose. Above I've chosen a comma, which is very common. However, I could also choose spaces. If I were passing in a sentence, then I would see something like this:

DECLARE @sentence VARCHAR(1000)
    = 'A room without books is like a body without soul.';
SELECT value FROM STRING_SPLIT(@sentence, ' ') AS ss;

The results:

value
-----------------
A
room
without
books
is
like
a
body
without
soul.
(10 rows affected)

As you can see, each word is broken at the space and appears on a separate row of the results.

I can choose any separator I want, which can be useful for different types of values. For example, suppose I had a series an IP address and wanted to split that. I could use the period as a separator, as shown here (code and results combined). I also also rename the result column as I would for any other column selected from a table.

DECLARE @sentence VARCHAR(1000) = '192.168.1.210';
SELECT value AS 'IP Address'
 FROM STRING_SPLIT(@sentence, '.') AS ss;
-- Results
IP Address
--------------
192
168
1
210

I can even choose other types of characters if I had some strange sort of data import I wanted to split, such as a phone number.

DECLARE @sentence VARCHAR(1000) = '303#4444#55555#666666';
SELECT value 
 FROM STRING_SPLIT(@sentence, '#') AS ss;
-- Results
value
-----------
303
4444
55555
666666

This isn't a perfect system, as we'll see later, but this does give me some flexibility in how to separate items.

Breaking Apart Parameters

One of the very common uses is to separate a list of values that I want to pass into a stored procedure or function. Often I want to query for a variety of values in a table and combine all my search items into a single stored procedure call. As an example, suppose I have a list of Orders in a table and I want to get back the details of which orders used a set of product codes. In my OrderItem table, I have some data like this:

OrderItemID ProductCode                    SubProductCode                 Quantity    UnitPrice
----------- ------------------------------ ------------------------------ ----------- ---------------------
1           SJ5A                           KK                             -1354214955 153.0308
2           RXZP                           TJ                             -1321241122 619.5019
3           RXZP                           TJ                             -48338399   314.2388

I build a stored procedure, called GetOrders, to retrieve order information. The stored procedure will take a parameter that contains a list of product codes and then uses the APPLY operator to join this list to the table. The STRING_SPLIT() function breaks the parameters apart and returns a table, which can then be joined back to the original table to find matching rows. Here is the procedure:

CREATE PROCEDURE GetOrders
   @Orders varchar(1000) = NULL
as
BEGIN
 if @Orders is not null 
    select O.OrderItemID, o.ProductCode, o.SubProductCode, o.Quantity, o.UnitPrice
      from OrderItems o
         inner join string_split(@Orders, ',') s
             on o.productcode = s.value
END

When we run this with a limited list of values:

DECLARE @orderlist VARCHAR(100) = 'SJ5A, LLAL,PC64';
EXEC GetOrders @orderlist

I get these results

OrderItemID ProductCode                    SubProductCode                 Quantity    UnitPrice
----------- ------------------------------ ------------------------------ ----------- ---------------------
1           SJ5A                           KK                             -1354214955 153.0308
4           PC64                           PB                             -825611450  697.8801
6           PC64                           PB                             1250957649  663.3212
7           PC64                           PB                             -78519823   832.512
11          SJ5A                           KK                             -1189690266 319.2531
16          PC64                           PB                             148607121   714.4903
18          SJ5A                           BA                             1053775591  665.0404

What happens if I send in only one value?

DECLARE @orderlist VARCHAR(100) = 'PC64';
EXEC GetOrders @orderlist
-- Results
OrderItemID ProductCode                    SubProductCode                 Quantity    UnitPrice
----------- ------------------------------ ------------------------------ ----------- ---------------------
4           PC64                           PB                             -825611450  697.8801
6           PC64                           PB                             1250957649  663.3212
7           PC64                           PB                             -78519823   832.512
16          PC64                           PB                             148607121   714.4903

If I send in an empty string, I get nothing back, as none of my products match an empty string. You can read a bit more about different types of delimeters in Wayne Sheffield's article, Splitting Strings in SQL Server 2016, but this works as you might expect and splits items on the delimeter, but doesn't error out on valid delimeters.

I have been showing varchar values, but this works well with Unicode characters as well, either in the string or the separator.

DECLARE @sentence NVARCHAR(1000)
    = 'This sun character, ?, in Japanese is ?.';
SELECT value FROM STRING_SPLIT(@sentence, '?');
-- Results
value
------------------------
This sun character, 
, in Japanese is 
.

A Few Limitations

The biggest limitation of the STRING_SPLIT() function is that it doesn't guarantee any ordering of the results. I can run this 1,000 times and get these results:

DECLARE @sentence VARCHAR(1000) = 'The quick brown fox runs over the lazy dog on S@undays'
SELECT * FROM STRING_SPLIT(@sentence, ' ')
--Results
value
-------------------
The
quick
brown
fox
runs
over
the
lazy
dog
on
S@undays

That doesn't mean the 1,001st time the same results come. Just like with any table structure, there is no ordering here without an ORDER BY. STRING_SPLIT() doesn't give us any ordering fields. We can order by value, but we'd then be ordering by the items in each row, not by their order in the original parameter.

DECLARE @sentence VARCHAR(1000) = 'The quick brown fox runs over the lazy dog on S@undays'
SELECT value
 FROM STRING_SPLIT(@sentence, ' ')
 ORDER BY value
--Results
value
--------------------
brown
dog
fox
lazy
on
over
quick
runs
S@undays
the
The

There has been a request to fix this, by either guarenteeing order or providing some orginal column, but this hasn't been implemented as of the early SQL Server 2019 betas. If you think this is important. If you do need to have ordering, the OPENJSON function will provide this and allow you to get orders, though you must submit a valid JSON string to the function.

The other limitation of this function is that the separator can only be one character. This prevents me from splitting on multiple characters in a row or using a regular experession of some sort that would allow me to split on two characters, such as a comma (,) or a pipe (|).

One last limitation is that this function requires string input. Many values that applications might want to pass in as a list are array based, often with other data types. That data must be converted to a string and a separator provided that matches the code in your database function or procedure.

Alternative Methods of Splitting Strings

This is a function that performs extremely well. If you examine the articles in the References section below, there are more performance tests, which I will not duplicate here or attempt to replace. In those tests, the OPENJSON() function, the CLR, Jeff Moden's string splitter, and more are compare. The STRING_SPLIT() function performs very well in these tests, and is appropriate for use in many situations. There are alternatives, and you should test your workload to ensure that the resources required for your situation are acceptable.

Conclusion

This function is a useful one for accepting a list of values and breaking them apart into a table structure that can be used with the JOIN or APPLY operators to query SQL Server tables. This is an efficient function and worth considering in places where you need this functionality.

References

A few references and more reading on the topic:

Rate

4.9 (10)

You rated this post out of 5. Change rating

Share

Share

Rate

4.9 (10)

You rated this post out of 5. Change rating