August 26, 2014 at 4:51 pm
Hi everyone -
I am working with a table that has a column which stores multiple data/values that are comma separated.
I need to be able to query that table and get those rows where the values in that column match a pre-defined search list.
I was thinking of somehow trying to take the search list and convert it to a table(temp or a cte) and then JOIN to the table.
however, since the column may contain multiple values, i would need to parse/separate that first. I am not sure how to parse and then join to a list (if that is even the best way to solve this) to only get the rows where the search column contains one or more of the items we're looking for.
Below is some sample data, and if there are any other articles or links that might help, please send or suggest them.
Thanks much - will
Declare @BaseTable table (PKCol int, Column2Search varchar(2000))
Insert into @BaseTable (PKCol, Column2Search)
Select 1001, 'apple,orange,grapefruit'
UNION ALL
Select 1002, 'grapefruit,coconut'
UNION ALL
Select 1003, 'toast'
UNION ALL
Select 1004, 'cookies,apple,crackers'
Declare @SearchList table (SearchValue varchar(2000) )
Insert Into @SearchList
Select 'apple'
UNION ALL
Select 'orange'
UNION ALL
Select 'grapefruit'
UNION ALL
Select 'pear'
August 26, 2014 at 6:13 pm
This is one of the reasons to normalize our tables. To help you with this, I'm using a delimited string splitter (or I should say THE delimited string splitter). You can find information on it in here: http://www.sqlservercentral.com/articles/Tally+Table/72993/
With that function, the query is quite simple
SELECT *
FROM @BaseTable
WHERE EXISTS(
SELECT *
FROM DelimitedSplit8K( Column2Search, ',') s
JOIN @SearchList l ON s.Item = l.SearchValue)
August 26, 2014 at 9:53 pm
Here is an alternative solution using CHARINDEX
😎
Declare @BaseTable table (PKCol int, Column2Search varchar(2000))
Insert into @BaseTable (PKCol, Column2Search)
Select 1001, 'apple,orange,grapefruit'
UNION ALL
Select 1002, 'grapefruit,coconut'
UNION ALL
Select 1003, 'toast'
UNION ALL
Select 1004, 'cookies,apple,crackers'
Declare @SearchList table (SearchValue varchar(2000) )
Insert Into @SearchList
Select 'apple'
UNION ALL
Select 'orange'
UNION ALL
Select 'grapefruit'
UNION ALL
Select 'pear'
SELECT
bt.PKCol
,bt.Column2Search
,SL.SearchValue
FROM @BaseTable BT
OUTER APPLY @SearchList SL
WHERE CHARINDEX(SL.SearchValue,BT.Column2Search,1) > 0;
Results
PKCol Column2Search SearchValue
------ ------------------------- ------------
1001 apple,orange,grapefruit apple
1001 apple,orange,grapefruit orange
1001 apple,orange,grapefruit grapefruit
1002 grapefruit,coconut grapefruit
1004 cookies,apple,crackers apple
August 26, 2014 at 10:45 pm
Thanks to both of you for the response and solutions. I'll try them out and see how things go.
I appreciate the help
- will
August 27, 2014 at 1:05 am
Eirikur Eiriksson (8/26/2014)
Here is an alternative solution using CHARINDEX
How well does that work if you add 'grape' to the search list?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 27, 2014 at 1:59 am
GilaMonster (8/27/2014)
Eirikur Eiriksson (8/26/2014)
Here is an alternative solution using CHARINDEXHow well does that work if you add 'grape' to the search list?
Good point, that will ferment into a false positive;-) The solution is to wrap the entries inside the delimiters.
😎
Declare @BaseTable table (PKCol int, Column2Search varchar(2000))
Insert into @BaseTable (PKCol, Column2Search)
Select 1001, 'apple,orange,grapefruit'
UNION ALL
Select 1002, 'grapefruit,coconut'
UNION ALL
Select 1003, 'toast'
UNION ALL
Select 1004, 'cookies,apple,crackers'
Declare @SearchList table (SearchValue varchar(2000) )
Insert Into @SearchList
Select 'apple'
UNION ALL
Select 'orange'
UNION ALL
Select 'grapefruit'
UNION ALL
Select 'pear'
UNION ALL
Select 'grape'
UNION ALL
Select 'fruit'
SELECT
bt.PKCol
,bt.Column2Search
,SL.SearchValue
FROM @BaseTable BT
OUTER APPLY @SearchList SL
WHERE CHARINDEX(CHAR(44) + SL.SearchValue + CHAR(44),CHAR(44) + BT.Column2Search + CHAR(44),1) > 0;
Results
PKCol Column2Search SearchValue
------ ------------------------ -----------
1001 apple,orange,grapefruit apple
1001 apple,orange,grapefruit orange
1001 apple,orange,grapefruit grapefruit
1002 grapefruit,coconut grapefruit
1004 cookies,apple,crackers apple
August 27, 2014 at 2:12 am
Eirikur Eiriksson (8/27/2014)
GilaMonster (8/27/2014)
Eirikur Eiriksson (8/26/2014)
Here is an alternative solution using CHARINDEXHow well does that work if you add 'grape' to the search list?
Good point, that will ferment into a false positive;-) The solution is to wrap the entries inside the delimiters.
Until...
Declare @BaseTable table (PKCol int, Column2Search varchar(2000))
Insert into @BaseTable (PKCol, Column2Search)
Select 1001, 'apple,orange,grapefruit '
UNION ALL
Select 1002, 'grapefruit,coconut'
UNION ALL
Select 1003, 'toast'
UNION ALL
Select 1004, 'cookies,apple ,crackers'
Declare @SearchList table (SearchValue varchar(2000) )
Insert Into @SearchList
Select 'apple'
UNION ALL
Select 'orange'
UNION ALL
Select 'grapefruit'
UNION ALL
Select 'pear'
UNION ALL
Select 'grape'
UNION ALL
Select 'fruit'
SELECT
bt.PKCol
,bt.Column2Search
,SL.SearchValue
FROM @BaseTable BT
OUTER APPLY @SearchList SL
WHERE CHARINDEX(CHAR(44) + SL.SearchValue + CHAR(44),CHAR(44) + BT.Column2Search + CHAR(44),1) > 0;
Not digging at you, just trying to show the complexities of doing this kind of search. The 'best' way (IMHO) would be to normalise the table, followed by using a string spitting function. While the Charindex approach can work, it requires clean data, or a lot of extra work in the form of REPLACE functions
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 27, 2014 at 2:28 am
GilaMonster (8/27/2014)
Not digging at you..
Of course not, you would never:-D
... just trying to show the complexities of doing this kind of search. The 'best' way (IMHO) would be to normalise the table, followed by using a string spitting function. While the Charindex approach can work, it requires clean data, or a lot of extra work in the form of REPLACE functions
I fully agree as normalizing is a permanent solution. Other such as charindex and even splitting approaches tend to lead to more complex solutions as the complexity of the denormalized problem in a real life business situations materializes. The overhead of advanced matching such as fuzzy matching (jaro winkler etc.), cleansing, dictionaries, full text catalogues is most of the time too high to be justifiable when a simple normalized solution does the job better.
😎
August 27, 2014 at 9:12 am
Hi Everyone...
Again, thank you for the help and solutions.
I should have mentioned before that the database design/structure is completely and utterly out of my control. Yes, normalization of the tables would be the way to go. As it is...both of the provided solutions work well. I just need to test out with our larger (millions of rows) test data set.
Thank you all
- will
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply