June 17, 2014 at 3:59 pm
Hi All,
See sample data below. I'm trying to count the number of occurrences of strings stored in table @word without a while loop. Any help is appreciated. Thank you.
DECLARE @t TABLE (Id INT IDENTITY(1,1), String VARCHAR(MAX))
INSERT INTO @t
SELECT 'There are a lot of Multidimensional Expressions (MDX) resources available' AS String UNION ALL
SELECT 'but most teaching aids out there are geared towards professionals with cube development experience' UNION ALL
SELECT 'As a result SQL developers with no cube development experience start learning MDX on a poor footing' UNION ALL
SELECT 'because many of the learning aids completely disregard SQL as a good frame of reference for starting the learning process' UNION ALL
SELECT 'I am going to introduce MDX' UNION ALL
SELECT 'by drawing only on the similarity and differences between MDX and SQL' UNION ALL
SELECT 'and more importantly tackle some core MDX and cube concepts along the way' UNION ALL
SELECT 'In this first part, I will explain how to navigate the cube object with MDX' UNION ALL
SELECT 'by cutting through some of the quirkiness' UNION ALL
SELECT 'that makes MDX different from T-SQL even though both were are derived from SQL language' UNION ALL
SELECT 'Before we move on let’s see how' UNION ALL
SELECT 'SQL Server 2012 is changing SQL Server data analytics' UNION ALL
SELECT 'and why MDX is still very important in that regard'
DECLARE @word TABLE (Id INT IDENTITY(1,1), Word VARCHAR(100))
INSERT INTO @word
SELECT 'SQL' AS Word UNION ALL
SELECT 'MDX' UNION ALL
SELECT 'cube' UNION ALL
SELECT '2012' UNION ALL
SELECT 'learning'
SELECT * from @t
SELECT * from @word
-- Expected result
SELECT 'SQL' AS Word, 7 AS WordCount UNION ALL
SELECT 'MDX', 8 UNION ALL
SELECT 'cube', 4 UNION ALL
SELECT '2012', 1 UNION ALL
SELECT 'learning', 3
June 17, 2014 at 4:15 pm
appreciate that you have provided sample data....but in your "real" world and to ensure scalability for a solution...can you tell us approx. number of rows in @t table and confirm that you need "string" to be varchar(max)
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
June 17, 2014 at 4:21 pm
also, can you please tell us whether you are looking for exact matches or whether partial matches count?
e.g. if you are looking for "SQL", does "MySQL" count as a hit?
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
June 17, 2014 at 4:35 pm
Thank you for your quick response. Table @t may store up to 25,000 rows (I used table variable just for demonstration purposes) and yes it is a varchar(max).
Yes, I would like "MySQL" to be included too.
Thank you.
June 17, 2014 at 4:41 pm
select Word,sum((len(t.string)-len(replace(t.string,Word,'')))/len(Word)) as count
from @word as w
cross join @t as t
group by Word
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
June 17, 2014 at 5:25 pm
Thank you!
I have two follow up questions.
Is there a way to change the query so that it only counts the word(s) once per row?
Also, I would like to count the rows that do not include any of those words and return the count in the result set (maybe with a union).
Thank you for your time.
June 17, 2014 at 6:02 pm
Maybe something like this?
SELECT ISNULL( w.Word, 'None'),
COUNT(t.String),
ISNULL( SUM((LEN(t.String)-LEN(REPLACE(t.String,w.Word,'')))/LEN(w.Word)), COUNT(t.String))
FROM @word w
RIGHT
JOIN @t t ON String LIKE '%' + Word + '%'
GROUP BY Word
June 17, 2014 at 9:10 pm
clayman (6/17/2014)
Thank you for your quick response. Table @t may store up to 25,000 rows (I used table variable just for demonstration purposes) and yes it is a varchar(max).Yes, I would like "MySQL" to be included too.
Thank you.
Understood that it IS a VARCHAR(MAX). I believe the question is, does it have to be? The reason why everyone is asking that is because there's a function that might be able to do a whole lot more than you might expect for such a thing. The only trouble with it is that it runs twice as slow if it's converted to VARCHAR(MAX).
--Jeff Moden
Change is inevitable... Change for the better is not.
June 17, 2014 at 9:51 pm
Jeff Moden (6/17/2014)
clayman (6/17/2014)
Thank you for your quick response. Table @t may store up to 25,000 rows (I used table variable just for demonstration purposes) and yes it is a varchar(max).Yes, I would like "MySQL" to be included too.
Thank you.
Understood that it IS a VARCHAR(MAX). I believe the question is, does it have to be? The reason why everyone is asking that is because there's a function that might be able to do a whole lot more than you might expect for such a thing. The only trouble with it is that it runs twice as slow if it's converted to VARCHAR(MAX).
The data type in the database is TEXT (I know..) so I'm converting it to VARCHAR(MAX) in the stored procedure. Whether it has to be VARCHAR(MAX) or not, I'm not sure about that. All I know these string can pretty long and I don't want to lose any data.
June 18, 2014 at 12:38 am
clayman (6/17/2014)
Thank you!I have two follow up questions.
Is there a way to change the query so that it only counts the word(s) once per row?
Also, I would like to count the rows that do not include any of those words and return the count in the result set (maybe with a union).
Thank you for your time.
Yes, try this
select isnull(Word,'No Match') as Word,sum(1) as count
from @t t
left outer join @word as w
on t.string like '%'+w.word+'%'
group by Word
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
July 14, 2014 at 6:54 pm
Thank you.
I would also like to search for synonym words, however they should not be counted separately. Note, these synonyms could also be placed in a new table, they don't have to be comma separated strings.
Would full text indexing make more sense in this scenario (then I could just add those synonyms in the tsenu.xml file)? Thank you.
DECLARE @t TABLE (Id INT IDENTITY(1,1), String VARCHAR(MAX))
INSERT INTO @t
SELECT 'I bought a new shirt, I purchased a new t-shirt.' AS String UNION ALL
SELECT 'He looks very miserable today. He looks very sad today.' UNION ALL
SELECT 'I’ve been searching for that book for weeks!I’ve been looking for that book for weeks!' UNION ALL
SELECT 'He looks sad since he could not afford to buy a shirt'
DECLARE @word TABLE (Id INT IDENTITY(1,1), Word VARCHAR(100), WordSyn VARCHAR(500))
INSERT INTO @word
SELECT 'buy' AS Word,'bought, purchase, purchased, obtain, obtained' AS WordSynUNION ALL
SELECT 'sad','miserable, unhappy, depressed, gloomy'UNION ALL
SELECT 'search','look for, looking for, looked for, searching for, search for, searched for'
select isnull(Word,'No Match') as Word,sum(1) as count
from @t t
left outer join @word as w
on t.string like '%'+w.word+'%'
group by Word
-- Expected output would be
select 'buy' as [Word], 3 as [count] UNION ALL
select 'sad', 3 UNION ALL
select 'search', 2
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply