Better Way to Perform this Query

Question

Post reply

Better Way to Perform this Query

AndrewSQLDBA

SSC-Insane

Points: 21827
More actions
January 9, 2013 at 10:26 am

#269086

Hello Everyone
I hope that everyone is having a very good day.
I need to write a query to select row counts, but this seems like a lot of scanning of the rows. Is there a better way to write a query that does this? There are approx 5 million rows in the table
DECLARE @abc int
DECLARE @def int
DECLARE @Ghi int
DECLARE @jkl int
DECLARE @mno int
DECLARE @pqr int
DECLARE @stu int
DECLARE @vw int
DECLARE @xzy int
SET @abc = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%abc%')
SET @def = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%def%')
SET @Ghi = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%ghi%')
SET @jkl = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%jkl%')
SET @mno = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%mno%')
SET @pqr = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%pqr%')
SET @stu = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%stu%')
SET @vw = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%vw%')
SET @xzy = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%xyz%')
Thank You in advance for all your help, advice and suggestions.
Andrew SQLDBA

Viewing 15 posts - 1 through 15 (of 31 total)

You must be logged in to reply to this topic. Login to reply

matzer SSCrazy Eights Points: 9727 More actions · Answer 1

You can try something like this (requires only a single table scan):

SELECT

@abc = COUNT(CASE WHEN ColumnName LIKE '%abc%' THEN 1 END),

@def = COUNT(CASE WHEN ColumnName LIKE '%def%' THEN 1 END),

@Ghi = COUNT(CASE WHEN ColumnName LIKE '%ghi%' THEN 1 END),

@jkl = COUNT(CASE WHEN ColumnName LIKE '%jkl%' THEN 1 END),

@mno = COUNT(CASE WHEN ColumnName LIKE '%mno%' THEN 1 END),

@pqr = COUNT(CASE WHEN ColumnName LIKE '%pqr%' THEN 1 END),

@stu = COUNT(CASE WHEN ColumnName LIKE '%stu%' THEN 1 END),

@vw = COUNT(CASE WHEN ColumnName LIKE '%vw%' THEN 1 END),

@xzy = COUNT(CASE WHEN ColumnName LIKE '%xyz%' THEN 1 END)

FROM

TableName

AndrewSQLDBA SSC-Insane Points: 21827 More actions · Answer 2

AndrewSQLDBA

SSC-Insane

Points: 21827

January 9, 2013 at 1:30 pm

#1575420

Thank You Peter

Very handy to know.

Andrew SQLDBA

Alan Burstein SSC Guru Points: 61141 More actions · Answer 3

AndrewSQLDBA (1/9/2013)
Hello Everyone
I hope that everyone is having a very good day.
I need to write a query to select row counts, but this seems like a lot of scanning of the rows. Is there a better way to write a query that does this? There are approx 5 million rows in the table
DECLARE @abc int
DECLARE @def int
DECLARE @Ghi int
DECLARE @jkl int
DECLARE @mno int
DECLARE @pqr int
DECLARE @stu int
DECLARE @vw int
DECLARE @xzy int
SET @abc = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%abc%')
SET @def = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%def%')
SET @Ghi = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%ghi%')
SET @jkl = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%jkl%')
SET @mno = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%mno%')
SET @pqr = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%pqr%')
SET @stu = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%stu%')
SET @vw = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%vw%')
SET @xzy = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%xyz%')

Its worth noting that the following is sargable.

LIKE '%abc%'

Question:

Are you really trying to find columnNames with abc in their name, def in their name, etc... Or are you trying to get a count of all columns whose name begins with [a-c] for @abc; a count of names beginning with [d-f] for @def, etc, etc... If that's what you are looking for, you can try the query below. It's a way to pass a variable or parameter to NTILE and do what you are doing dynamically (for the most part, some assembly required).

DECLARE @groups int = 9;

;WITH

asciichar(n,c) AS

(SELECT (65), CHAR(65)

UNION ALL

SELECT n+1, CHAR(n+1) FROM asciichar WHERE n+1<=90),

groups AS

(SELECTNTILE(@groups) OVER (ORDER BY n) AS groupid, * FROM asciichar ),

ntileMatrix AS

(SELECTgroupid,

MIN(c) OVER (PARTITION BY groupid)+'-'+

MAX(c) OVER (PARTITION BY groupid) AS [group],

n AS [ASCII],

c AS [CHAR]

FROM groups ),

people AS

(SELECTLEFT(LastName,1) AS c1, LastName+', '+FirstName AS Name

FROM AdventureWorks2008R2.person.person )

SELECT nm.[group], p.Name

FROM ntileMatrix nm

CROSS JOIN people p

WHERE [CHAR]=c1

The code above will produce this (truncated for reading):

group Name

----- -------------------------------------

A-C Abbas, Syed

A-C Abel, Catherine

A-C Abercrombie, Kim

....

Y-Z Zukowski, Jake

Y-Z Zwilling, Michael

You can change it to this:

DECLARE @groups int = 9;

;WITH

asciichar(n,c) AS

(SELECT (65), CHAR(65)

UNION ALL

SELECT n+1, CHAR(n+1) FROM asciichar WHERE n+1<=90),

groups AS

(SELECTNTILE(@groups) OVER (ORDER BY n) AS groupid, * FROM asciichar ),

ntileMatrix AS

(SELECTgroupid,

MIN(c) OVER (PARTITION BY groupid)+'-'+

MAX(c) OVER (PARTITION BY groupid) AS [group],

n AS [ASCII],

c AS [CHAR]

FROM groups ),

people AS

(SELECTLEFT(LastName,1) AS c1, LastName+', '+FirstName AS Name

FROM AdventureWorks2008R2.person.person ),

Totals AS

(

SELECT nm.[group], p.Name

FROM ntileMatrix nm

CROSS JOIN people p

WHERE [CHAR]=c1 )

SELECT [group], COUNT([group]) AS groupCount

FROM Totals

GROUP BY [group]

To get this:

group groupCount

----- -----------

A-C 3502

D-F 1111

G-I 2572

J-L 2347

M-O 2061

P-R 3195

S-U 2800

V-X 1552

Y-Z 832

Again, some assembly required for what you are doing....

What's cool is you can change the parameter or variable to dynamically change your groups like so:

DECLARE @groups int = 3;

and get this:

group groupCount

----- -----------

A-I 7185

J-R 7603

S-Z 5184

Edit: Typo's

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Dwain Camps SSC Guru Points: 86978 More actions · Answer 4

Alan - That's a very intriguing use of NTILE!

I would recommend, however that you change the way you construct your asciichar table:

;WITH asciichar(n, c) AS (

SELECT n=64+number, CHAR(64+number)

FROM [master].dbo.spt_values Tally

WHERE [Type] = 'P' AND Number BETWEEN 1 AND 26)

SELECT *

FROM asciichar

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 5

Alan.B (1/9/2013)

AndrewSQLDBA (1/9/2013)
Its worth noting that the following is sargable.
LIKE '%abc%'

Did you mean "NOT" SARGable because it sure doesn't look SARGable from here. 😉 And, I agree... nice use of NTILE.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Dwain Camps SSC Guru Points: 86978 More actions · Answer 6

Jeff Moden (1/9/2013)

Alan.B (1/9/2013)
AndrewSQLDBA (1/9/2013)
Its worth noting that the following is sargable.
LIKE '%abc%'
Did you mean "NOT" SARGable because it sure doesn't look SARGable from here. 😉

Is that because of the % at the beginning of the string?

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 7

Yes.

It would also give the wrong answer, IMHO.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Alan Burstein SSC Guru Points: 61141 More actions · Answer 8

dwain.c (1/9/2013)
Alan - That's a very intriguing use of NTILE!
I would recommend, however that you change the way you construct your asciichar table:
;WITH asciichar(n, c) AS (
SELECT n=64+number, CHAR(64+number)
FROM [master].dbo.spt_values Tally
WHERE [Type] = 'P' AND Number BETWEEN 1 AND 26)
SELECT *
FROM asciichar

Thank you, and thank you. I'm still just learning Windows functions (have been studying them for ~6 months now give or take.) For me it's been easy to find a uses for ROW_NUMBER , RANK and DENSE_RANK but this was a rare case where I found some use for NTILE; this was originally a dynamic SQL query.

I agree that I should have used a tally table. I've been writing CTE's for counting for awhile and can do so while sleeping. I still fumble around with the tally table and, in this case was in a hury to post my code. I just updated my query to include the tally table as you showed above: 303 reads is now 146 reads. 🙂

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Alan Burstein SSC Guru Points: 61141 More actions · Answer 9

Jeff Moden (1/9/2013)

Alan.B (1/9/2013)
AndrewSQLDBA (1/9/2013)
Its worth noting that the following is sargable.
LIKE '%abc%'
Did you mean "NOT" SARGable because it sure doesn't look SARGable from here. 😉 And, I agree... nice use of NTILE.

Thank you very much Jeff. One year ago I had never heard of windows functions, set-based SQL, or a tally table. I think I'm starting to get it and hope to someday be able to write queries like you, Dwain, Lynn and others on SSC.

Yes - I meant NOT SARGable.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Sean Lange SSC Guru Points: 286590 More actions · Answer 10

Alan.B (1/10/2013)
I agree that I should have used a tally table. I've been writing CTE's for counting for awhile and can do so while sleeping. I still fumble around with the tally table and, in this case was in a hury to post my code. I just updated my query to include the tally table as you showed above: 303 reads is now 146 reads. 🙂

You might want to take a look at this article.

http://www.sqlservercentral.com/articles/T-SQL/74118/[/url]

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Alan Burstein SSC Guru Points: 61141 More actions · Answer 11

Sean Lange (1/10/2013)
Alan.B (1/10/2013)
I agree that I should have used a tally table. I've been writing CTE's for counting for awhile and can do so while sleeping. I still fumble around with the tally table and, in this case was in a hury to post my code. I just updated my query to include the tally table as you showed above: 303 reads is now 146 reads. 🙂
You might want to take a look at this article.
http://www.sqlservercentral.com/articles/T-SQL/74118/[/url]

Thank you Sean. Yes, I have a few times - it's in my favorites folder and I have been putting what I have learned to use. You won't see any more counting CTE's from me. This morning before work for example, after a lot of effort, I finally figured out how to get the Levenshtein Edit Distance between 2 strings without a loop (just for fun because it's something I've never seen done).

I came up with this:

-- strings to compare

DECLARE@s1 varchar(8000)='diner',

@s2 varchar(8000)='dinerr';

DECLARE @Ld int=ABS(LEN(@s1)-LEN(@s2));

IF ((@s1=@s2) OR ((ISNULL(LEN(@s1)*LEN(@s2),0)=0))) BEGIN GOTO LD END;

DECLARE@minlen int=CASE WHEN LEN(@s1)>LEN(@s2) THEN LEN(@s2) ELSE LEN(@s1) END;

;WITH

nums(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM [master].dbo.spt_values Tally),

matrix AS (SELECT SUBSTRING(@s1,n,1) s1, SUBSTRING(@s2,n,1) s2 FROM nums WHERE n<=@minlen)

SELECT @Ld+=COUNT(*) FROM matrix WHERE s1<>s2;

LD:

SELECT @Ld AS LD;

I actually posted this as a script and hope it gets approved. The Tally table is new to me, I did not get it at first but now I totally understand what all the hype is about. Long live the Tally table!

Edit: typo.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Greg Snidow SSCoach Points: 16108 More actions · Answer 12

Alan.B
(just for fun because it's something I've never seen done

Here, here brother. That's what its all about. Well, a lot of it anyway.

Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 13

Alan.B (1/10/2013)

This morning before work for example, after a lot of effort, I finally figured out how to get the Levenshtein Edit Distance between 2 strings without a loop (just for fun because it's something I've never seen done).

Heh... if that's how you warm up for the day, then you've got me beat by a mile.

You can just bet I'm going to do a deep dive on your rendition of this famous problem... especially since you did it with a Tally Table. Thanks for doing it and thanks for posting it.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Dwain Camps SSC Guru Points: 86978 More actions · Answer 14

Jeff Moden (1/10/2013)

Alan.B (1/10/2013)
This morning before work for example, after a lot of effort, I finally figured out how to get the Levenshtein Edit Distance between 2 strings without a loop (just for fun because it's something I've never seen done).
Heh... if that's how you warm up for the day, then you've got me beat by a mile.
You can just bet I'm going to do a deep dive on your rendition of this famous problem... especially since you did it with a Tally Table. Thanks for doing it and thanks for posting it.

I for one have never heard of this famous "Levenshtein Edit Distance" problem, but you can rest assured that now I'm going to have to take a look at it as well! 😀

http://en.wikipedia.org/wiki/Levenshtein_distance

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St