Query Re-Write to Minimize Logical Reads

Question

Post reply

Query Re-Write to Minimize Logical Reads

SwedishOrr

SSCrazy

Points: 2955
More actions
July 19, 2010 at 1:34 pm

#224259

Hi Folks,
I'm trying to minimize the total number of logical reads against a specific table. I've tested the following three approaches and each results in the same number of logical reads against Table A. Are there some other T-SQL coding techniques I can try to reduce the number of logical reads?
Here are the queries:
-- Query1, Original query in App
SELECT COUNT(*)
FROM [Table A] where col2 in (SELECT DISTINCT col2 FROM [Table B] b, [Table A] a where a.col1=b.col1 and col3 = 196)
-- Query2, Re-write #1
CREATE TABLE #tempWrk(
col2 nvarchar(128) not null
);
CREATE INDEX ix_temp on #tempWrk (col2)
INSERT #tempWrk (col2)
SELECT DISTINCT col2 FROM [Table B] a, [Table A] b where a.col1=b.col1 and a.col3 = 196
SELECT COUNT(*)
FROM [Table A] t1
inner join
#tempWrk t2
on t1.col2 = t2.col2
DROP TABLE #tempWrk
-- Query3, Re-Write #2
;WITH X AS (
SELECT DISTINCT col2 FROM [Table B] a, [Table A] b where a.col1=b.col1 and a.col3 = 196
)
SELECT COUNT(*)
FROM
[Table A] t1
inner join
X on t1.col2 = X.col2
Before executing these queries, I had set statisics io on. The following is the output:
Query1:
Table '[Table B]'. Scan count 5, logical reads 12027, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '[Table A]'. Scan count 10, logical reads 33090, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Query2:
Table '[Table B]'. Scan count 5, logical reads 12027, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '[Table A]'. Scan count 5, logical reads 16545, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#tempWrk____________________________________________________________________________________________________________00000000000F'. Scan count 0, logical reads 1625295, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(416164 row(s) affected)
(1 row(s) affected)
Table '#tempWrk____________________________________________________________________________________________________________00000000000F'. Scan count 5, logical reads 1595, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '[Table A]'. Scan count 5, logical reads 16545, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Query3:
Table '[Table B]'. Scan count 5, logical reads 12027, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '[Table A]'. Scan count 10, logical reads 33090, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
The table definitions are as follows:
CREATE TABLE [dbo].[Table A](
[col1] [int] IDENTITY(1,1) NOT NULL,
[col2] [nvarchar](50) NULL,
CONSTRAINT [PK_DocInfo] PRIMARY KEY CLUSTERED ([docid] ASC)
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [idx_col2] ON [dbo].[Table A] ([col2] ASC) INCLUDE ([col1])
GO
CREATE TABLE [dbo].[Table B](
[Col3] [int] NOT NULL,
[Col4] [int] NULL,
[Col1] [int] NULL
)
CREATE NONCLUSTERED INDEX [IX_tableB] ON [dbo].[Table B] ([Col3] ASC)
From these, I conclude that each of these three methods will touch [Table A] the exact same number of times. Are there some other techniques I can try that may touch less pages? Thanks in advance!

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

LutzM SSC Guru Points: 107049 More actions · Answer 1

Wouldn't it be easier to use

SELECT COUNT(DISTINCT col2)

FROM [Table A]

WHERE col3 = 196

Or am I missing something obvious?

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 2

I'm not sure, Lutz. I'm reading his code more thoroughly right now.

He only wants counts of values that come from Table B.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

SwedishOrr SSCrazy Points: 2955 More actions · Answer 3

Thanks for the replies guys. Sorry if the use of ColX has confused us. Table A does not contain Col3 and Table B does not contain Col2. Hope that clears things up a little.

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 4

Without data to test, this is a shot in the dark, but....

SELECT COUNT(DISTINCT docid) -- docid is primary key of Table A

FROM [Table A] a

JOIN [Table B] b on a.col1=b.col1

WHERE col3 = 196

The code counts distinct doc_ids to be sure there is no twice-counting of rows from Table A due to the join. It may be LESS efficient, but we won't know until you test it.

Important question: Do you have indexes built on col1 for both [Table A] and [Table B]?

(Use of col1 as an included column doesn't count.)

I'd be interested in seeing the I/O stats, Time stats, and execution plans. Also it would be nice to know the volumes we are dealing with in both your tables.

Interesting question.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Ben Teraberry SSChampion Points: 11845 More actions · Answer 5

The Dixie Flatline (7/19/2010)
Without data to test, this is a shot in the dark, but....
SELECT COUNT(DISTINCT docid) -- docid is primary key of Table A
FROM [Table A] a
JOIN [Table B] b on a.col1=b.col1
WHERE col3 = 196
The code counts distinct doc_ids to be sure there is no twice-counting of rows from Table A due to the join. It may be LESS efficient, but we won't know until you test it.
Important question: Do you have indexes built on col1 for both [Table A] and [Table B]?
(Use of col1 as an included column doesn't count.)
I'd be interested in seeing the I/O stats, Time stats, and execution plans. Also it would be nice to know the volumes we are dealing with in both your tables.
Interesting question.

If his initial queries are correct, this won't work. He wants all the records from A where the Col2 value is found in the query you have above. So first you're doing a join from A to B to get the possible Col2 values and then you're joining back to A to get all records associated with the list of Col2 values.

I was going to respond with some suggestions, but without reasonable sample data being provided I didn't feel like guessing.

└> bt

Forum Etiquette: How to post data/code on a forum to get the best help[/url]

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 6

My reasoning was that he first produces a distinct setof values, then gets a simple count of rows from Table A which have values in that set. However the set of values is dependent on a join between A and B on col1. That join could potentially produce a result set in which rows from column A were produced twice. Doing a count distinct on the primary key would eliminate the duplicates from consideration.

But yes, I freely admitted I was shooting blind 😉

So first you're doing a join from A to B to get the possible Col2 values and then you're joining back to A to get all records associated with the list of Col2 values.

By "you", I assume that you're referring to SwedishOrr. The whole point of the code I wrote was to avoid hitting Table A twice.

It might also be helpful to all these queries to have an index on col3, because in the absence of such an index, rows from Table A have to be scanned to determine the set of rows with col3 = 196. Again, it would be nice to know the volumes involved, see the execution plans, and be sure of all the indexes.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Ben Teraberry SSChampion Points: 11845 More actions · Answer 7

The Dixie Flatline (7/19/2010)
By "you", I assume that you're referring to SwedishOrr. The whole point of the code I wrote was to avoid hitting Table A twice.

Yes indeed ... the "you" is the generic individual trying to solve the problem as stated by the OP. I noticed that he was going out of his way to get the Col2 values and then join back to the table so I assumed this was necessary.

└> bt

Forum Etiquette: How to post data/code on a forum to get the best help[/url]

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 8

I'm not sure it is necessary because the col2 values come from Table A in the first place.

However, upon reflection, it is quite possible that other rows in Table A have those col2 values but do NOT have col1 values that would successfully join to Table B. So my solution may well return erroneous results.

Thanks for setting me straight. 🙂

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

SwedishOrr SSCrazy Points: 2955 More actions · Answer 9

Thanks again for the replies. I've attached sample data now. Also, I made a typo in the original post. The PK of Table A is Col1.

I really appreciate the ideas folks!

Thanks.