How to improve a query with a like statement

Question

Post reply

How to improve a query with a like statement

ricardo_chicas

SSCertifiable

Points: 5024
More actions
April 28, 2010 at 4:47 pm

#232130

Hi folks,
I have this query that isn´t running nicely
select
a.column2,
a.column4,
b.column5,
a.column6
from
dbo.table1 a
left join dbo.table2 c ON a.key_column = c.key_column
left join dbo.table3 d on a.key_column = d.key_column
left join dbo.table4 b ON a.key_column = b.key_column
where
a.key_column like 'abcdef00001%'
and a.column7 <> '0'
The problem is that the like statemen is ruinning the execution time, it takes about two hours since every table has almost 15 millions rows (Index is by key_columns)
Is there any way to fix it so the query runs faster?
I already reorganize indexes, update statistics, when I check the Estimated Execution Plan I can see the it makes a Index scan...
Please help
Thanks in advance

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply

LutzM SSC Guru Points: 107049 More actions · Answer 1

First question:

Why do you have table2 and table3 in your query? They are neither part of your SELECT nor your WHERE condition....

To help you any further please post more detailed information as described in this article[/url].

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

ricardo_chicas SSCertifiable Points: 5024 More actions · Answer 2

I simplify the query a lot, I need the others tables, but with just the one with the a.column_key like ....., is enough to have a really poor performance

LutzM SSC Guru Points: 107049 More actions · Answer 3

ricardo_chicas (4/28/2010)
I simplify the query a lot, I need the others tables, but with just the one with the a.column_key like ....., is enough to have a really poor performance

Most probably an issue of missing / non-covering indexes. The LIKE statement itself shouldn't be that bad since the wildcard is at the end... But that's only guessing. We need te data as described in the article I pointed you at: Table def, index def, actual (not estimated) execution plan to have at least something to start with.

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

ricardo_chicas SSCertifiable Points: 5024 More actions · Answer 4

even simplier:

select

a.column2,

a.column4,

a.column6

from

dbo.table1 a

where

a.column like 'abcdef00001%'

and a.column7 <> '0'

I attached the plan

Thanks

Grant Fritchey SSC Guru Points: 398886 More actions · Answer 5

Presumably, with an index on the column and the LIKE operator only have a wild card at the end of the string, you have a sargeable (search argument able) query. So, the question then turns to the index itself. How selective is it? For example with the execution plan you posted (by the way, posting it as .sqlplan would make things easier) it shows an estimated number of rows as 12,493,700. Out of how many? If it's a substantial portion of the table, the query has no choice but to scan the whole set. There's no "fast" way to move 12 million rows. It's completely dependent on the speed of your disk and the speed of your network.

Why would you have a query return 12 million rows?

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Stefan_G SSCertifiable Points: 6609 More actions · Answer 6

ricardo_chicas (4/28/2010)
even simplier:
select
a.column2,
a.column4,
a.column6
from
dbo.table1 a
where
a.column like 'abcdef00001%'
and a.column7 <> '0'

How many rows are in the table ?

How many rows does the query above really return ?

What do your indexes look like ?

/SG

ricardo_chicas SSCertifiable Points: 5024 More actions · Answer 7

How many rows are in the table ?

A\ 70 million rows

How many rows does the query above really return ?

A\query returns about 70 thousand

What do your indexes look like ?

CONSTRAINT [column_key:PrimaryKey] PRIMARY KEY CLUSTERED

(

[FB_ID] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [Tables1]

) ON [Tables1]

Thanks

ricardo_chicas SSCertifiable Points: 5024 More actions · Answer 8

Presumably, with an index on the column and the LIKE operator only have a wild card at the end of the string, you have a sargeable (search argument able) query. So, the question then turns to the index itself. How selective is it? For example with the execution plan you posted (by the way, posting it as .sqlplan would make things easier) it shows an estimated number of rows as 12,493,700. Out of how many? If it's a substantial portion of the table, the query has no choice but to scan the whole set. There's no "fast" way to move 12 million rows. It's completely dependent on the speed of your disk and the speed of your network.

Why would you have a query return 12 million rows?

As I say, I simplified the query ( thas the reason for the 12 million..., still with all the conditions the result is of about 70 thousand rows, and the index scan remains ).

Grant Fritchey SSC Guru Points: 398886 More actions · Answer 9

ricardo_chicas (4/29/2010)
Presumably, with an index on the column and the LIKE operator only have a wild card at the end of the string, you have a sargeable (search argument able) query. So, the question then turns to the index itself. How selective is it? For example with the execution plan you posted (by the way, posting it as .sqlplan would make things easier) it shows an estimated number of rows as 12,493,700. Out of how many? If it's a substantial portion of the table, the query has no choice but to scan the whole set. There's no "fast" way to move 12 million rows. It's completely dependent on the speed of your disk and the speed of your network.
Why would you have a query return 12 million rows?
As I say, I simplified the query ( thas the reason for the 12 million..., still with all the conditions the result is of about 70 thousand rows, and the index scan remains ).

70 thousand rows still sounds like it's likely to choose a scan. It really depends on how selective the index is. Assuming the data is perfectly distributed, you're at .001 selectivity. That ought to result in a seek, but not seeing the statistics available, it's hard to know.

Also, the OR caused by the <> can lead to scans too.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

ricardo_chicas SSCertifiable Points: 5024 More actions · Answer 10

Just for you to know ( maybe it would help) that column is the primary key, clustered and is a varchar(23), I and checking the index fragmentation (0.928468598)

The statistics of the index are:

Name: column_key:PrimaryKey

Updated: Apr 23 2010 3:04AM

Rows: 68572280

Rows Sampled:311957

Steps: 179

Density: 1

Averagekey: 23

String Index: Yes

All Density: 1.458315E-08

Average Length 23

Range_Rows EQ_ROWS Distinct_Range_Rows AVG_Range_Rows

0 1 0 1

268765.3 1 268765 1

450214.9 1 450178 1.000082

224997.5 1 224997 1

all the rest are the similar

Thanks

Grant Fritchey SSC Guru Points: 398886 More actions · Answer 11

Well, based on the information provided, I'd expect to see a seek, so I have to assume that some piece of information is missing. Without seeing any of the actual behaviors it's just too hard to know. The example query is off by a factor.

When you rebuilt the stats on the tables involved, did you use sp_updatestats or did you do a full scan on the statistics?

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Bhuvnesh SSC Guru Points: 59351 More actions · Answer 12

You can use NOLOCK hint ( Beware of dirty reads) ,if possible.

-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

ricardo_chicas SSCertifiable Points: 5024 More actions · Answer 13

Yes I use sp_updatestats, The query is actually the same I am running ( just with different names)--something weird us that if I add one more number to the like contition it turns into a seek.....

Btw, about the <> '0' condition, if I convert that into a abs(column) > 0 the performance improves greatly, is ok to do that, right?

Bhuvnesh SSC Guru Points: 59351 More actions · Answer 14

ricardo_chicas (4/29/2010)
I convert that into a abs(column) = 0 the performance improves greatly, is ok to do that, right?

i dont think so , it wil force sql optimizer to use seek operation on index. using function in where caluse column is not good

-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)