How to improve a query with a like statement

  • Hi folks,

    I have this query that isn´t running nicely

    select

    a.column2,

    a.column4,

    b.column5,

    a.column6

    from

    dbo.table1 a

    left join dbo.table2 c ON a.key_column = c.key_column

    left join dbo.table3 d on a.key_column = d.key_column

    left join dbo.table4 b ON a.key_column = b.key_column

    where

    a.key_column like 'abcdef00001%'

    and a.column7 <> '0'

    The problem is that the like statemen is ruinning the execution time, it takes about two hours since every table has almost 15 millions rows (Index is by key_columns)

    Is there any way to fix it so the query runs faster?

    I already reorganize indexes, update statistics, when I check the Estimated Execution Plan I can see the it makes a Index scan...

    Please help

    Thanks in advance

  • First question:

    Why do you have table2 and table3 in your query? They are neither part of your SELECT nor your WHERE condition....

    To help you any further please post more detailed information as described in this article[/url].



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • I simplify the query a lot, I need the others tables, but with just the one with the a.column_key like ....., is enough to have a really poor performance

  • ricardo_chicas (4/28/2010)


    I simplify the query a lot, I need the others tables, but with just the one with the a.column_key like ....., is enough to have a really poor performance

    Most probably an issue of missing / non-covering indexes. The LIKE statement itself shouldn't be that bad since the wildcard is at the end... But that's only guessing. We need te data as described in the article I pointed you at: Table def, index def, actual (not estimated) execution plan to have at least something to start with.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • even simplier:

    select

    a.column2,

    a.column4,

    a.column6

    from

    dbo.table1 a

    where

    a.column like 'abcdef00001%'

    and a.column7 <> '0'

    I attached the plan

    Thanks

  • Presumably, with an index on the column and the LIKE operator only have a wild card at the end of the string, you have a sargeable (search argument able) query. So, the question then turns to the index itself. How selective is it? For example with the execution plan you posted (by the way, posting it as .sqlplan would make things easier) it shows an estimated number of rows as 12,493,700. Out of how many? If it's a substantial portion of the table, the query has no choice but to scan the whole set. There's no "fast" way to move 12 million rows. It's completely dependent on the speed of your disk and the speed of your network.

    Why would you have a query return 12 million rows?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • ricardo_chicas (4/28/2010)


    even simplier:

    select

    a.column2,

    a.column4,

    a.column6

    from

    dbo.table1 a

    where

    a.column like 'abcdef00001%'

    and a.column7 <> '0'

    How many rows are in the table ?

    How many rows does the query above really return ?

    What do your indexes look like ?

    /SG

  • How many rows are in the table ?

    A\ 70 million rows

    How many rows does the query above really return ?

    A\query returns about 70 thousand

    What do your indexes look like ?

    CONSTRAINT [column_key:PrimaryKey] PRIMARY KEY CLUSTERED

    (

    [FB_ID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [Tables1]

    ) ON [Tables1]

    Thanks

  • Presumably, with an index on the column and the LIKE operator only have a wild card at the end of the string, you have a sargeable (search argument able) query. So, the question then turns to the index itself. How selective is it? For example with the execution plan you posted (by the way, posting it as .sqlplan would make things easier) it shows an estimated number of rows as 12,493,700. Out of how many? If it's a substantial portion of the table, the query has no choice but to scan the whole set. There's no "fast" way to move 12 million rows. It's completely dependent on the speed of your disk and the speed of your network.

    Why would you have a query return 12 million rows?

    As I say, I simplified the query ( thas the reason for the 12 million..., still with all the conditions the result is of about 70 thousand rows, and the index scan remains ).

  • ricardo_chicas (4/29/2010)


    Presumably, with an index on the column and the LIKE operator only have a wild card at the end of the string, you have a sargeable (search argument able) query. So, the question then turns to the index itself. How selective is it? For example with the execution plan you posted (by the way, posting it as .sqlplan would make things easier) it shows an estimated number of rows as 12,493,700. Out of how many? If it's a substantial portion of the table, the query has no choice but to scan the whole set. There's no "fast" way to move 12 million rows. It's completely dependent on the speed of your disk and the speed of your network.

    Why would you have a query return 12 million rows?

    As I say, I simplified the query ( thas the reason for the 12 million..., still with all the conditions the result is of about 70 thousand rows, and the index scan remains ).

    70 thousand rows still sounds like it's likely to choose a scan. It really depends on how selective the index is. Assuming the data is perfectly distributed, you're at .001 selectivity. That ought to result in a seek, but not seeing the statistics available, it's hard to know.

    Also, the OR caused by the <> can lead to scans too.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Just for you to know ( maybe it would help) that column is the primary key, clustered and is a varchar(23), I and checking the index fragmentation (0.928468598)

    The statistics of the index are:

    Name: column_key:PrimaryKey

    Updated: Apr 23 2010 3:04AM

    Rows: 68572280

    Rows Sampled:311957

    Steps: 179

    Density: 1

    Averagekey: 23

    String Index: Yes

    All Density: 1.458315E-08

    Average Length 23

    Range_Rows EQ_ROWS Distinct_Range_Rows AVG_Range_Rows

    0 1 0 1

    268765.3 1 268765 1

    450214.9 1 450178 1.000082

    224997.5 1 224997 1

    all the rest are the similar

    Thanks

  • Well, based on the information provided, I'd expect to see a seek, so I have to assume that some piece of information is missing. Without seeing any of the actual behaviors it's just too hard to know. The example query is off by a factor.

    When you rebuilt the stats on the tables involved, did you use sp_updatestats or did you do a full scan on the statistics?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • You can use NOLOCK hint ( Beware of dirty reads) ,if possible.

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Yes I use sp_updatestats, The query is actually the same I am running ( just with different names)--something weird us that if I add one more number to the like contition it turns into a seek.....

    Btw, about the <> '0' condition, if I convert that into a abs(column) > 0 the performance improves greatly, is ok to do that, right?

  • ricardo_chicas (4/29/2010)


    I convert that into a abs(column) = 0 the performance improves greatly, is ok to do that, right?

    i dont think so , it wil force sql optimizer to use seek operation on index. using function in where caluse column is not good

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply