Problem with TOP in query

Question

Problem with TOP in query

Rob Reid-246754

SSCarpal Tunnel

Points: 4183
More actions
August 31, 2007 at 6:32 am

#180618

I have a jobboard site that on the home page has an include file that shows featured jobs. The site currently has a constant set that is used in the SQL that limits the no of jobs to show. The constant is currently set to 8 jobs. However we just had a problem where the no of actual featured jobs went below this figure to 7. This was causing DB timeouts when the application was trying to run the query. The JOB table that the data comes from currently has 246,000 records in it and is indexed correctly on a number of fields used in the SQL.
Someone in the office told me it was because of the TOP command and that beacause it couldn't find an 8th match it had to keep looking at all the records. However when I ran tests in Query Analsyser the recordset returns fine very quickly. I have tried a combination of the following:
DBCC
DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SELECT

top 100 --8, 7 it doesnt matter what no
JobPK,JobTitle,Location,
JobDescription
,dbo.udf_PAY_FORMAT('£',SalaryMin, SalaryMax, SalaryFrame, SalaryDesc) as Salary,
j
.HideContactDetails, c.CompanyNameFROM
JOBS
as j with (nolock)
JOIN

CLIENTS
as c with (nolock) ON j.ClientFK = c.ClientPK
WHERE

c
.SiteFK = @SITEID AND
j
.Live=1 AND
j
.featured = 1
ORDER BY
LiveDate DESC
And even with a TOP 100 it returns the 7 records almost instantly.
I have also tried using
SET ROWCOUNT 8
SELECT Job......
instead of the TOP and it works fine in QA but whenever the no of jobs falls beneath the amount to show the website is timing out.
I was going to rewrite the query to do a count first and then use that figure in the TOP if the result was < SHOW AMOUNT but from my tests in QA I shouldn't have to do any of that at all as it should work fine if there is less or more. I don't understand why the reported SQL is running fine in QA but not from the site. I have checked the Activity Console and there are no blocked/blocking processes and the rest of the site runs fine.
Does anyone have any ideas how I can resolve this or find out the cause of the timeout.
Thanks in advance.

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply

MarkusB SSC-Dedicated Points: 37370 More actions · Answer 1

Rob,

the query is fine and what your colleague told you is nonsense. I'm pretty sure that if you use Profiler and run a trace you will see the query completes within a couple of milliseconds.

I'm no programmer, just a simple DBA but I'm very sure that this is a application problem. Like the website won't display until it received 8 records. Check your asp code.

Markus

[font="Verdana"]Markus Bohse[/font]

Rob Reid-246754 SSCarpal Tunnel Points: 4183 More actions · Answer 2

Hello. The error is actually being returned on the line that executes the SQL

eg set objRS = objCon.Execute(strSQL)

And the error reported is a DB Timeout so it seems that it must be the SQL not the application.

The problem has just re-occurred. The no of featured jobs fell below 8 and now the home page of the site is hanging. As a quick fix we make the no of featured jobs 8 again and it loads perfectly. I know it seems odd but the error does seem to be SQL related. I need a permanent fix for this.

Thanks.

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 3

SELECT TOP 8

JobPK,

JobTitle,

Location,

JobDescription,

dbo.udf_PAY_FORMAT('£', SalaryMin, SalaryMax, SalaryFrame, SalaryDesc) as Salary,

j.HideContactDetails,

c.CompanyName

FROM JOBS AS j WITH (NOLOCK)

INNER JOIN CLIENTS AS c WITH (NOLOCK) ON c.ClientPK = j.ClientFK

WHERE c.SiteFK = @SITEID

AND j.Live = 1

AND j.Featured = 1

ORDER BY LiveDate DESC

This is your query as of now.

1) Do you have an index for LiveDate column?

2) Have you tried running the query without the UDF?

3) Try changing the order of table to JOIN and a OPTION (FORCE ORDER, MAXDOP 1) to the end of query

N 56°04'39.16"
E 12°55'05.25"

Rob Reid-246754 SSCarpal Tunnel Points: 4183 More actions · Answer 4

Hello thanks for your response but could you please explain point 3

"Try changing the order of table to JOIN and a OPTION (FORCE ORDER, MAXDOP 1) to the end of query"

as I can't work out what you mean. Are you saying to change the order of the join so that I am selecting from the CLIENT table first and then join to JOBS?

Thanks

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 5

1) If you say SELECT TOP 10 from a table with a kajillion records and only 9 meet your where clause, you bet that sql server will exhaustively complete it's given query plan to achieve 10 records in the output. And if statistics used by the query optimizer suggested that some index seek/bookmark lookup type query plan was most efficient to get to 10 records but THERE REALLY AREN'T 10, then it will index seek/bookmark lookup the ENTIRE TABLE SET in the query. That will obviously crush performance. To my knowledge there is nothing you can do about this. Even reducing the TOP to just ONE will still cause the same problem if it turns out there are ZERO rows that match your where clause.

2) I would check for blocking (sp_who2 active, look at the BlkBy column) when the query is hanging too. Also, break out profiler and get some information on query costs/times and blocking/locking there too.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Anders Pedersen SSChampion Points: 11410 More actions · Answer 6

Not sure I am buying what theSQLGuru is saying, but assuming that it is....

You could select a count of how many would fullfill your criteria into a variable first, then use 8 or the variable (whichever is smaller) as you top @var instead of hard coding it to 8.

That being said, I have never seen a problem with SQL performance just because there was not as many records in the result set as your X in the top.

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 7

Now - are you sure that this is processing SQL-side, and not web server side or client-side? You are encapsulating this in a stored proc, right?

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 8

Anders, your suggestion to count matching rows first IS useful under a few specific circumstances. First and foremost is that the count MUST be able to use indexing to cover the count. Actually can even be best to simply insert top N 1 as field1 into a 1 tinyint field table variable, where N is the number you would PREFER to use. This avoids having to scan all indexes to get a full count. Just use @@ROWCOUNT after that to see if you got N or fewer than N rows. Next is that if you are using sql 2005 you can use a variable for TOP @N clause. If sql 2000 then you are stuck with dynamic sql, although I would still use a hardcoded statement if your count/insert revealed at least as many as your preferred number.

Oh, and to prove that my original statement was correct, try this in the AdventureWorks database on SQL 2005:

set statistics io on

go

select top 10 *

from production.transactionhistory

where quantity > 25

select top 10 *

from production.transactionhistory

where quantity > 25000

On my laptop, first query takes 20 I/Os, second takes 792 I/Os (which is a full table scan, btw). Note that there are only 7 rows with quantity > 25000.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 9

So it's probably fair to say that it scans the table until it found enough matches? meaning - if you had 10 matches, but the 10 you're looking for happened to be the very last records - you'd see no difference (a.k.a. reads=792). I'm not sure how you could optimize for that?

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Anders Pedersen SSChampion Points: 11410 More actions · Answer 10

I stand corrected. Now to see if I can replicate that in our production environment although I don't think we have less records than the ones we want in the top value.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 11

The example I gave was not necessarily indicative of your situation. My example was a single table, filtered on a non-indexed column. IIRC, your OP had a query with a join and a where clause filtering both tables. That second situation can often be helped tremendously if the join column(s) as well as all where clause columns are indexed. That will allow index scans (or even seeks, depending on data distribution) to derive sufficient matching rows very quickly. Without fully appropriate indexing (possibly using included columns since 2005), the optimizer may well pick a nested loop query and wind up doing a kajillion page reads doing bookmark lookups and never hit that 8th row! Believe it or not you are often better off in that situation with NO indexes, since the optimizer would almost certainly use a hash or merge join and finish much more quickly.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Bob Fazio SSChampion Points: 10794 More actions · Answer 12

Are both Development and Production 2005? I have seen in large tables a big difference between the following and how they perform.

It seems like

set ROWCOUNT=8 is best solution for 2000

but select TOP(8) is best for 2005

Not sure if it was intentional or not, but "set ROWCOUNT" is depreciated in 2005 and does not work well. Also I would check your explain plans to verify that they are similar in both dev and PRD.

I agree that the issue is you are probably waiting on a table scan or blocking somewhere while searching for more records. Obviously when it finds 8 it returns, so what it appears to be doing is getting stuck at 7, but in reality it just hasn't found the 8th. It's not waiting for more.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 13

Bob, where did you read that SET ROWCOUNT was deprecated in 2005? I only found this statement in BOL: Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in the next release of SQL Server.

ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/c6966fb7-6421-47ef-98f3-82351f2f6bdc.htm

There was no mention of a change from 2000 to 2005 in how SET ROWCOUNT performs.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Anders Pedersen SSChampion Points: 11410 More actions · Answer 14

hmmm not sure the example from AdventureWorks is correct. If you do top 6 on the quantity > 25000 you get 616 logical reads. By your theory there should only be about 19 as long as the X in top is less than the total number of records returned by the select statement.

With a non-clustered index on the quantity column the logical reads goes to 23 for the quantity > 25000 query.