Two nearly identical selects, very different performance. Why?

Question

Post reply

Two nearly identical selects, very different performance. Why?

George-249934

SSC Journeyman

Points: 92
More actions
September 25, 2006 at 11:47 am

#120822

Here is a fun one. 2 selects like so...
select t1.f1, t1.f2, t1.f3 from t1
inner join t2 on t1.f1 = t2.f2
left outer join t3 on t1.f1 = t3.f2
where t1.f2 = 'blah' and t3.f1 is null
This query Explains almost ideally, 3 index seeks , bookmark lookup, 2 nested loops, filter, done. Each tables contains ~20 million records.
Now, replace the list of fields in the Select clause with * (splat) and it still Explains well (identically except for an extra bookmark lookup before the filter to get all the field data (*), as one would expect). 3 index seeks , bookmark lookup, 2 nested loops, extra bookmark lookup, filter, done.
When we execute them, the * version works and returns the expected result in milliseconds , but the one with explicit fields completes in 5minutes. That is longer than 3 table scans! I dont get it. Shouldn't the one with the explicit field list run better?
Note: indexes and stats are up to date and clean.

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 1

Have you dropped the execution plans before each run?

George-249934 SSC Journeyman Points: 92 More actions · Answer 2

Not explicitly. I tested them in separate sessions. May I ask for syntax to do that?

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 3

Ninja's_RGR'us

SSC Guru

Points: 294069

September 25, 2006 at 11:59 am

#662335

DBCC FREEPROCCACHE

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 4

You might also want to check this out : DBCC DROPCLEANBUFFERS

Those are not to be used on prod environement unless you realllllllllllllly have to .

George-249934 SSC Journeyman Points: 92 More actions · Answer 5

Okay. I did both options preceding each query, with the same basic results. The version with explicit fields took 10 min, while the other too seconds to 2 minutes.

These queries, incidentally, both return zero results. We are going to force data to get some rows back and see if that makes a diff.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 6

Lol, those complaints will subside when everything gets put back into ram and recompiled. But that can take a while in a test environement .

Let us know how it goes after you get data back.

Also can you tell us how the poucentages differ from one execution plan to the other.... maybe the problem is not the extra step, but a step that takes 3-4 times as long as it should.

George-249934 SSC Journeyman Points: 92 More actions · Answer 7

Even more !

The entire execution time subtree cost for * is .0130; for the distinct it is 0.0123. Yet the relative execution times are 360 seconds apart!

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 8

Do you have large text columns or blobs (not even sure how they work) that could explain this?

Or large fragmentation (on the disk).

George-249934 SSC Journeyman Points: 92 More actions · Answer 9

So... I am not crazy. Okay, well, at least on this subject.

Disk fragmentation is low, thanks to my buddies in Ops, and all the fields being retrieved are from Table1 of the query, and they are bigint, varchar(10), varchar(10), tinyint.

But even fragmentation or blobs would not explain why * goes orders of magnitude faster than explicit fields.

If we list even 1 field explicitly the query goes long. If we list fields and put a * at the end of the list it screams. Wacky! Unfortunately this is a select feeding an INSERT so we have to use explicit fields. Wow, we use this sort of technique all the time and it doesnt go this way. Something about one of these tables has got to be bad. I will post back to here... it is getting personal now. Must conquer MS SQL.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 10

Lol, good luck. In the mean time maybe someone else will have something else to propose.

ctrucza Grasshopper Points: 10 More actions · Answer 11

Hmmm... Weird.

Maybe you could look at/post the actual execution plans for both queries?

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 12

From the original thread :

where t1.f2 = 'blah' and t3.f1 is null

This query Explains almost ideally, 3 index seeks , bookmark lookup, 2 nested loops, filter, done. Each tables contains ~20 million records.

Now, replace the list of fields in the Select clause with * (splat) and it still Explains well (identically except for an extra bookmark lookup before the filter to get all the field data (*), as one would expect). 3 index seeks , bookmark lookup, 2 nested loops, extra bookmark lookup, filter, done.

George-249934 SSC Journeyman Points: 92 More actions · Answer 13

🙂 Hi. Back again. The problem has gone away, but it doesnt make sense.

To remind you all, and me, the query was either "select [specific fields]" or "select *" and the execution plans for both were virtually identical (an extra book mark lookup as one would expect for the * version). But the performance for the [specific fields] was close to 10 minutes, while the [*] version was nearly instantaneous.

Well, as a matter of database maintenance, the indexes were rebuilt, and suddenly the "select [specific fields]" version is working "correctly" ... meaning returning results at least as quickly as the [*] version. The execution plans have not changed one iota.

Okay, so the index was probably severely fragmented. I shoulda checked that. But, I still dont understand why select [specific fields] and select * would perform sooooooo differently -- favoring the * -- they should have performed equally poorly. Does anyone have any hints?

I will chock it up to "the great Microsoft whatever" unless someone can logically explain it away.