Shorten query by using Top predicate

Question

Shorten query by using Top predicate

pdanes

SSCrazy Eights

Points: 8504
More actions
August 22, 2013 at 8:30 am

#276397

I have situations when I only want to know if I have one record, or more than one record. For instance, a query like this:SELECT Top 2 AkcesAutoID FROM Podrobnosti Group By AkcesAutoIDI would like to stop as soon as it sees that it has a second record. But when I look at the query plan, the first step, an Index Scan passes (in this case) 85 rows, from an estimated 32,376 rows to the Stream Aggregate function, which then passes only two to the Top function, which then passes only two to the Select function. This is a simple query, but real ones will be considerably more complex, with multiple joins and conditionals.
The gist is that I want to know whether all the records matching my criteria have the same value in a particular field, in this case AkcesAutoID. If they are all the same, I can deal with the result set in a particular (simplified) way. If there are more, I don't (at this point) care how many more, I already know that the app has to do something a different way.
Reading on the subject indicates that the Top predicate is always applied only to the result set. Is there any way to tell the query engine that I want it to stop working on the problem AS SOON AS it has the second record that meets my criteria, instead of gathering them all and then discarding all but two?

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Sean Lange SSC Guru Points: 286609 More actions · Answer 1

No that is how it works. If you use the TOP predicate you generally should also use an order by clause. Without the order by there is no way of knowing which row(s) you will have returned.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

pdanes SSCrazy Eights Points: 8504 More actions · Answer 2

Sean Lange (8/22/2013)
No that is how it works. If you use the TOP predicate you generally should also use an order by clause. Without the order by there is no way of knowing which row(s) you will have returned.

That's my point, though. Did you read the details of my original post at all? In this situation, I DON'T CARE which rows are being returned, and in fact, I don't even want the rows. All I want is to know whether there is one row, or more than one row, and I want the engine to stop working on the problem as soon as it can answer that question, i.e, as soon as it sees a second row. EVERYTHING it does after discovering that there is another row besides the first one is completely wasted effort. Adding an Order By clause would simply result in even more pointless work.

I understand that the Top predicate is not the way to restrict the activity, but I'm wondering if there IS some way to do so.

HanShi SSC-Dedicated Points: 33522 More actions · Answer 3

Maybe you would be better off using:

SELECT

CASE (SELECT COUNT(1) FROM table WHERE...)

WHEN 0 THEN 'none'

WHEN 1 THEN 'single'

ELSE 'multiple'

END

and then using your simplified or complex query depending on the results.

** Don't mistake the ‘stupidity of the crowd’ for the ‘wisdom of the group’! **

Sean Lange SSC Guru Points: 286609 More actions · Answer 4

pdanes (8/22/2013)
Sean Lange (8/22/2013)
No that is how it works. If you use the TOP predicate you generally should also use an order by clause. Without the order by there is no way of knowing which row(s) you will have returned.
That's my point, though. Did you read the details of my original post at all? In this situation, I DON'T CARE which rows are being returned, and in fact, I don't even want the rows. All I want is to know whether there is one row, or more than one row, and I want the engine to stop working on the problem as soon as it can answer that question, i.e, as soon as it sees a second row. EVERYTHING it does after discovering that there is another row besides the first one is completely wasted effort. Adding an Order By clause would simply result in even more pointless work.
I understand that the Top predicate is not the way to restrict the activity, but I'm wondering if there IS some way to do so.

Are you always this snarky to people who are honestly trying to help? I tried to explain how TOP works. Maybe TOP is not the most efficient way to return the results you are looking for. I can't tell because your post was so vague. I hope you figure out a way to get what you after.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

TomThomson SSC Guru Points: 104768 More actions · Answer 5

Obviously there's no way to avoid a complete scan when all the values are the same.

It's possible to have something which takes quadratic time when all the values are the same, but it does stop as soon as a second value is hit:-

select case

when not exists (select 1 from MyTable A, MyTable B where A.MyField <> B.MyField)

then 1

else 0

end as singleValue -- 1 = true, 0 = false

Of course the performance cost in the single value case is large; but so is the cost of the sort to do your group by operation, so this may be better.

Tom

pietlinden SSC Guru Points: 63632 More actions · Answer 6

Why not use EXISTS and something like t1.FK<>t2.FK to check for the existence of two records?

By definition, EXISTS stops as soon as it evaluates to True.

Sean's right - honey beats vinegar every time.

pdanes SSCrazy Eights Points: 8504 More actions · Answer 7

L' Eomot Inversé (8/22/2013)
Obviously there's no way to avoid a complete scan when all the values are the same.
It's possible to have something which takes quadratic time when all the values are the same, but it does stop as soon as a second value is hit:-
select case
when not exists (select 1 from MyTable A, MyTable B where A.MyField <> B.MyField)
then 1
else 0
end as singleValue -- 1 = true, 0 = false
Of course the performance cost in the single value case is large; but so is the cost of the sort to do your group by operation, so this may be better.

Yes, when all values are the same, of course, there is no way to know that until all values have been examined. This looks promising - thank you. I don't think I ever would have thought of this on my own. Just tried it, and the number of rows passed from Index Scan B is 60 and Index Sacn A is 1, so it's already better than the 85 I had initially. Some pointless activity by the engine is probably unavoidable, but this seems to be the way to go.

pdanes SSCrazy Eights Points: 8504 More actions · Answer 8

Sean Lange (8/22/2013)
Are you always this snarky to people who are honestly trying to help? I tried to explain how TOP works. Maybe TOP is not the most efficient way to return the results you are looking for. I can't tell because your post was so vague. I hope you figure out a way to get what you after.

What snarky? Because I asked if you read the details of my original post? Sorry, but your answer gave the impression that you hadn't.

My post stated I had read that Top worked on only result sets, and I asked if there was a way (another way) to curtail processing. You didn't explain anything, just told me that that's how it works, which I had stated that I already knew.

I made a point of stating that I'm trying to REDUCE processing time, and that I am only interested in knowing whether or not I have more than one record, and I want to STOP processing as soon as I discover that. You responded by telling me to add a sort to a dataset which I had already stated I was trying to avoid generating in the first place.

Apologies if I sounded curt, but I do sometimes get answers from which it's completely clear that the responder in fact DIDN'T read the post, and is only reacting to a few keywords that he noticed while giving my post a cursory glance. The most common example is "Post your SQL", when I state right in the original question that I have no SQL yet, that I'm asking a question about design principles.

Sean Lange SSC Guru Points: 286609 More actions · Answer 9

pdanes (8/22/2013)
Sean Lange (8/22/2013)
Are you always this snarky to people who are honestly trying to help? I tried to explain how TOP works. Maybe TOP is not the most efficient way to return the results you are looking for. I can't tell because your post was so vague. I hope you figure out a way to get what you after.
What snarky? Because I asked if you read the details of my original post? Sorry, but your answer gave the impression that you hadn't.
My post stated I had read that Top worked on only result sets, and I asked if there was a way (another way) to curtail processing. You didn't explain anything, just told me that that's how it works, which I had stated that I already knew.
I made a point of stating that I'm trying to REDUCE processing time, and that I am only interested in knowing whether or not I have more than one record, and I want to STOP processing as soon as I discover that. You responded by telling me to add a sort to a dataset which I had already stated I was trying to avoid generating in the first place.
Apologies if I sounded curt, but I do sometimes get answers from which it's completely clear that the responder in fact DIDN'T read the post, and is only reacting to a few keywords that he noticed while giving my post a cursory glance. The most common example is "Post your SQL", when I state right in the original question that I have no SQL yet, that I'm asking a question about design principles.

We will just have to chalk it up to a miscommunication. After seeing the details in the couple following posts I would agree that using EXISTS is likely a better alternative for what you are trying to accomplish.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/