November 10, 2018 at 4:26 am
AdrianLParker+sqlservercentral.com - Friday, November 9, 2018 8:26 PMI realize what those queries can return. Despite intuitively knowing that the first can only return a single row, does the SQL engine know that and short circuit on the query?
Yes.
Firstly, indexes are ordered by their key, so even if the index on that column wasn't unique, as soon as the first value that's greater than 3 is encountered, the storage engine knows that there cannot be more rows with the value of 3. The index is ordered (if you had a telephone directory, you know that you cannot find a surname of Brown anywhere after Brust as an example)
If the index is unique, then there can only be 0 or 1 rows with the value of 3, so as soon as either 3 is found, or a value greater than 3 is found, all valid rows are found and no more searching is done.
And this is trivial to test. Run the query, look at the number of logical reads and, from the execution plan, the number of rows read.
Please don't go dropping TOP all over the place when it's not required. TOP(n) is for when you only want n of the qualifying rows, of which there are more than n. You're just making your queries harder to read and understand and confusing everyone that has to maintain or look at it later
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 10, 2018 at 10:26 am
Lynn Pettis - Friday, November 9, 2018 9:43 PMAdrianLParker+sqlservercentral.com - Friday, November 9, 2018 8:26 PMLynn Pettis - Friday, November 9, 2018 3:14 PMAdrianLParker+sqlservercentral.com - Friday, November 9, 2018 2:59 PM@Lynn Pettis
Although it has a unique constraint, otherwise it's just stored as any other index is, is it not? Does the SELECT statement differentiate between a where clause against a unique field (be it primary key or other unique index) or a unique field?For example, does "where somePrimaryColumn = 3" short circuit when it finds the first matching record and "where somePrimaryColumn >= 3" does not? Is the engine that clever?
There is ONLY one matching row of data when the where clause is WHERE somePrimaryColumn = 3. The second where clause you propose is looking for 1 or more rows of data WHERE comePrimaryColumn >= 3. They are not the same query.
I realize what those queries can return. Despite intuitively knowing that the first can only return a single row, does the SQL engine know that and short circuit on the query?
Does the SQL engine treat the first query any differently than the second? Or, in the first does it still scan all indices to find matches (despite the fact that you and I know that it can only return one row)?
So, the database engine finds the one and only row of data with the given primary key value but it is going to continue checking the other 100,000,000 rows of data just to be sure that there isn't another one sitting around some where. Right. Equality condition, unique value (ONLY ONE if it exists), it finds it or it doesn't. It isn't going to keep searching once it has found the one value it was looking for.
Consider this query: where somePrimaryKey = 3 or anyColumn > 6
As a developer I'd be a little surprised if the special condition, "if the where clause references only a unique key AND only uses the equality operator", was added.
I'm not a DBA though.
November 10, 2018 at 10:28 am
GilaMonster - Saturday, November 10, 2018 4:26 AMAdrianLParker+sqlservercentral.com - Friday, November 9, 2018 8:26 PMI realize what those queries can return. Despite intuitively knowing that the first can only return a single row, does the SQL engine know that and short circuit on the query?Please don't go dropping TOP all over the place when it's not required. TOP(n) is for when you only want n of the qualifying rows, of which there are more than n. You're just making your queries harder to read and understand and confusing everyone that has to maintain or look at it later
If you'll read back, I was opposed to using top.
November 10, 2018 at 12:13 pm
AdrianLParker+sqlservercentral.com - Saturday, November 10, 2018 10:26 AMAs a developer I'd be a little surprised if the special condition, "if the where clause references only a unique key AND only uses the equality operator", was added.I'm not a DBA though.
It's not that there's a million rules, it's that based on the data and the index, there cannot possibly be more rows anywhere, and hence there's no need to keep looking.
Now the OR is going to either be an index scan, or two index seeks, one on each predicate, depending on what indexes you have. If the latter, then the seek on somePrimaryKey will stop once it finds a matching row (or finds any row > 3) and the seek on anyColumn will start at 6 and read until the end of the index, since that's an unbounded inequality.
If you play around with the execution plans, you'll see exactly how these queries operate.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 10, 2018 at 2:48 pm
GilaMonster - Saturday, November 10, 2018 12:13 PMAdrianLParker+sqlservercentral.com - Saturday, November 10, 2018 10:26 AMAs a developer I'd be a little surprised if the special condition, "if the where clause references only a unique key AND only uses the equality operator", was added.I'm not a DBA though.
there cannot possibly be more rows anywhere
Right, and the computers needs algorithms to determine the same (it doesn't have intuition right). My question revolves around how generic those rules are. I don't think you really understand what I'm asking.
November 10, 2018 at 3:32 pm
I'm sorry you haven't understood my explanations.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 11, 2018 at 1:43 pm
AdrianLParker+sqlservercentral.com - Friday, November 9, 2018 10:12 AMThe MSDN says of OPTIONA (FAST X):Specifies that the query is optimized for fast retrieval of the first number_rows. This is a nonnegative integer. After the first number_rows are returned, the query continues execution and produces its full result set.
...so if it scans the entire table anyway, does it really do anything since you can't see the results until the database server returns them all?
Nope. That's not what it's saying. It says that it produces the "first number_rows" and continues to return the rest of the result set. There's nothing in there that says a scan occurs. That's up to the criteria in the query. If you're looking at a PK for an equality comparison, there will be no scan. It will be a seek because PKs are inherently unique.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 11, 2018 at 3:15 pm
Jeff Moden - Sunday, November 11, 2018 1:43 PMAdrianLParker+sqlservercentral.com - Friday, November 9, 2018 10:12 AMThe MSDN says of OPTIONA (FAST X):Specifies that the query is optimized for fast retrieval of the first number_rows. This is a nonnegative integer. After the first number_rows are returned, the query continues execution and produces its full result set.
...so if it scans the entire table anyway, does it really do anything since you can't see the results until the database server returns them all?
Nope. That's not what it's saying. It says that it produces the "first number_rows" and continues to return the rest of the result set. There's nothing in there that says a scan occurs. That's up to the criteria in the query. If you're looking at a PK for an equality comparison, there will be no scan. It will be a seek because PKs are inherently unique.
you took me far too literally when I said "scan".
November 11, 2018 at 3:46 pm
AdrianLParker+sqlservercentral.com - Sunday, November 11, 2018 3:15 PMJeff Moden - Sunday, November 11, 2018 1:43 PMAdrianLParker+sqlservercentral.com - Friday, November 9, 2018 10:12 AMThe MSDN says of OPTIONA (FAST X):Specifies that the query is optimized for fast retrieval of the first number_rows. This is a nonnegative integer. After the first number_rows are returned, the query continues execution and produces its full result set.
...so if it scans the entire table anyway, does it really do anything since you can't see the results until the database server returns them all?
Nope. That's not what it's saying. It says that it produces the "first number_rows" and continues to return the rest of the result set. There's nothing in there that says a scan occurs. That's up to the criteria in the query. If you're looking at a PK for an equality comparison, there will be no scan. It will be a seek because PKs are inherently unique.
you took me far too literally when I said "scan".
I don't believe so... you very specifically stated "so if it scans the entire table anyway". Even if you didn't mean it that way, read it as if a rank newbie were reading it. They would be led astray.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 11, 2018 at 5:24 pm
AdrianLParker+sqlservercentral.com - Saturday, November 10, 2018 10:26 AMLynn Pettis - Friday, November 9, 2018 9:43 PMAdrianLParker+sqlservercentral.com - Friday, November 9, 2018 8:26 PMLynn Pettis - Friday, November 9, 2018 3:14 PMAdrianLParker+sqlservercentral.com - Friday, November 9, 2018 2:59 PM@Lynn Pettis
Although it has a unique constraint, otherwise it's just stored as any other index is, is it not? Does the SELECT statement differentiate between a where clause against a unique field (be it primary key or other unique index) or a unique field?For example, does "where somePrimaryColumn = 3" short circuit when it finds the first matching record and "where somePrimaryColumn >= 3" does not? Is the engine that clever?
There is ONLY one matching row of data when the where clause is WHERE somePrimaryColumn = 3. The second where clause you propose is looking for 1 or more rows of data WHERE comePrimaryColumn >= 3. They are not the same query.
I realize what those queries can return. Despite intuitively knowing that the first can only return a single row, does the SQL engine know that and short circuit on the query?
Does the SQL engine treat the first query any differently than the second? Or, in the first does it still scan all indices to find matches (despite the fact that you and I know that it can only return one row)?
So, the database engine finds the one and only row of data with the given primary key value but it is going to continue checking the other 100,000,000 rows of data just to be sure that there isn't another one sitting around some where. Right. Equality condition, unique value (ONLY ONE if it exists), it finds it or it doesn't. It isn't going to keep searching once it has found the one value it was looking for.
Consider this query: where somePrimaryKey = 3 or anyColumn > 6
As a developer I'd be a little surprised if the special condition, "if the where clause references only a unique key AND only uses the equality operator", was added.
I'm not a DBA though.
And that query is basically this:
select … from … where somePrimaryKey = 3
union
select … from … where anyColumn > 6;
Why are you throwing out all these irrelevant WHERE clause possibilities?
November 11, 2018 at 5:53 pm
Lynn Pettis - Sunday, November 11, 2018 5:24 PMAdrianLParker+sqlservercentral.com - Saturday, November 10, 2018 10:26 AMLynn Pettis - Friday, November 9, 2018 9:43 PMAdrianLParker+sqlservercentral.com - Friday, November 9, 2018 8:26 PMLynn Pettis - Friday, November 9, 2018 3:14 PMAdrianLParker+sqlservercentral.com - Friday, November 9, 2018 2:59 PM@Lynn Pettis
Although it has a unique constraint, otherwise it's just stored as any other index is, is it not? Does the SELECT statement differentiate between a where clause against a unique field (be it primary key or other unique index) or a unique field?For example, does "where somePrimaryColumn = 3" short circuit when it finds the first matching record and "where somePrimaryColumn >= 3" does not? Is the engine that clever?
There is ONLY one matching row of data when the where clause is WHERE somePrimaryColumn = 3. The second where clause you propose is looking for 1 or more rows of data WHERE comePrimaryColumn >= 3. They are not the same query.
I realize what those queries can return. Despite intuitively knowing that the first can only return a single row, does the SQL engine know that and short circuit on the query?
Does the SQL engine treat the first query any differently than the second? Or, in the first does it still scan all indices to find matches (despite the fact that you and I know that it can only return one row)?
So, the database engine finds the one and only row of data with the given primary key value but it is going to continue checking the other 100,000,000 rows of data just to be sure that there isn't another one sitting around some where. Right. Equality condition, unique value (ONLY ONE if it exists), it finds it or it doesn't. It isn't going to keep searching once it has found the one value it was looking for.
Consider this query: where somePrimaryKey = 3 or anyColumn > 6
As a developer I'd be a little surprised if the special condition, "if the where clause references only a unique key AND only uses the equality operator", was added.
I'm not a DBA though.
And that query is basically this:
select … from … where somePrimaryKey = 3
union
select … from … where anyColumn > 6;
Why are you throwing out all these irrelevant WHERE clause possibilities?
That seems very inefficient. Wouldn't it first check the unique indexes and only if that returns nothing, then check the others? That would be faster right? My very point is I don't know how exactly internally the queries are optimized.
November 11, 2018 at 7:27 pm
AdrianLParker+sqlservercentral.com - Sunday, November 11, 2018 5:53 PMLynn Pettis - Sunday, November 11, 2018 5:24 PMAdrianLParker+sqlservercentral.com - Saturday, November 10, 2018 10:26 AMLynn Pettis - Friday, November 9, 2018 9:43 PMAdrianLParker+sqlservercentral.com - Friday, November 9, 2018 8:26 PMLynn Pettis - Friday, November 9, 2018 3:14 PMAdrianLParker+sqlservercentral.com - Friday, November 9, 2018 2:59 PM@Lynn Pettis
Although it has a unique constraint, otherwise it's just stored as any other index is, is it not? Does the SELECT statement differentiate between a where clause against a unique field (be it primary key or other unique index) or a unique field?For example, does "where somePrimaryColumn = 3" short circuit when it finds the first matching record and "where somePrimaryColumn >= 3" does not? Is the engine that clever?
There is ONLY one matching row of data when the where clause is WHERE somePrimaryColumn = 3. The second where clause you propose is looking for 1 or more rows of data WHERE comePrimaryColumn >= 3. They are not the same query.
I realize what those queries can return. Despite intuitively knowing that the first can only return a single row, does the SQL engine know that and short circuit on the query?
Does the SQL engine treat the first query any differently than the second? Or, in the first does it still scan all indices to find matches (despite the fact that you and I know that it can only return one row)?
So, the database engine finds the one and only row of data with the given primary key value but it is going to continue checking the other 100,000,000 rows of data just to be sure that there isn't another one sitting around some where. Right. Equality condition, unique value (ONLY ONE if it exists), it finds it or it doesn't. It isn't going to keep searching once it has found the one value it was looking for.
Consider this query: where somePrimaryKey = 3 or anyColumn > 6
As a developer I'd be a little surprised if the special condition, "if the where clause references only a unique key AND only uses the equality operator", was added.
I'm not a DBA though.
And that query is basically this:
select … from … where somePrimaryKey = 3
union
select … from … where anyColumn > 6;
Why are you throwing out all these irrelevant WHERE clause possibilities?That seems very inefficient. Wouldn't it first check the unique indexes and only if that returns nothing, then check the others? That would be faster right? My very point is I don't know how exactly internally the queries are optimized.
It is an OR. That means it is true if either or both side are true. The query would return a row of data if somePrimaryKey = 3, it also will return all rows of data where anyColumn > 6.
November 12, 2018 at 6:34 am
Lynn Pettis - Sunday, November 11, 2018 7:27 PMAdrianLParker+sqlservercentral.com - Sunday, November 11, 2018 5:53 PMIt is an OR. That means it is true if either or both side are true. The query would return a row of data if somePrimaryKey = 3, it also will return all rows of data where anyColumn > 6.
But it's quicker to check if somePrimaryKey = 3 then it is to check the anyColumn > 6 isn't it? If the indexed column satisfied there where condition, it would be quicker to check it first right?
In the query given the UNION would evaluate both queries every time? If so, then the two queries are not equivalent in execution times.
November 12, 2018 at 6:37 am
Jeff Moden - Sunday, November 11, 2018 3:46 PMAdrianLParker+sqlservercentral.com - Sunday, November 11, 2018 3:15 PMyou took me far too literally when I said "scan".I don't believe so... you very specifically stated "so if it scans the entire table anyway". Even if you didn't mean it that way, read it as if a rank newbie were reading it. They would be led astray.
I'm a computer programmer, not a DBA. Yes I specifically used some words, but those words have different meaning to me than they do to you obviously. And until I know which words I can and can't use to describe a situation, how can I know which words I can and can't use.
November 12, 2018 at 10:07 pm
AdrianLParker+sqlservercentral.com - Monday, November 12, 2018 6:34 AMLynn Pettis - Sunday, November 11, 2018 7:27 PMAdrianLParker+sqlservercentral.com - Sunday, November 11, 2018 5:53 PMIt is an OR. That means it is true if either or both side are true. The query would return a row of data if somePrimaryKey = 3, it also will return all rows of data where anyColumn > 6.
But it's quicker to check if somePrimaryKey = 3 then it is to check the anyColumn > 6 isn't it? If the indexed column satisfied there where condition, it would be quicker to check it first right?
In the query given the UNION would evaluate both queries every time? If so, then the two queries are not equivalent in execution times.
The problem is that your WHERE clause says WHERE somePrimaryKe y = 3 OR anyColumn > 6. This means that the results set returned will contain the row of data where somePrimaryKey = 3 (if it exists) and all the rows of data where anyColumn > 6 (if any exist). SQL has to look for data that matches either OR both conditions.
Viewing 15 posts - 16 through 30 (of 31 total)
You must be logged in to reply to this topic. Login to reply