March 21, 2013 at 7:57 pm
I am playing around with and continuing to learn SQL. I am using the AdventureWorks2012 database with the following query to get a list of people's first and last name and job title, with their job title = Sales Representative and ordering by Last Name:
SELECT p.FirstName
,p.LastName
,e.JobTitle
FROM AdventureWorks2012.Person.Person p
JOIN AdventureWorks2012.HumanResources.Employee e
ON p.BusinessEntityID = e.BusinessEntityID
WHERE e.JobTitle = 'Sales Representative'
ORDER BY p.LastName
I also created an index on each table which my query uses:
USE [AdventureWorks2012]
GO
/****** Object: Index [IX_JobTitle_BusinessEntityID] Script Date: 3/21/2013 8:48:01 PM ******/
CREATE NONCLUSTERED INDEX [IX_JobTitle_BusinessEntityID] ON [HumanResources].[Employee]
(
[JobTitle] ASC,
[BusinessEntityID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
--------------------------------
USE [AdventureWorks2012]
GO
/****** Object: Index [IX_LastName_BusinessEntityID_IN_FirstName] Script Date: 3/21/2013 8:48:52 PM ******/
CREATE NONCLUSTERED INDEX [IX_LastName_BusinessEntityID_IN_FirstName] ON [Person].[Person]
(
[BusinessEntityID] ASC,
[LastName] ASC
)
INCLUDE ( [FirstName]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
If I run this query without the order by clause or with a column in the Employee table instead of p.LastName, I do not get a sort operator in the execution plan. When ordering by last name, I get a sort operator between the Nested Loops (Inner Join) and Select in the Execution Plan.
The question becomes would there be a way to remove the sort operator and do the sorting in one of the indexes before the Nested Loops operator or by changing my query?
Should I not worry about and this would be the best it could get?
I am not fully sure what I am looking for within the execution plan so any help would be appreciated.
Thank you.
March 21, 2013 at 10:25 pm
Ken,
Since BusinessEntityId is the Clustered Key, the data is sorted in the table is physically stored in its order. And since BusinessEntityId is being used in the join condition, it is expected that the optimizer would use that Clustered key to get the data and then later sort it by the lastname.
Now if you force the query to use the last name index (which is IX_Person_LastName_FirstName_MiddleName in this case), you would expect the results to be read in order of the index (query below).
SELECT p.FirstName ,p.LastName ,e.JobTitle
FROM AdventureWorks2012.Person.Person p WITH (INDEX (IX_Person_LastName_FirstName_MiddleName))
INNER JOIN AdventureWorks2012.HumanResources.Employee e ON p.BusinessEntityID = e.BusinessEntityID
WHERE e.JobTitle = 'Sales Representative'
ORDER BY p.LastName
However there is still a sort operator in the plan. This may be because the optimizer decided that going thru the index in an optimal way, and not necessarily sorting the output might be beneficial in the overall query. So it grabs the data as it can and then it sorts in the end to satisfy the order by clause. If you select the properties of the NonClustered Index Scan operator for the above query, you will see that the output is not ordered.
However, if you use Fast 1 option as below, then the scan of the non-clustered index is sorted (look at the properties of the operator; Ordered is set to true.) And thus a sort is avoided in the later stages.
SELECT p.FirstName ,p.LastName ,e.JobTitle
FROM AdventureWorks2012.Person.Person p WITH (INDEX (IX_Person_LastName_FirstName_MiddleName))
INNER JOIN AdventureWorks2012.HumanResources.Employee e ON p.BusinessEntityID = e.BusinessEntityID
WHERE e.JobTitle = 'Sales Representative'
ORDER BY p.LastName
OPTION (FAST 1)
Instead of me parroting what the experts say, please read here and here to see the reasons why.
So you can get to avoid the sort operator, but it has its costs and one main advantage (data streaming can begin early; as written in the second link).
March 22, 2013 at 6:22 am
I'd have to test it to be sure, but you might be able to create an index with LastName as the leading edge and the other columns as part of the INCLUDE clause. Although, you don't need to add the clustered key to any non-clustered index because it's already going to be there as part of the structure as provided by SQL Server.
I'd be hard pressed to say more without seeing the execution plan myself.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 22, 2013 at 9:24 am
Thanks you Sam. That answered my question and gave me something to look at and test.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply