April 18, 2013 at 12:46 am
I have a Table called Employees with following Columns
EmployeeID INT
EmployeeName VARCHAR(256)
Salary MONEY
DeptID SMALLINT
The table has following Indexes created on it
1. Clustered Index on EmployeeID
2. Non Clustered Index on EmployeeName
3. Non Clustered Index on DeptID
The Employees tables has over 10 million rows.
You want to get the count of Employees, for which you write the following query
SELECT COUNT(*) FROM Employees
What will be the likely query plan for above query? Which index will SQL SERVER use in above query? Please give an explanation for your answer.
April 18, 2013 at 1:54 am
Looks and sounds like an interview question or homework to me. Which one do you think?
April 18, 2013 at 2:05 am
Why don't you run the query and see which index and execution plan it will use? If you would like help understanding why SQL picked the specific plan that it did, feel free to ask.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 18, 2013 at 4:03 am
Thanks For The Reply Monster ,
i am new to Sql server.Running The Query
on millions of records is completly different from running the query on small datasets as i think.Hence i want to run the query on millions of records and
know which indexes are getting used.
April 18, 2013 at 4:21 am
Recently I've been doing lots of work on large data sets, wide tables (20+ columns) with mixture of Unicode and non-Unicode data types and IO becomes a real issue when dealing with large tables. With this experience plus the great advice I've read about on this site I've come to realise the importance of IO and how it plays a big part in how SQL server ultimately does what you ask of it.
To answer your question, SQL would likely choose the index with the smallest storage requirement so that would be the DeptID as its TINYINT so requires fewer pages to store so less IO. In other words, it's probably the most efficient way to use the index with the smalles storage requirement to retrun the table count.
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
April 18, 2013 at 4:27 am
Thank You Dina,
Good Explanation
April 18, 2013 at 4:50 am
Ananth@Sql (4/18/2013)
Running The Queryon millions of records is completly different from running the query on small datasets as i think.Hence i want to run the query on millions of records and know which indexes are getting used.
So create a table, populate it with a few million rows, run the query and see how it behaves.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 18, 2013 at 5:05 am
To illustrate this I did this simple test where I did a count(*) on a table (34 million records) with just a clustsred index (picture 1), I then created a non clustsred index on a BIGINT column and did another count (Picture 2)
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
April 18, 2013 at 5:07 am
You can apply logic to what you think the execution plan will look like, but in my experience trying to second guess the Optimizer is notoriously difficult, as my next post will show.
i.e. taking into account indexes, statistics and existing cached plans and the query construction itself, the plan may not be what logic would dictate.
April 18, 2013 at 5:11 am
Thank you Dina
Your Explanation And presentation Makes Me Clear Of This Question.
And for a clarification tell Me The datatype Of Clustered Index Column.
April 18, 2013 at 5:46 am
Just to illustrate what Gail was saying. . .
You could easily test this out yourself.
USE tempdb;
IF object_id('dbo.Employees') IS NOT NULL
BEGIN
DROP TABLE dbo.Employees;
END;
--10,000,000 Random rows of data
SELECT TOP 10000000 IDENTITY(INT,1,1) AS EmployeeID,
CAST(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(CHAR((ABS(CHECKSUM(NEWID())) % 25) + 65)+LOWER(REPLACE(
CAST(NEWID() AS VARCHAR(36)),'-','')) COLLATE Latin1_General_BIN2,'0',''),'1',''),'2',''),'3',''),
'4',''),'5',''),'6',''),'7',''),'8',''),'9','') AS VARCHAR(255)) AS EmployeeName,
CONVERT(MONEY,CONVERT(VARCHAR(6),CONVERT(MONEY,RAND(CHECKSUM(NEWID())) * 100),0)) AS Salary,
CAST((ABS(CHECKSUM(NEWID())) % 100) + 1 AS SMALLINT) AS DeptID
INTO dbo.Employees
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3;
CREATE CLUSTERED INDEX Employees_cl_EmployeeID ON dbo.Employees (EmployeeID);
CREATE NONCLUSTERED INDEX Employees_nc_EmployeeName ON dbo.Employees (EmployeeName);
CREATE NONCLUSTERED INDEX Employees_nc_DeptID ON dbo.Employees (DeptID);
So the above will produce an "Employees" table with 10,000,000 rows in it and the indexes you stated.
April 18, 2013 at 5:53 am
It came down to the number of pages used by the index (fewer pages = less IO)
The top row is the clustsred index, the bottom row is the non clustered index I created recently. SQL ended up using the index with the fewest number of pages.
Try it out as the monster had suggested 😀
Good luck!
p.s.
Before anyone starts bashing me for having so many indexes on a table, YES, these are necessary and you dont' know what I'm doing and what the table is for... lol 😛
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
April 18, 2013 at 7:14 am
Ananth@Sql (4/18/2013)
Thank you DinaYour Explanation And presentation Makes Me Clear Of This Question.
And for a clarification tell Me The datatype Of Clustered Index Column.
Not necessarily relevant as the clustered index at the leaf level is the table. This means there could be as few as 1 record per page depending on how wide the row is and SQL Server is doing a clustered index scan. This means it is scan the table using a lot of IO.
April 18, 2013 at 7:26 am
Lynn Pettis (4/18/2013)
Ananth@Sql (4/18/2013)
Thank you DinaYour Explanation And presentation Makes Me Clear Of This Question.
And for a clarification tell Me The datatype Of Clustered Index Column.
Not necessarily relevant as the clustered index at the leaf level is the table. This means there could be as few as 1 record per page depending on how wide the row is and SQL Server is doing a clustered index scan. This means it is scan the table using a lot of IO.
Yes, I forgot to mention that my clustsred index is on an ID column that's an INT but as you say, the leaf pages of a clustsred index and the data pages of the table the index is on is one of the same so if it's a wide table it will need more pages than a non-clustsred index on a BIGINT column as in my example.
Apologies to the OP for not clarifying this eariler.
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply