August 18, 2018 at 3:48 am
Dear All,
I have a Lead table with below fields.
LeadId int (Not a primary key)
Telephone
...
...Table have 4 millions records
Client always search the data with the Telephone number.
I have created only Non clustered index on Telephone Number field.
Should I have also added the primary key constraint on leadId?
If I add the primary key on LeadId column, performance will be increase or not?
August 18, 2018 at 5:14 am
sunilkmr284 - Saturday, August 18, 2018 3:48 AMDear All,
I have a Lead table with below fields.
LeadId int (Not a primary key)
Telephone
...
...Table have 4 millions records
Client always search the data with the Telephone number.
I have created only Non clustered index on Telephone Number field.
Should I have also added the primary key constraint on leadId?
If I add the primary key on LeadId column, performance will be increase or not?
Can you post the full DDL (create table) script and information on what of the data you want to bring back and how the users will do the search?
😎
There is a big difference between handling small result sets and large result sets produced by very generic queries such as searching for anything starting with one or two digits.
August 18, 2018 at 5:30 am
Eirikur Eiriksson - Saturday, August 18, 2018 5:14 AMsunilkmr284 - Saturday, August 18, 2018 3:48 AMDear All,
I have a Lead table with below fields.
LeadId int (Not a primary key)
Telephone
...
...Table have 4 millions records
Client always search the data with the Telephone number.
I have created only Non clustered index on Telephone Number field.
Should I have also added the primary key constraint on leadId?
If I add the primary key on LeadId column, performance will be increase or not?Can you post the full DDL (create table) script and information on what of the data you want to bring back and how the users will do the search?
😎There is a big difference between handling small result sets and large result sets produced by very generic queries such as searching for anything starting with one or two digits.
USE ContactDB
GO
/****** Object: Table [dbo].[tbl_Lead] Script Date: 08/18/2018 16:56:57 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[tbl_Lead](
[LeadID] [int] IDENTITY(1,1) NOT NULL,
[Company Name] [varchar](100) NULL,
[Title] [varchar](50) NULL,
[First Name] [varchar](50) NULL,
[Last Name] [varchar](50) NULL,
[Address1] [varchar](100) NULL,
[Address2] [varchar](100) NULL,
[Address3] [varchar](100) NULL,
[Town] [varchar](50) NULL,
[County] [varchar](50) NULL,
[PostCode] [varchar](20) NULL,
[Telephone] [varchar](20) NULL,
[Email] [varchar](150) NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
Searching
CREATE PROCEDURE [dbo].[Lead_Get]
@PhoneNo varchar(50)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
select * from tbl_lead where Telephone=@PhoneNo
end
August 18, 2018 at 6:46 pm
sunilkmr284 - Saturday, August 18, 2018 5:30 AMEirikur Eiriksson - Saturday, August 18, 2018 5:14 AMsunilkmr284 - Saturday, August 18, 2018 3:48 AMDear All,
I have a Lead table with below fields.
LeadId int (Not a primary key)
Telephone
...
...Table have 4 millions records
Client always search the data with the Telephone number.
I have created only Non clustered index on Telephone Number field.
Should I have also added the primary key constraint on leadId?
If I add the primary key on LeadId column, performance will be increase or not?Can you post the full DDL (create table) script and information on what of the data you want to bring back and how the users will do the search?
😎There is a big difference between handling small result sets and large result sets produced by very generic queries such as searching for anything starting with one or two digits.
USE ContactDB
GO/****** Object: Table [dbo].[tbl_Lead] Script Date: 08/18/2018 16:56:57 ******/
SET ANSI_NULLS ON
GOSET QUOTED_IDENTIFIER ON
GOSET ANSI_PADDING ON
GOCREATE TABLE [dbo].[tbl_Lead](
[LeadID] [int] IDENTITY(1,1) NOT NULL,
[Company Name] [varchar](100) NULL,
[Title] [varchar](50) NULL,
[First Name] [varchar](50) NULL,
[Last Name] [varchar](50) NULL,
[Address1] [varchar](100) NULL,
[Address2] [varchar](100) NULL,
[Address3] [varchar](100) NULL,
[Town] [varchar](50) NULL,
[County] [varchar](50) NULL,
[PostCode] [varchar](20) NULL,
[Telephone] [varchar](20) NULL,
[Email] [varchar](150) NULL
) ON [PRIMARY]GO
SET ANSI_PADDING OFF
GOSearching
CREATE PROCEDURE [dbo].[Lead_Get]
@PhoneNo varchar(50)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;select * from tbl_lead where Telephone=@PhoneNo
end
I would try a clustered primary key on [LeadID] and a unique (if it is unique) index on Telephone with an INCLUDE of all the other columns.
August 19, 2018 at 3:57 am
Jonathan AC Roberts - Saturday, August 18, 2018 6:46 PMI would try a clustered primary key on [LeadID] and a unique (if it is unique) index on Telephone with an INCLUDE of all the other columns.
If all the searches are on an exact phone number, literally duplicating the data does not make too much sense. Therefore, I would suggest a non-clustered index on the Phone number column with the inclusion of the clustered index, searches will then find the matches and bring the full results back via key lookup. Because the phone numbers can be shared, it is futile to use a unique constraint on the index unless the business rules limit the entries to unique numbers, something that we do not know.
😎
August 19, 2018 at 4:58 am
Eirikur Eiriksson - Sunday, August 19, 2018 3:57 AMJonathan AC Roberts - Saturday, August 18, 2018 6:46 PMI would try a clustered primary key on [LeadID] and a unique (if it is unique) index on Telephone with an INCLUDE of all the other columns.If all the searches are on an exact phone number, literally duplicating the data does not make too much sense. Therefore, I would suggest a non-clustered index on the Phone number column with the inclusion of the clustered index, searches will then find the matches and bring the full results back via key lookup. Because the phone numbers can be shared, it is futile to use a unique constraint on the index unless the business rules limit the entries to unique numbers, something that we do not know.
😎
I did say if it was unique.
If this is the query that's most common, and performance is the issues, and the data doesn't have massive amounts of updates; an include all columns will allow a seek without a lookup which will be slightly faster than a lookup.
August 19, 2018 at 1:36 pm
Jonathan AC Roberts - Sunday, August 19, 2018 4:58 AMEirikur Eiriksson - Sunday, August 19, 2018 3:57 AMJonathan AC Roberts - Saturday, August 18, 2018 6:46 PMI would try a clustered primary key on [LeadID] and a unique (if it is unique) index on Telephone with an INCLUDE of all the other columns.If all the searches are on an exact phone number, literally duplicating the data does not make too much sense. Therefore, I would suggest a non-clustered index on the Phone number column with the inclusion of the clustered index, searches will then find the matches and bring the full results back via key lookup. Because the phone numbers can be shared, it is futile to use a unique constraint on the index unless the business rules limit the entries to unique numbers, something that we do not know.
😎I did say if it was unique.
If this is the query that's most common, and performance is the issues, and the data doesn't have massive amounts of updates; an include all columns will allow a seek without a lookup which will be slightly faster than a lookup.
An INCLUDE of all the columns necessary would be a totally unnecessary duplication of data, in this case. That's all indexes really are is a copy of the data sorted in a different order.
We also don't know what the op means by improving the performance. Seems to me that a "lead" table would require singleton lookups by phone number and that just about won't ever cause a performance problem if there's an index on the telephone number. Even a HEAP will do fine in this area. If they're doing something in batches, the OP needs to identify that before we can make recommendations on "performance".
--Jeff Moden
Change is inevitable... Change for the better is not.
August 19, 2018 at 1:37 pm
sunilkmr284 - Saturday, August 18, 2018 3:48 AMDear All,
I have a Lead table with below fields.
LeadId int (Not a primary key)
Telephone
...
...Table have 4 millions records
Client always search the data with the Telephone number.
I have created only Non clustered index on Telephone Number field.
Should I have also added the primary key constraint on leadId?
If I add the primary key on LeadId column, performance will be increase or not?
Performance of WHAT??? How is the table being used and what are the current issues with performance?
--Jeff Moden
Change is inevitable... Change for the better is not.
August 19, 2018 at 1:59 pm
Jeff Moden - Sunday, August 19, 2018 1:36 PMJonathan AC Roberts - Sunday, August 19, 2018 4:58 AMEirikur Eiriksson - Sunday, August 19, 2018 3:57 AMJonathan AC Roberts - Saturday, August 18, 2018 6:46 PMI would try a clustered primary key on [LeadID] and a unique (if it is unique) index on Telephone with an INCLUDE of all the other columns.If all the searches are on an exact phone number, literally duplicating the data does not make too much sense. Therefore, I would suggest a non-clustered index on the Phone number column with the inclusion of the clustered index, searches will then find the matches and bring the full results back via key lookup. Because the phone numbers can be shared, it is futile to use a unique constraint on the index unless the business rules limit the entries to unique numbers, something that we do not know.
😎I did say if it was unique.
If this is the query that's most common, and performance is the issues, and the data doesn't have massive amounts of updates; an include all columns will allow a seek without a lookup which will be slightly faster than a lookup.An INCLUDE of all the columns necessary would be a totally unnecessary duplication of data, in this case. That's all indexes really are is a copy of the data sorted in a different order.
We also don't know what the op means by improving the performance. Seems to me that a "lead" table would require singleton lookups by phone number and that just about won't ever cause a performance problem if there's an index on the telephone number. Even a HEAP will do fine in this area. If they're doing something in batches, the OP needs to identify that before we can make recommendations on "performance".
The query sunilkmr284 gave as the one he wanted to improve the performance of is select * from tbl_lead where Telephone=@PhoneNo
If it is just this query and he wants to make it as fast as possible then an index on PhoneNo with all the other columns included will give the fastest results, hence my comment.
August 19, 2018 at 2:38 pm
Jonathan AC Roberts - Sunday, August 19, 2018 1:59 PMJeff Moden - Sunday, August 19, 2018 1:36 PMJonathan AC Roberts - Sunday, August 19, 2018 4:58 AMEirikur Eiriksson - Sunday, August 19, 2018 3:57 AMJonathan AC Roberts - Saturday, August 18, 2018 6:46 PMI would try a clustered primary key on [LeadID] and a unique (if it is unique) index on Telephone with an INCLUDE of all the other columns.If all the searches are on an exact phone number, literally duplicating the data does not make too much sense. Therefore, I would suggest a non-clustered index on the Phone number column with the inclusion of the clustered index, searches will then find the matches and bring the full results back via key lookup. Because the phone numbers can be shared, it is futile to use a unique constraint on the index unless the business rules limit the entries to unique numbers, something that we do not know.
😎I did say if it was unique.
If this is the query that's most common, and performance is the issues, and the data doesn't have massive amounts of updates; an include all columns will allow a seek without a lookup which will be slightly faster than a lookup.An INCLUDE of all the columns necessary would be a totally unnecessary duplication of data, in this case. That's all indexes really are is a copy of the data sorted in a different order.
We also don't know what the op means by improving the performance. Seems to me that a "lead" table would require singleton lookups by phone number and that just about won't ever cause a performance problem if there's an index on the telephone number. Even a HEAP will do fine in this area. If they're doing something in batches, the OP needs to identify that before we can make recommendations on "performance".
The query sunilkmr284 gave as the one he wanted to improve the performance of is
select * from tbl_lead where Telephone=@PhoneNo
If it is just this query and he wants to make it as fast as possible then an index on PhoneNo with all the other columns included will give the fastest results, hence my comment.
Possibly the fastest - but is really the best option? without knowing the amount of updates/inserts there are on the table we don't even know if having the clustered index being the one based on the telephone isn't the better (which would give the same performance as a index with all columns - and just one tiny index to contain the leadid (which we weren't even told if it is used at all).
Having a index with all the columns is in most of the cases I've seen an overkill. And on this case with 4million rows on a table it does add a lot to the size.
We also don't know what other indexes, if any, exist on the table - the DDL posted is from a "create script" from ssms which by default will not script these, neither the primary key (which from what the OP said is not the leadid - may note have one though)
August 19, 2018 at 3:09 pm
frederico_fonseca - Sunday, August 19, 2018 2:38 PMJonathan AC Roberts - Sunday, August 19, 2018 1:59 PMJeff Moden - Sunday, August 19, 2018 1:36 PMJonathan AC Roberts - Sunday, August 19, 2018 4:58 AMEirikur Eiriksson - Sunday, August 19, 2018 3:57 AMJonathan AC Roberts - Saturday, August 18, 2018 6:46 PMI would try a clustered primary key on [LeadID] and a unique (if it is unique) index on Telephone with an INCLUDE of all the other columns.If all the searches are on an exact phone number, literally duplicating the data does not make too much sense. Therefore, I would suggest a non-clustered index on the Phone number column with the inclusion of the clustered index, searches will then find the matches and bring the full results back via key lookup. Because the phone numbers can be shared, it is futile to use a unique constraint on the index unless the business rules limit the entries to unique numbers, something that we do not know.
😎I did say if it was unique.
If this is the query that's most common, and performance is the issues, and the data doesn't have massive amounts of updates; an include all columns will allow a seek without a lookup which will be slightly faster than a lookup.An INCLUDE of all the columns necessary would be a totally unnecessary duplication of data, in this case. That's all indexes really are is a copy of the data sorted in a different order.
We also don't know what the op means by improving the performance. Seems to me that a "lead" table would require singleton lookups by phone number and that just about won't ever cause a performance problem if there's an index on the telephone number. Even a HEAP will do fine in this area. If they're doing something in batches, the OP needs to identify that before we can make recommendations on "performance".
The query sunilkmr284 gave as the one he wanted to improve the performance of is
select * from tbl_lead where Telephone=@PhoneNo
If it is just this query and he wants to make it as fast as possible then an index on PhoneNo with all the other columns included will give the fastest results, hence my comment.Possibly the fastest - but is really the best option? without knowing the amount of updates/inserts there are on the table we don't even know if having the clustered index being the one based on the telephone isn't the better (which would give the same performance as a index with all columns - and just one tiny index to contain the leadid (which we weren't even told if it is used at all).
Having a index with all the columns is in most of the cases I've seen an overkill. And on this case with 4million rows on a table it does add a lot to the size.
We also don't know what other indexes, if any, exist on the table - the DDL posted is from a "create script" from ssms which by default will not script these, neither the primary key (which from what the OP said is not the leadid - may note have one though)
In my initial answer I did say "If this is the query that's most common, and performance is the issues, and the data doesn't have massive amounts of updates; an include all columns will allow a seek without a lookup which will be slightly faster than a lookup.". So that covers the amount updates/inserts. A clustered index on phone number would get fragmented but on an identity column it would be less prone to fragmentation. And I did say "I would try this index", I didn't say this is definitely the best option.
August 20, 2018 at 3:00 am
Thank you all for suggestion me.
August 20, 2018 at 2:52 pm
I did notice that you have a field of the Telephone as varchar(20) and your lookup @PhoneNo is varchar(50). I would recommend that you have the same type and size so the optimizer will use the intended index correctly all the time. It sounds like this would not make a difference but I have had plenty of cases where that is all it took to speed up the query. Just make sure there is no margin for error as you have here.
Cody L.
August 20, 2018 at 3:20 pm
Jonathan AC Roberts - Sunday, August 19, 2018 3:09 PMfrederico_fonseca - Sunday, August 19, 2018 2:38 PMJonathan AC Roberts - Sunday, August 19, 2018 1:59 PMJeff Moden - Sunday, August 19, 2018 1:36 PMJonathan AC Roberts - Sunday, August 19, 2018 4:58 AMEirikur Eiriksson - Sunday, August 19, 2018 3:57 AMJonathan AC Roberts - Saturday, August 18, 2018 6:46 PMI would try a clustered primary key on [LeadID] and a unique (if it is unique) index on Telephone with an INCLUDE of all the other columns.If all the searches are on an exact phone number, literally duplicating the data does not make too much sense. Therefore, I would suggest a non-clustered index on the Phone number column with the inclusion of the clustered index, searches will then find the matches and bring the full results back via key lookup. Because the phone numbers can be shared, it is futile to use a unique constraint on the index unless the business rules limit the entries to unique numbers, something that we do not know.
😎I did say if it was unique.
If this is the query that's most common, and performance is the issues, and the data doesn't have massive amounts of updates; an include all columns will allow a seek without a lookup which will be slightly faster than a lookup.An INCLUDE of all the columns necessary would be a totally unnecessary duplication of data, in this case. That's all indexes really are is a copy of the data sorted in a different order.
We also don't know what the op means by improving the performance. Seems to me that a "lead" table would require singleton lookups by phone number and that just about won't ever cause a performance problem if there's an index on the telephone number. Even a HEAP will do fine in this area. If they're doing something in batches, the OP needs to identify that before we can make recommendations on "performance".
The query sunilkmr284 gave as the one he wanted to improve the performance of is
select * from tbl_lead where Telephone=@PhoneNo
If it is just this query and he wants to make it as fast as possible then an index on PhoneNo with all the other columns included will give the fastest results, hence my comment.Possibly the fastest - but is really the best option? without knowing the amount of updates/inserts there are on the table we don't even know if having the clustered index being the one based on the telephone isn't the better (which would give the same performance as a index with all columns - and just one tiny index to contain the leadid (which we weren't even told if it is used at all).
Having a index with all the columns is in most of the cases I've seen an overkill. And on this case with 4million rows on a table it does add a lot to the size.
We also don't know what other indexes, if any, exist on the table - the DDL posted is from a "create script" from ssms which by default will not script these, neither the primary key (which from what the OP said is not the leadid - may note have one though)
In my initial answer I did say "If this is the query that's most common, and performance is the issues, and the data doesn't have massive amounts of updates; an include all columns will allow a seek without a lookup which will be slightly faster than a lookup.". So that covers the amount updates/inserts. A clustered index on phone number would get fragmented but on an identity column it would be less prone to fragmentation. And I did say "I would try this index", I didn't say this is definitely the best option.
Okay, IF I was querying the table by the phone number most frequently and pulling back all the data each time then I would cluster the table on the phone number with the LeadId column as the second column of the index to make it unique.
August 20, 2018 at 4:27 pm
@sunilkmr284,
If you are joining to other tables frequently by LeadID then I would create a clustered PK on that. Clustering on the phone number may lead to fragmentation and affect performance negatively. As Jeff already stated, a non-clustered index should perform well for singleton searches. What is the cardinality? Run this query and let us know the results:
SELECT Count(*) TotalCount, Count(DISTINCT Telephone) PhoneCount
ROM dbo.tbl_Lead;
Viewing 15 posts - 1 through 15 (of 23 total)
You must be logged in to reply to this topic. Login to reply