Index and or constraint

Question

Index and or constraint

andis59@gmail.com

SSCommitted

Points: 1884
More actions
April 4, 2017 at 2:02 pm

#329104

I have this table
CREATE TABLE [dbo].[LaserData](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [MachineId} [nvarchar](50) NULL,
    [F1] [nvarchar](50) NOT NULL,
    [Kant] [char](1) NULL,
-- number of columns removed
)
Id is the primary key (needed for Entity Framework). The columns MachineId + F1 + Kant must be unique! The default sorting is on these three columns. Searching will usually be on F1 and all of them. also query for a specific MachineId
Should I create a index
CREATE UNIQUE NONCLUSTERED INDEX table ON table([MachineId] ASC, [F1] ASC, [Kant] ASC)
or make two or more indexes for the columns and create an constraint for the uniqueness of the three columns?
// Anders
PS! I'm using SQL Server 2014 (but I couldn't find any Newbie subforum under the 2014 forum

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply

Mr. Brian Gale One Orange Chip Points: 25007 More actions · Answer 1

I'd do a constraint on the 3 columns for sure since you are saying they need to be unique. But when you say they need to be unique, do you mean that you could have the following 2 values:
MachineId = 'bob'
F1 = 'bob'
Kant = 'b'
and
MachineId = 'bob'
F1 = 'bob'
Kant = 'c'
or can there be no duplicates in any of those columns? This would change how I would design this. My interpretation of what you said is that the above should be allowed since the 3 columns added together would give you unique value.

It might be easier to normalize this and use foreign key lookups. It would make the data lookup faster and will result in smaller tables presuming some of those values are allowed to replicate in the table. Data lookup could be faster based on how you defined your indexes.

As for indexes, it depends. If the table has a lot of insert/update/delete statements run against it, multiple indexes may not be the way to go. If it is mostly selects being run against it, having more focused indexes will result in faster queries.
Also, if there are a lot of inserts happening, your identity column may not be set up quite right. You can double the number of ints you can use by setting it to start at negative 2,147,483,647 (-2,147,483,647) instead of starting at 1. This will double the number of inserts you can do before you run out of ints. The other option is to use bigint instead of int.
But I think normalizing it will give you the best performance increase and you can index the other tables for fast lookups.

The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Chris Harshman SSC-Forever Points: 42192 More actions · Answer 2

If you were to make 2 indexes for the columns, what would the other one be? Creating a unique constraint does create an index so that isn't any different. One index should be fine, having the first column being the one that's queried the most. When you say "Searching will usually be on F1 and all of them" does that mean each search will include all 3 columns?

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 3

andis59 - Tuesday, April 4, 2017 2:02 PM
I have this table
CREATE TABLE [dbo].[LaserData](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [MachineId} [nvarchar](50) NULL,
    [F1] [nvarchar](50) NOT NULL,
    [Kant] [char](1) NULL,
-- number of columns removed
)
Id is the primary key (needed for Entity Framework). The columns MachineId + F1 + Kant must be unique! The default sorting is on these three columns. Searching will usually be on F1 and all of them. also query for a specific MachineId
Should I create a index
CREATE UNIQUE NONCLUSTERED INDEX table ON table([MachineId] ASC, [F1] ASC, [Kant] ASC)
or make two or more indexes for the columns and create an constraint for the uniqueness of the three columns?
// Anders
PS! I'm using SQL Server 2014 (but I couldn't find any Newbie subforum under the 2014 forum

Unique constraints are enforced by an index.

What kind of values will be contained in the F1 column and what is the average row length of the table?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

andis59@gmail.com SSCommitted Points: 1884 More actions · Answer 4

Here is some more information.
The data is used for creating a marking of a product. There already is a database and we now need to add an identification of the marking machine, since the same article will have different layouts. Due to the fact that this database was originally an Excel document we have column names like F1, F2, .... The table have 26 columns that all can be used in the layout.

The columns for identification are
MachineId The Id of the machine
F1 the Article number
Kant The side of the product that the marking should be on

The MachineId will be used as a 'grouping' where each group may have the same F1 + Kant.
MachineId + F1 + Kant must be unique

This is the current table (from MS SQL Server Management Studio) to which I need to add MachineId.

CREATE TABLE [dbo].[LaserData]( [Id] [int] IDENTITY(1,1) NOT NULL,, [F1] [nvarchar](50) NOT NULL, [Kant] [char](1) NULL, [Avdelning] [nchar](50) NULL, [F2] [nvarchar](50) NULL, [F3] [nvarchar](50) NULL, [F4] [nvarchar](50) NULL, [F5] [nvarchar](50) NULL, [F6] [nvarchar](50) NULL, [F7] [nvarchar](50) NULL, [F8] [nvarchar](50) NULL, [F9] [nvarchar](50) NULL, [F10] [nvarchar](50) NULL, [BC1] [nvarchar](50) NULL, [BC2] [nvarchar](50) NULL, [Template] [nvarchar](50) NULL, [P1] [nvarchar](50) NULL, [P2] [nvarchar](50) NULL, [P3] [nvarchar](50) NULL, [P4] [nvarchar](50) NULL, [P5] [nvarchar](50) NULL, [P6] [nvarchar](50) NULL, [FixtureId] [nvarchar](50) NULL, [ExternTest] [bit] NULL, [EnableTO] [bit] NULL, [TOnr] [nvarchar](50) NULL, CONSTRAINT [PK_LaserData] PRIMARY KEY CLUSTERED ( [Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY], CONSTRAINT [IX_laserdata_F1_Kant] UNIQUE NONCLUSTERED ( [F1] ASC, [Kant] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]

andis59@gmail.com SSCommitted Points: 1884 More actions · Answer 5

bmg002 - Tuesday, April 4, 2017 3:05 PM
I'd do a constraint on the 3 columns for sure since you are saying they need to be unique. But when you say they need to be unique, do you mean that you could have the following 2 values:
MachineId = 'bob'
F1 = 'bob'
Kant = 'b'
and
MachineId = 'bob'
F1 = 'bob'
Kant = 'c'
or can there be no duplicates in any of those columns? This would change how I would design this. My interpretation of what you said is that the above should be allowed since the 3 columns added together would give you unique value.
It might be easier to normalize this and use foreign key lookups. It would make the data lookup faster and will result in smaller tables presuming some of those values are allowed to replicate in the table. Data lookup could be faster based on how you defined your indexes.
As for indexes, it depends. If the table has a lot of insert/update/delete statements run against it, multiple indexes may not be the way to go. If it is mostly selects being run against it, having more focused indexes will result in faster queries.
Also, if there are a lot of inserts happening, your identity column may not be set up quite right. You can double the number of ints you can use by setting it to start at negative 2,147,483,647 (-2,147,483,647) instead of starting at 1. This will double the number of inserts you can do before you run out of ints. The other option is to use bigint instead of int.
But I think normalizing it will give you the best performance increase and you can index the other tables for fast lookups.

I have added some info which hopefully will answer the first question you have, but the example you give would be correct. There may also be
MachineId = 'ann'
F1 = 'bob'
Kant = 'b'
and
MachineId = 'ann'
F1 = 'bob'
Kant = 'c'

I.e. The same F1+kant may be used with another MachineId.

The table is mainly used for querying. The only time it's used for inserting/editing is when there is a new article or some of the marking data changes. When querying all three columns will be used in the where clause.
The table contains about 1500 rows at the moment an we estimate that we might reach 5000 in a couple of years. So int will probably be sufficiant 😉

Being a beginner at databases and SQL; How would I normalize this table?

Thank you for your answer!

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 6

andis59 - Wednesday, April 5, 2017 12:35 AM
Here is some more information.
The data is used for creating a marking of a product. There already is a database and we now need to add an identification of the marking machine, since the same article will have different layouts. Due to the fact that this database was originally an Excel document we have column names like F1, F2, .... The table have 26 columns that all can be used in the layout.
The columns for identification are
MachineId The Id of the machine
F1 the Article number
Kant The side of the product that the marking should be on
The MachineId will be used as a 'grouping' where each group may have the same F1 + Kant.
MachineId + F1 + Kant must be unique
This is the current table (from MS SQL Server Management Studio) to which I need to add MachineId.
CREATE TABLE [dbo].[LaserData]( [Id] [int] IDENTITY(1,1) NOT NULL,, [F1] [nvarchar](50) NOT NULL, [Kant] [char](1) NULL, [Avdelning] [nchar](50) NULL, [F2] [nvarchar](50) NULL, [F3] [nvarchar](50) NULL, [F4] [nvarchar](50) NULL, [F5] [nvarchar](50) NULL, [F6] [nvarchar](50) NULL, [F7] [nvarchar](50) NULL, [F8] [nvarchar](50) NULL, [F9] [nvarchar](50) NULL, [F10] [nvarchar](50) NULL, [BC1] [nvarchar](50) NULL, [BC2] [nvarchar](50) NULL, [Template] [nvarchar](50) NULL, [P1] [nvarchar](50) NULL, [P2] [nvarchar](50) NULL, [P3] [nvarchar](50) NULL, [P4] [nvarchar](50) NULL, [P5] [nvarchar](50) NULL, [P6] [nvarchar](50) NULL, [FixtureId] [nvarchar](50) NULL, [ExternTest] [bit] NULL, [EnableTO] [bit] NULL, [TOnr] [nvarchar](50) NULL, CONSTRAINT [PK_LaserData] PRIMARY KEY CLUSTERED ( [Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY], CONSTRAINT [IX_laserdata_F1_Kant] UNIQUE NONCLUSTERED ( [F1] ASC, [Kant] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]

I'm not seeing a "MachineID" in that table. And... wow... just WOW! Apparently the original designer of the table had no concept in the usefulness of meaningful names regardless of what the source was. To be honest, that's the first thing I'd fix so that a Developer doesn't need to rely on tribal knowledge or external documentation to know what's in a given column.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

andis59@gmail.com SSCommitted Points: 1884 More actions · Answer 7

Jeff Moden - Wednesday, April 5, 2017 7:50 AM
andis59 - Wednesday, April 5, 2017 12:35 AM
This is the current table (from MS SQL Server Management Studio) to which I need to add MachineId.
I'm not seeing a "MachineID" in that table. And... wow... just WOW! Apparently the original designer of the table had no concept in the usefulness of meaningful names regardless of what the source was. To be honest, that's the first thing I'd fix so that a Developer doesn't need to rely on tribal knowledge or external documentation to know what's in a given column.

No, as I said I need to add this column.
The first thought that went through my mind when I saw the tables, was to change the column names. But the names are used in many, many layouts (that define how and what should be marked), so it wasn't realistic to make this change. Also the columns are used for product specific data so the same column can have different meaning for different products...

Jack Corbett SSC Guru Points: 184397 More actions · Answer 8

My rule of thumb for creating a constraint vs an index is the purpose of the object. If the main purpose is to enforce a business key (uniqueness) I create a constraint, if it is mainly for querying then it I create an index. Since both do create an index, it really is semantics.

As has already been said, if you can, now is the time to fix the naming and make the column names more descriptive.

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

Mr. Brian Gale One Orange Chip Points: 25007 More actions · Answer 9

andis59 - Wednesday, April 5, 2017 12:47 AM
bmg002 - Tuesday, April 4, 2017 3:05 PM
I'd do a constraint on the 3 columns for sure since you are saying they need to be unique. But when you say they need to be unique, do you mean that you could have the following 2 values:
MachineId = 'bob'
F1 = 'bob'
Kant = 'b'
and
MachineId = 'bob'
F1 = 'bob'
Kant = 'c'
or can there be no duplicates in any of those columns? This would change how I would design this. My interpretation of what you said is that the above should be allowed since the 3 columns added together would give you unique value.
It might be easier to normalize this and use foreign key lookups. It would make the data lookup faster and will result in smaller tables presuming some of those values are allowed to replicate in the table. Data lookup could be faster based on how you defined your indexes.
As for indexes, it depends. If the table has a lot of insert/update/delete statements run against it, multiple indexes may not be the way to go. If it is mostly selects being run against it, having more focused indexes will result in faster queries.
Also, if there are a lot of inserts happening, your identity column may not be set up quite right. You can double the number of ints you can use by setting it to start at negative 2,147,483,647 (-2,147,483,647) instead of starting at 1. This will double the number of inserts you can do before you run out of ints. The other option is to use bigint instead of int.
But I think normalizing it will give you the best performance increase and you can index the other tables for fast lookups.
I have added some info which hopefully will answer the first question you have, but the example you give would be correct. There may also be
MachineId = 'ann'
F1 = 'bob'
Kant = 'b'
and
MachineId = 'ann'
F1 = 'bob'
Kant = 'c'
I.e. The same F1+kant may be used with another MachineId.
The table is mainly used for querying. The only time it's used for inserting/editing is when there is a new article or some of the marking data changes. When querying all three columns will be used in the where clause.
The table contains about 1500 rows at the moment an we estimate that we might reach 5000 in a couple of years. So int will probably be sufficiant 😉
Being a beginner at databases and SQL; How would I normalize this table?
Thank you for your answer!

I agree with Jeff Moden; do those column names mean anything to you and/or your software? If you have done any software development, you likely know the pitfalls of calling a variable "A". Might work fine, but support of the system gets more difficult the longer it is that nobody looks at it.

As for normalizing the table, if it was me, I'd pull the columns that would contain multiple reused values and store them in their own table. Things like machineid could be put into their own "MachineID" table. But with the small expected dataset, I'm not sure I'd spend a lot of time normalizing the database.
There are multiple advantages to normalizing including reduced database size and reusability. You get reduced size due to de-duplication of values as you are now only using ints (or smallints depending on the number of MachineId's you expect) in this table and only 1 MachineId plus an int (or smallint) in the MachineID table and then you can just join the tables. You get reusability as now any table that requires MachineId can pull it from the MachineID table and if you need to add a new MachineID, you just have 1 table to update.

As for fixing the column names, there are solutions if your software relies on the names and it is painful to change all of the software - rename the table and create a view that pulls the data from the table with the column names that the software expects. Then rename the columns on the table. It is a bit more work, but it will make maintaining that table a lot easier if you can give the columns meaningful names. Although, if the columns can have multiple meanings based on product, then renaming the columns doesn't serve a good purpose. In that case, I think re-writing the software to write to different tables for different products so you can give them meaningful names is likely a better solution. But that is also a long term project. I have a similar project to do where the columns were called "userDefined1" through "userDefined10" and depending on the product determines which of those fields are used and what they are used for. Sometimes UserDefined2 is a serial number, sometimes it is a customer number, sometimes it is a test result. I don't control the software unfortunately and I just inherited the database so there isn't a lot that I can do. The business doesn't see the benefit in changing the software when things currently work.

The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 10

andis59 - Wednesday, April 5, 2017 8:15 AM
Jeff Moden - Wednesday, April 5, 2017 7:50 AM
andis59 - Wednesday, April 5, 2017 12:35 AM
This is the current table (from MS SQL Server Management Studio) to which I need to add MachineId.
I'm not seeing a "MachineID" in that table. And... wow... just WOW! Apparently the original designer of the table had no concept in the usefulness of meaningful names regardless of what the source was. To be honest, that's the first thing I'd fix so that a Developer doesn't need to rely on tribal knowledge or external documentation to know what's in a given column.
No, as I said I need to add this column.
The first thought that went through my mind when I saw the tables, was to change the column names. But the names are used in many, many layouts (that define how and what should be marked), so it wasn't realistic to make this change. Also the columns are used for product specific data so the same column can have different meaning for different products...

So, name the table something else and fix it. Create a view with the original table names and columns and the existing apps will never know the difference. Just make sure to follow the rules for creating an updateable view.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

andis59@gmail.com SSCommitted Points: 1884 More actions · Answer 11

Jeff Moden - Wednesday, April 5, 2017 12:09 PM
So, name the table something else and fix it. Create a view with the original table names and columns and the existing apps will never know the difference. Just make sure to follow the rules for creating an updateable view.

The problem is that most of the columns can be used for anything, it's product specific. The only column name that I know could be changed is F1, which is Article Number.

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 12

andis59 - Wednesday, April 5, 2017 12:44 PM
Jeff Moden - Wednesday, April 5, 2017 12:09 PM
So, name the table something else and fix it. Create a view with the original table names and columns and the existing apps will never know the difference. Just make sure to follow the rules for creating an updateable view.
The problem is that most of the columns can be used for anything, it's product specific. The only column name that I know could be changed is F1, which is Article Number.

Heh... <headdesk><headdesk><headdesk><facepalm> 😉 Add an "Oh my" to my previous "Wow!" 😉 They sure did make things un-normalizable (to coin a phrase) for you.

I'm thinking that your plan of using MachineID+F1+Kant for the PK and then indexing the other two columns as however you see fit is about the only thing that you can do to this table. Other than those 3 columns, there's probably no chance at DRI in the form of FKs, either.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

GaryV Hall of Fame Points: 3307 More actions · Answer 13

Jeff Moden - Wednesday, April 5, 2017 4:54 PM
I'm thinking that your plan of using MachineID+F1+Kant for the PK //snip//

He can't - 2 of those columns are nullable.
Which also makes a unique constraint harder to implement, but not impossible.

Mr. Brian Gale One Orange Chip Points: 25007 More actions · Answer 14

It might not be a bad idea to turn this project of "add a column" into a full re-write of all of the software to utilize a better database design? Keep the old table for historical purposes (if needed) and migrate to a completely new setup.

It will be more work short term, but the long term benefits will be worth it in the end.

The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.