Clustered Index Width, Choice of Columns

Question

Clustered Index Width, Choice of Columns

DB_Newbie2007

SSCarpal Tunnel

Points: 4449
More actions
April 25, 2011 at 12:09 pm

#236227

I was trying to do some research and could not find any documentation (easily), so I though I would post the question here.
I have a table as follows that the developers just provided to me this morning:
CREATE TABLE [dbo].[TDCScans](
[InvoiceNo] [varchar](24) NOT NULL,
[Provider] [char](1) NOT NULL,
[ScanDateTime] [smalldatetime] NOT NULL, ---- Actual time "Activity" occurred for the InvoiceNo
[Location] [varchar](50) NULL,
[Activity] [varchar](250) NULL,
[ScanNote] [varchar](500) NULL,
[TrackedTime] [smalldatetime] NULL, ---- Time Data was captured
[Archived] [smalldatetime] NOT NULL, ---- Time Data was sent to Database
[SessionId] [varchar](100) NULL, ---- Some applications provide this, some do not, but this value is static when provided
[DataType] [varchar](3) NULL,
CONSTRAINT [IX_TDCScans] UNIQUE CLUSTERED
(
[InvoiceNo] ASC,
[Activity] ASC,
[Provider] ASC,
[Location] ASC,
[ScanDateTime] ASC,
[ScanNote] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = ON, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
I was writing a note back regarding the use of so many columns to create a "Clustered" index. My suggestion was to create a "UNIQUE NONCLUSTERED" index for the primary key, then a separate index as follows (trying to create an index that is narrow, increasing, and static, but fails the "unique" requirement):
CREATE CLUSTERED INDEX [CI_Scan_InvNo] ON [dbo].[TDCScans]
(
[ScanDateTime] ASC,
[InvoiceNo]
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY];
However, since there can be multiple "InvoiceNo" entries (1 to 5), I suspect SQL Server will be adding a hidden 4 byte integer column, as mentioned in Gail Shaw's article (http://www.sqlservercentral.com/articles/Indexing/68563/).
The question: Is it "better" for us to create a "RowNumber" column (ever increasing INT) versus having SQL server add the hidden column to make the field(s) unique? (Better = Faster DML (Select/Insert/Update/Delete) or smaller clustered index size)
Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

DB_Newbie2007 (4/25/2011)
However, since there can be multiple "InvoiceNo" entries (1 to 5), I suspect SQL Server will be adding a hidden 4 byte integer column, as mentioned in Gail Shaw's article (http://www.sqlservercentral.com/articles/Indexing/68563/).
The question: Is it "better" for us to create a "RowNumber" column (ever increasing INT) versus having SQL server add the hidden column to make the field(s) unique? (Better = Faster DML (Select/Insert/Update/Delete) or smaller clustered index size)

Errr, it depends? 🙂 :hehe:

Seriously, yes it will be narrower (a little). SQL will still have to check to see if it needs a uniquifier, unless you make it a unique clustered index, in which case it will have to check for uniqueness to ensure the constraint isn't violated.

I certainly agree that in this case the PK is not a good place for the cluster. The scan date with Invoice number may be acceptable, especially if there are queries that will be able to use that clustered index (as they would not an index on RowNo).

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Johan Bijnens SSC Guru Points: 135381 More actions · Answer 2

Now you have it: Gail answering an question referencing her article.:hehe:

And she did answer it very good (as is her article).

Nothing to add.

My guideline for clix:

- pick one

- keep it narrow

- make it unique if it is. (if you know it is, define it unique)

tell the system what you know !

- Try to make it useful for your processes. (cfr Gails refer to the datetime column)

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

DB_Newbie2007 SSCarpal Tunnel Points: 4449 More actions · Answer 3

Thanks for the updates/inputs! 🙂

Everything we do (queries) is very "date-centric" so Date is almost always a part of the clustered indexes on our tables, FYI.... date of a transaction or date of an archive of data. Date usually also helps in the "uniqueness" of the data (not so in this case, but most of the time) for us. 🙂

Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)