July 18, 2011 at 2:56 pm
Want to get experienced people's feedback to see if I'm understanding the basics of indexes:
Creating a table to complement our ERP database.
Basically a Job is broken down into Forms, so JobNum is not unique and FormNum is not unique, but a JobNum/FormNum combination is unique.
JobNum - int
FormNum - byte
RlsState - NVCHAR(1)
RecID - Identity
Unique database identifier is a JobNum / FormNum - there can be only 1.
Most queries will select WHERE JobNum = {value} and FormNum = {value}
I'm guessing the PKis the RecID (1 field).
A 2nd index is a JobNum, FormNum combination.
Am I thinking correctly?
[font="Verdana"]"The Road To Success Is Always Under Construction."[/font]
July 18, 2011 at 3:08 pm
Looks about right. If JobNum/FormNum combination is unique then that should be enforced with a unique index.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 18, 2011 at 3:11 pm
p.s. RlsState - NVARCHAR(1)???
That's bad practice in a number of ways. It takes a minimum of 4 bytes to store 1 character. For something so small it should be CHAR(1) if just ascii characters or at worse NCHAR(1) if it also needs unicode.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 18, 2011 at 8:51 pm
NVARCHAR(1) !!
I love it. I have to change my signature line now.
Seriously, I do want to ask a question, since you said that JobNum/FormNum is unique. What percent of your queries involve looking for an individual form for a single job, as opposed to getting all forms for one or more jobs?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
July 19, 2011 at 2:54 am
The question for me is, what are you planning on clustering on? Based on your post, I'd suggest the combined index, not the PK, because that sounds like it's going to be the most frequently accessed path to the data.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
July 19, 2011 at 3:58 am
BTW - the Front End is Access, rarely use Pass Thru queries.
Gail ... Thanks for the information about NVARCHAR. Did NOT know that. Googled and found a Tech Republic webpage that provides information about using fixed vs variable and ANSI vs Unicode. 10+ Common Questions About SQL Server Data Types.
Dixie & Grant ... I reviewed the SELECTs between JobNum to JobNum/FormNum ... What I stated was incorrect. In this particular form the ratio is 2 to 1 Job vs Job/Form. Speculating on future development it could go either way, so, I would say 50/50.
Here's where I see I'm not clear about creating indexes. My understanding is to strive for a PK that is based on a single field, not multiple; and, any indexes that require multiple fields are created. I'm so confused! :hehe:
[font="Verdana"]"The Road To Success Is Always Under Construction."[/font]
July 19, 2011 at 4:50 am
Don't confuse primary key and clustered index. I wasn't questioning your PK (although I could, :w00t: ). I like to think about how the data is going to most often be retrieved from a table and consider that for the cluster. If you don't, what happens is, you index for that access path, which is good, but then you have to do a key lookup operation to get the data that is not part of the non-clustered index, and this is something you have to do for the majority of your database calls. So, I look at access paths and tend to let that drive me on my decisions around the clustered index because you only get one clustered index and that's where the data is stored.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
July 19, 2011 at 8:26 am
And the plot thickens... :laugh: :crying:
Well, that is exactly what I did!. My notes from reading about indexes are not correct. I thought (wrote in my notes) the PK was the clustered index. I reread the primary article - Gail Shaw's Introduction To Indexes[/url]. Can see where I mis-interpreted information. So that we are clear: My PK and clustered index understanding did NOT come from Gail's article.
So, now I have lots of questions...
What is Primary Key vs clustered index?
I think there is a difference of opinion between Gail and your replies.
Help.
P.S. Thanks for taking the time to help me.
[font="Verdana"]"The Road To Success Is Always Under Construction."[/font]
July 19, 2011 at 8:37 am
The reason for the question was that if you are usually looking for all the forms belonging to a particular job, it might be a better idea to have the clustered index on jobnum and formnum. This would make the data for each form fall together on the same page, or consecutive pages, which would result in reduced I/O. This isn't always the best practice, because the key to the clustered index is always included in any nonclustered indexes for a table, so the bigger the clustered index key, the bigger all nonclustered indexes grow. But in your case, the formnum column adds only a single character, making the idea worth consideration.
However, if most of your queries are going to be searching for a single form this probably is not the way to go.
By the way, Grant and Gail, I'm signed up for your pre-con session at PASS and looking forward to it.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
July 19, 2011 at 8:42 am
Nope. Gail said that the two values combined would make a unique index, and they will. They could also be a clustered index. They could also be a primary key (as a unique index they're already an alternate key). But you can combine them to make a clustered unique index or a clustered primary key.
Here's an MSDN article to get you started on Primary Keys. Basically, a primary key is a unique constraint on a table that acts as an identifier for the rows on that table. It can be one or more columns of just about any of the primitive data types. It's a fundamental building block of relational data storage along with foreign key constraints which relate one table to another, usually through that primary key.
A clustered index is a storage mechanism that stores the data from the table in the physical order (mostly) of the key value(s) for the index. they are only related because by default in SQL Server, unless you tell it otherwise, the primary key is clustered automatically. Otherwise, they're not really related to one another directly. Since data is stored on the clustered index, I find it best to use the most common access path to the data as the clustered key. Frequently, probably even most of the time, that's the primary key. But other times it's other columns, like an alternate key (as you're looking at) or foreign keys or just other columns, that make a better cluster. It just depends on the storage mechanisms and the data types and the types of queries most frequently used to access the data.
In short, it's a huge topic, but one you should start learning because you only get a single clustered index on any given table, so getting that right is very important.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
July 19, 2011 at 8:44 am
EdA ROC (7/19/2011)
And the plot thickens... :laugh: :crying:Well, that is exactly what I did!. My notes from reading about indexes are not correct. I thought (wrote in my notes) the PK was the clustered index. I reread the primary article - Gail Shaw's Introduction To Indexes[/url]. Can see where I mis-interpreted information. So that we are clear: My PK and clustered index understanding did NOT come from Gail's article.
So, now I have lots of questions...
What is Primary Key vs clustered index?
I think there is a difference of opinion between Gail and your replies.
Help.
P.S. Thanks for taking the time to help me.
The clustered index dictates the sequence of the table data. The clustered index *is* the table, which is why there can only be one clustered index, but many non-clustered indexes. This is also the reason the clustered index key is automatically included in any non-clustered index: because each row in the non-clustered index has to be able to point to its corresponding row in the clustered index.
The primary key establishes a unique identifier for each row in the table. It is possible to have the clustered index based on columns other than the primary key, in which case uniqueness will be enforced by a nonclustered index over the primary key.
Yes, there is a difference of opinion between their replies and they are both very knowledgeable about SQL indexing. Trust me, you can learn a great deal by taking the time to understand why their opinions differ. Good luck to you. 🙂
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
July 19, 2011 at 8:53 am
EdA ROC (7/19/2011)
I thought (wrote in my notes) the PK was the clustered index.
By default it is, but that's by default only and not by requirement
I think there is a difference of opinion between Gail and your replies.
Between Grant and I. Yes, we've got different angles from which we approach selecting a clustered index. There's no absolute, single, one rule here, just guidelines.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 3, 2011 at 2:26 pm
Thanks all for your advice and help so far. I think I'm getting close to getting my head around indexes - as in, my mind "sees" the solution and it makes sense, it's "obvious". That said, another project got bumped up and it's probably a good teaching case study.
Just when I think I'm getting it ... along comes something that trips me up.
I believe I have correctly determined the indexes for two of the tables (below) - they are PK and will be clustered.
The question is: What Clustered index do I create for the table UserMenus?
Access FE / SQL BE.
Project: Create an Access Main Menu for all the reports that have been created (each report, or report set, has been created in a separate Access database. We want to put them all into one "app" for the user.
Tables:
Users
- NetworkID (vchar, 16) - User's Network Login ID (easily retrieved in Access)
- ERPID (vchar,16) - User's login ID in the ERP System. This provides UserName and other data from the ERP table containing User information.
Index: PK_NetworkID (I've read a lot about naming conventions - PK prefix/suffix, include table name and not - your thoughts?)
NetworkID rarely changes (there are more blue moons than changes)
Note: The reason I am grabbing the ERPID is (1) I can also retrieve the user name and other information from the table in the ERP system about the user, and (2) a future enhancement will allow another person to "login" (using the ERPID and password) at another person's PC ...Scenario: I am helping you get some information and there is a report I have access to that you don't, I will be able to "Change User", see my menu, and select from it. [The Main Menu app will switch back to the initial user's menu after 1 selection, therefore, I can walk away, forgetting to "logout", and that person can't access my menu.]
The Tables:
MenuItems - All the Menu Items which can display on the Users' Menus. Some entries are Access Forms, others are Grouping headers, i.e. Inventory -or- Customer Service -or- Accounting AP, etc.
- MenuItemID (int [Identity])
- Item - What is displayed on the User's menu
- Seq - Sorting sequence so Menu Items are in a logical order
- Form - Access form name that will be opened when selected
Index: PK_MenuItemID
* Estimate: 75 entries, as of today's count.
UserMenus - Each user will have their own menu items available.
- AmtechID - See Users
- MenuItemID - See MenuItems
Index: ??? - AmtechID only? (Based on query)
This table will have an ebb and flow of records - New users, users changing responsibilities - adding and removing their menu items.
* MenuItems Per User = Ranges from 1 to 28, Average about 8/person.
Therefore, the tables are not large and index performance insignificant, but, the principles are what's important.
There is really only one SQL Statement:
SELECT Item, Form
FROM UserMenus
LEFT JOIN MenuItems on MenuItems.MenuItemID = UserMenus.MenuItemID
WHERE AmtechID = {value at runtime}
ORDER By Seq
Any other queries will be executed by me to manage the Main Menu. (Not often executed)
Final Note: I often see tables with a column RowID or RecID and it's an Identity field. In our ERP database every table has a RowID. I can see its value as an absolute unique and present identifier and I've used it sometimes when troubleshooting query results. What are your thoughts about including this column?
[font="Verdana"]"The Road To Success Is Always Under Construction."[/font]
August 4, 2011 at 12:58 pm
EdA ROC (7/19/2011)
Googled and found a Tech Republic webpage that provides information about using fixed vs variable and ANSI vs Unicode. 10+ Common Questions About SQL Server Data Types.
BTW, that article has an error. As of SQL 2005 there are VARCHAR(MAX), NVARCHAR(MAX) and XML() data types that can hold substantially more text than the pre-2005 limit of 8000. As the article was written in 2008 and mentions new data types for SQL 2008 (at the bottom), this was an omission and wasn't merely out of date.
Rich
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply