June 19, 2006 at 10:10 am
Hi
I have table of following structure. To improve the performance i am planning to add Index on the table. Most of the join use either TENANCYID,SUPPLIERID,EMPLOYEEID.But these column allows null. What is the best way to improve the performance, I was thinking to use
SALEID,TENANCYID,SUPPLIERID,EMPLOYEEID columns as primary key. Is this is a good idea. What’s expert’s advice???
SALEID (PK, CLUSTERED,IDENTITY)
SONAME
SODATE
SONOTES
USERID
TENANCYID
SUPPLIERID
EMPLOYEEID
SALESCUSTOMERID
ACTIVE
SOTYPE
SOSTATUS
COMPANY
SOTIMESTAMP
Thanks
Black
June 19, 2006 at 2:25 pm
Personally (since you said that the joins either use TENANCYID, SUPPLIERID, or EMPLOYEEID, I would add a non-unique index on each one, rather than creating a primary key on all four columns. After all, if I want to join another table to this one based only on EmployeeID, I will generate a table scan with one monolithic query (okay, a scan of the clustered index, which is the same thing). But with a nonclustered index based on EmployeeID, I can do a non-clustered index lookup, followed by a clustered index lookup, which is almost always faster.
CREATE INDEX ix_Employee
ON MySalesTable (EMPLOYEEID)
(Repeat for all three fields)
June 21, 2006 at 7:07 am
Yes, avoid the primary key on all 3 columns, especially a clustered PK. Not everyone is aware that whatever columns you use in a clustered index are ALSO added automatically into every non-clustered index. The wider your clustered index, therefore, the wider all your non-clustered indexes get, and the less efficient those indexes become.
I'd also tend to make your PK a NON-clustered index on SaleID. This allows you to use your clustered index on a column that is more likely to return multiple records (one of the clustered index's biggest strengths). I'm thinking of something like SODATE. Surely you run queries of the form:
SELECT .....
FROM Table T1
INNER JOIN OtherTable T2
ON T1.SUPPLIERID = T2.SUPPLIERID
WHERE T1.SODATE > @DateParam
or ...
SELECT SONAME, SODATE, SONOTES, SOSTATUS
FROM Table
WHERE SODATE >= '1-May-2006'
AND SODATE < '1-Jun-2006'
The WHERE clause here is crying out for a clustered index on the SODATE column. If you put a NON-clustered index on this column (because you've already used your only clustered index on the PK), and if more than a very small percentage of the total rows in the table would qualify for this date range, then the NC index will not be used as the number of bookmark lookups required to get the remaining record attributes will prove too inefficient compared to a full tablescan (clustered index scan in reality, but it amounts to the same thing).
Brendt's suggestions for NC indexes on the other columns look fine, but ultimately it all depends on how the different possible values for those columns are spread throughout the table. In the end you can only find the best indexes by testing queries using the graphical showplan feature in Query Analyzer to check whether the optimizer will actually use them or not. The index tuning wizard can be a good starting point, too, if you're unsure, but personally I find it generates some rather scary indexes and recommends you drop a lot, too, so I treat it with a good deal of caution and prefer to use my own experience.
June 21, 2006 at 9:15 am
Black
Are you confusing the terms "primary key" and "clustered index"? The primary key uniquely identifies each row in your table. If SALEID alone does this, then that should be your primary key.
When you create a primary key constraint, a clustered index will automatically be created on the column(s) in the key (if one doesn't already exist on other column(s)). If you don't want that (see advice above) then specify a non-clustered index when you create the PK constraint. You can only have one clustered index in each table, so choose carefully!
John
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply