March 8, 2007 at 7:07 pm
Hi SQL Geniuses,
I have recently been commissioned to build a web applicaton to replace an older system (that is currently dying, has no Pkey, no autoincrements on anything, etc).
Everything has been going very nicly, until I hit this smallish issue the main table in the application has approximatly 2.8 million rows (biggest I've delt with, I know it's actucally pretty small ). each of the rows is an item in storage, and contains 21 columns:
[A] [int]
IDENTITY(1,1) NOT NULL,
[int] NULL,
[C] [int] NOT NULL,
[D] [int] NOT NULL,
[E] [int] NULL,
[F] [int] NULL,
[G] [varchar](30) NULL,
[H] [bit] NOT NULL CONSTRAINT [DF_StorageItem_H] DEFAULT ((0)),
[varchar](30) NOT NULL CONSTRAINT [DF_StorageItem_I] DEFAULT (''),
[J] [varchar](30) NOT NULL CONSTRAINT [DF_StorageItem_J] DEFAULT (''),
[K] [smalldatetime] NOT NULL CONSTRAINT [DF_StorageItem_K] DEFAULT (getdate()),
[L] [int] NOT NULL CONSTRAINT [DF_StorageItem_L] DEFAULT ((0)),
[M] [int] NOT NULL CONSTRAINT [DF_StorageItem_M] DEFAULT ((0)),
[N] [varchar](500) NOT NULL CONSTRAINT [DF_StorageItem_N] DEFAULT (''),
[O] [varchar](250) NOT NULL CONSTRAINT [DF_StorageItem_O] DEFAULT (''),
[P] [varchar](30) NOT NULL CONSTRAINT [DF_StorageItem_P] DEFAULT (''),
[smalldatetime] NULL,
[R] [int] NOT NULL,
[CreatedDate] [smalldatetime] NOT NULL CONSTRAINT [DF_StorageItem_CreatedDate] DEFAULT (getdate()),
[LastModified] [smalldatetime] NOT NULL CONSTRAINT [DF_StorageItem_DateCreated] DEFAULT (getdate()),
[RowVersion] [timestamp] NOT NULL,
The problem occours when I want to sort / filter on the fields in this table, its taking up to 30 seconds to sort a single field. I'm running this on a Pentuim 4 3.0 with 1gig of ram (this is my dev enviroment) the production is 2x3.0 + 3 gig of ram. but ultimatly if it doesn't run well on my dev it wont last long in production
I think this is a perfect case for adding indexes but how should I set them up do I create one for every field the user is like to search on or do I create one that holds all the fields the user is likly to search on? Also for sorting do i need to create index that sort by DESC as well as ASC so that when sorting it works both ways?
I've renamed the columns just becuase our CEO is a tad paraniod about pretty much everything (happy to PM someone the fieldnames if it makes a huge diffrence)
PS. Also like to know if thiere is anything really bad about that table structure
Thanks For you Help!
Maxus
March 9, 2007 at 4:46 am
Maxus,
without knowing what you actually storing in this table it's hard to tell if you can optimize it or not. Anyhow at first site it looks ok, but the real question would be "how does your query look ?"
Are there any indexes at all on this table or not? How many records are in the table and how many you query for. Check the executon plan on what's going on and which steps take the most resources.
You might analyze your queries in the Index tuning Advisor. This should give you some suggestions on which columns to index.
Markus
[font="Verdana"]Markus Bohse[/font]
March 9, 2007 at 6:16 am
sadly there's no magic answer to this. single column indexes may help but often they are ignored. Basic index would include a primary key, a clustered index ( which may or may not be the PK ) and secondary indexes on columns used for joins ( foreign keys ) other than thta you need to capture the queries that hit the table and analyse that way.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
March 16, 2007 at 2:59 am
Hi Guys,
Thanks heaps for the answers, yep its definalty a odd one, okay so after further testing I worked what was slowing the query down.
I have the table I mentioned above
StorageItem Joins StorageItemType Joins StorageItemGroup
The query is:
SELECT TOP 25 * FROM StorageItem
INNER JOIN StorageItemType ON StorageItemType.Id = StorageItem.StorageItemTypeId
WHERE StorageItem.Status = X AND StorageItemType.StorageItemGroupID = Y
No matter what I did I couldn't get this query to be faster than (48sec) so I thought why not move the StorageItemGroupID field into the storageItem table.
So now its:
SELECT TOP 25 * FROM StorageItem
WHERE StorageItem.Status = X AND StorageItem.StorageItemGroupID = Y
This now returns things instantly becuase the index has both the status and the StorageItemGroupID, But the problem is it breaks 3rd normal form (which makes me die a little on the inside )
I'm baffled how to fix this one, without moving the data into the same table and having a fair amount of redundant information.
Any Ideas?
Thanks!
P.S. Reason it took me so long to reply is it didn't relise my thread had moved (thought it was just nuked)
March 19, 2007 at 12:52 pm
1) To address your original issue of the sort taking so long to respond, indexing won't help unless it is clustered, and that would then only help when sorting THAT specific column(s) and would possibly cause other performance issues. Lots of data will take lots of time to sort, and is completely dependent on Disk I/O capatilities, RAM and Number of CPUs (for parallel operations performance).
2) Do NOT be afraid to denormalize when appropriate for performance issues. One of the marks of a master performance tuner is the ability to recognize when/how to do this and when not.
3) Irregardless of 2), did you have appropriate indexing in the case when you were joining 3 tables?? By my estimate of the query's objective and writing, you should have indexes on StorageItem.StorageItemTypeId, StorageItem.Status, StorageItemType.StorageItemGroupID.
4) Try rewriting your join query thusly and see if it performs differently:
SELECT TOP 25 * FROM StorageItem
INNER JOIN StorageItemType ON StorageItemType.Id = StorageItem.StorageItemTypeId AND StorageItemType.StorageItemGroupID = Y
WHERE StorageItem.Status = X
5) As a minor nit-pick ALWAYS prefix your objects by their owner (v 2000-) or schema (2005+) and also use NOLOCK hints if the query is read-only and does not care about concurrency issues. Sorry, but I am pretty much fanatical about efficiency! :-))
TheSQLGuru
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
March 27, 2007 at 3:15 am
1) To address your original issue of the sort taking so long to respond, indexing won't help unless it is clustered, and that would then only help when sorting THAT specific column(s) and would possibly cause other performance issues. Lots of data will take lots of time to sort, and is completely dependent on Disk I/O capatilities, RAM and Number of CPUs (for parallel operations performance).
Yep Understood, I miss diagnosed the issue the sort was just being affected by the slowness of the query overall.
2) Do NOT be afraid to denormalize when appropriate for performance issues. One of the marks of a master performance tuner is the ability to recognize when/how to do this and when not.
As im still a bit of a newbie this makes sense and is what I thought, although I just need to hear someone else confirm it for me
3) Irregardless of 2), did you have appropriate indexing in the case when you were joining 3 tables?? By my estimate of the query's objective and writing, you should have indexes on StorageItem.StorageItemTypeId, StorageItem.Status, StorageItemType.StorageItemGroupID.
I had all those.
4) Try rewriting your join query thusly and see if it performs differently:
SELECT TOP 25 * FROM StorageItem
INNER JOIN StorageItemType ON StorageItemType.Id = StorageItem.StorageItemTypeId AND StorageItemType.StorageItemGroupID = Y
WHERE StorageItem.Status = X
This is an interesting suggestion but this query forms a view which the information is further filtered by so I can’t hard code the group Id
5) As a minor nit-pick ALWAYS prefix your objects by their owner (v 2000-) or schema (2005+) and also use NOLOCK hints if the query is read-only and does not care about concurrency issues. Sorry, but I am pretty much fanatical about efficiency! :-))
Yep for the sake of simplicity I removed the DBO.X.Y Stuff, but I always use the fully qualified name for my objects.
The issues always seem to occur when the join has to go to a third table. for instance:
A joins B - is fine
A Joins B Joins C - is slow as all hell.
So I'm still a bit baffled, I'm going to go back and triple check you suggestions other wise I'm not exactly sure how to resolve it
Thanks
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply