October 1, 2008 at 5:27 pm
I have a Live Data table that contains Session information for our website. It gets inserted into an average of 50,000 times a day. It can have a maximum of 10 days worth of data and then the previous week's data is archived.
There are 3 non-clustered indexes on this table. One on the Date Entered column, another on a integer ID column, and the last on both the Date Entered and integer ID column.
Reporting was originally setup to report from this table (through a view). However, this is causing a major issue with Blocked Processes. The reporting queries appear to block the websites from entering in the table. The issue seems to be centered around the index on the Date column.
I understand that I can replicate this table to another location and have reporting access that table; however, that was met with much resistance because they are used to immediate reporting. I want to know if anyone if its possible to get this to work without the excessive blocking that occurs or has the table just gotten to busy to have any indexes on it? Thanks for your suggestions.
October 1, 2008 at 9:37 pm
First of all, can't the reporting queries include the nolock option or use the read uncommitted isolation level?
Indexing properly for the reports could be one solution. They could also be pulling back too much data and running too costly of a query for that environment. There needs to be a decision as to what has ultimate priority, the sessions being inserted from your site visitors or the reports pulling the data. My guess it that they will say both. At which point you can look at ways to properly index for reporting with the understanding that with each index added you are going to be adding work to the inserts as the indexes will need to be maintained.
Really isn't a great scenario. Moving the data off using transactional replication and then having them report off that might be another solution.
David
@SQLTentmaker“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
October 2, 2008 at 5:40 am
I agree with the last post, it doesn't sound like an optimal approach. Transactional replication is usually only, at most, a few seconds behind the replicated database. Is this a stock exchange app where milliseconds count?
You could post some of the execution plans for the selects and the inserts so that we can get a better idea where the bottlenecks are. I notice you didn't say anything about a clustered index on the table. That would mean this is a heap table? Since you're indexing a heap table, you do know that SQL Server is maintaining a hidden column to identify the records for the pointers within the index. This can be much more expensive than maintaining a clustered index. You might want to identify a good candidate for the cluster.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 3, 2008 at 5:27 pm
The Gurus have already given their advices and i totally agree with them.
But just a thought which comes to my mind, if you are reporting off your production database throughout the day, then that means those reporting queries are going to eat up some of your IO resources which you cannot control beyond an extent.
One thing which you can control is how many processors do these queries engage on execution. See if making these reporting queries run with MAXDOP ( 1 ) works for you i.e. they are allowed to use only one processor.
Just a thought 🙂
October 5, 2008 at 7:18 pm
You cannot have reporting running real time off your session DB in production.
I think one week is way to long before archiving that data. We archive ours every five minutes. This will make everything run faster except the reports. The reports has to merge the active records while also accessing the archive DB and merging. Change the report to have some logic when to hit archive vs live and trim the fat faster. Make your archive process a soft delete every five minutes during the day then hard delete to archive a night. Indexes will need to change.
If this is a money making site for your company, the decision makers saying that real time access to reporting on session data(???) outweighs users accessing the site (as they are blocked from entering their session data) needs to be removed from command.
October 6, 2008 at 7:05 am
Eric Inman (10/5/2008)
If this is a money making site for your company, the decision makers saying that real time access to reporting on session data(???) outweighs users accessing the site (as they are blocked from entering their session data) needs to be removed from command.
Give them the chance to pony up the dollars to make it a viable requirement first though! 😉
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
October 8, 2008 at 6:59 am
chad (10/1/2008)
There are 3 non-clustered indexes on this table. One on the Date Entered column, another on a integer ID column, and the last on both the Date Entered and integer ID column.
Look at the usage of these three indexes (dm_db_index_usage_stats). Because the columns in the first index and the second index are also covered in the third index, changes are that not all indexes are used. If it's not used, drop it.
October 10, 2008 at 11:14 pm
Ditto. Try using a clustered index with the columns listed covering the index and compare the execution plans before making a decision. Make sure that you specify a reasonable value for the fill factor to prevent page splitting (Rule of thumb: Fill factor 70 to 80 for write intensive, 80 to 90 for read intensive tables, neither cast in stone) and turn on PAD_INDEX.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply