February 1, 2007 at 11:52 am
When you have a SQL Server database with a fill factor of 50%, it's my understanding that each page will be left with 50% free space to allow for inserts into the clustered index.
To explain: if you have a clustered index on the column "Last Name", inserted data will physically be placed in the data page where it logically fits. With a fill factor of 100% there is no free space, so if an entry needs to be physically inserted on data page 50 of 100, none of pages 50-100 has any free space, and every one of those pages has to be modified to accommodate the new record. So, if your data only has a clustered index on the PK, which is incremented on every insert, it would probably be optimal to have a fill factor of around 90%. This cuts down on both the amount of free space and the number of pages that get created.
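For illustration, the fill factor is specified when an index is built or rebuilt; a sketch with made-up table and index names:

-- leaves 10% free space in each leaf page at build time
CREATE CLUSTERED INDEX IX_Customers_LastName
ON dbo.Customers (LastName)
WITH FILLFACTOR = 90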
If you have a database size specified, and a fill factor of only 50%, you will create a lot of free space in your data files that will never get filled.
That may or may not be a reason for the performance degradation, but it's something to keep in mind.
February 1, 2007 at 12:16 pm
I have tried fillfactors of 0 (default), 50% and 25% and I have not seen any change in performance, page splits or fragmentation. I've also tried removing all the indexes (except PRIMARY KEYs) and again I am seeing no difference in performance.
None of this makes any sense to me.
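(Page splits and fragmentation can be checked along these lines - a sketch with a placeholder table name:)

-- fragmentation for one table
DBCC SHOWCONTIG ('dbo.MyTable')
-- cumulative page splits since the instance started (SQL 2005 DMV)
SELECT cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page Splits/sec'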
If our application is sending SQLs to the database using the same connection, will the SQLs be queued up and run serially, or does SQL Server have an option to process the queries in parallel? I'm wondering if I am seeing poor performance because SQL Server is collecting an ever-increasing number of SQL statements to process.
Cheers, Max
February 1, 2007 at 1:03 pm
Just to correct some earlier statements.
I have been told the server is a dual-CPU machine with hyperthreading. Also, when I reported high CPU for the SQL Server process, this was incorrect. I was using perfmon to report the CPU and it was showing me a high value, almost 100% at times, BUT that was for only a single CPU. When I looked at Task Manager I saw that the actual overall CPU usage for SQL Server was much lower.
February 2, 2007 at 4:58 am
Forget the fill factor; you'll not be helping yourself at all. There is a serious misunderstanding in how fill factors can be used or abused. A series of sequential inserts will always show page splits, because the process of adding rows triggers that event each time a new page is allocated. If the PK is sequential and the inserts are sequential (e.g. the table fills like a heap), then the fill factor should be 100% for the table/clustered index; setting anything else will degrade performance and bloat the database. In this scenario the PK (sequential key), clustered or otherwise, is likely to have little effect (as you discovered).

Secondary index population may cause degradation, but it depends upon the actual index. For a single-column integer index, each page will contain around 2k entries; again, fragmentation will only occur if the inserts into the index are not sequential. Setting a fill factor on the secondary indexes may help (slightly). You can adjust the fill factor by looking at the fragmentation afterwards. You need to test with and without the secondary indexes - I suspect overall you'll see little difference.
Data loads in SQL Server always work better in batches; in simple recovery, issue CHECKPOINT commands between batches.
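A minimal sketch of the idea (table names, columns and batch size are placeholders):

DECLARE @rows INT
SET @rows = 1
WHILE @rows > 0
BEGIN
    -- move up to 10,000 rows per batch
    INSERT INTO dbo.TargetTable (col1, col2)
    SELECT TOP (10000) s.col1, s.col2
    FROM dbo.StagingTable s
    WHERE NOT EXISTS (SELECT * FROM dbo.TargetTable t WHERE t.col1 = s.col1)

    SET @rows = @@ROWCOUNT
    CHECKPOINT -- in SIMPLE recovery, lets log space be reused between batches
END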
To check your disk subsystem, monitor I/O completion time for data and logs; if this counter rises above 10ms then your disk subsystem is too slow.
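A sketch of one way to see average I/O stalls per file (the DMV is SQL 2005; fn_virtualfilestats plays the same role on SQL 2000):

-- average stall in ms per read and per write, per database file
SELECT DB_NAME(database_id) AS db,
       file_id,
       io_stall_read_ms  / NULLIF(num_of_reads, 0)  AS avg_read_ms,
       io_stall_write_ms / NULLIF(num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL)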
You may have parallelism issues; check by adding MAXDOP hints to your queries or looking for CXPACKET waits. I usually start with a MAXDOP equal to the number of actual CPU sockets, then try a value of 1.
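A sketch of both checks (sys.dm_os_wait_stats is SQL 2005; the query and names are placeholders):

-- limit one query to a single scheduler
SELECT col1 FROM dbo.SomeTable WHERE col2 = 42 OPTION (MAXDOP 1)

-- look for parallelism waits
SELECT wait_type, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'CXPACKET'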
A rise in memory usage on SQL Server is to be expected if you're not using AWE. Without AWE, SQL will use around 1.7GB, or 2.7GB with the /3GB switch. Only worry if the memory used by SQL exceeds the physical memory (plus some for the OS).
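(Enabling AWE, for reference - it's an advanced server option and only matters on 32-bit servers with more RAM than the process can otherwise address:)

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'awe enabled', 1
RECONFIGURE
-- an instance restart is needed for AWE to take effect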
I don't have any Oracle experience, so I can't compare or make any observations on that point.
The GrumpyOldDBA
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
February 2, 2007 at 8:05 am
Thanks, Colin.
Just to re-iterate, even though the test I am using is styled like a data load, it's not really a data load at all. It's just a high volume of data being loaded into the system to test the robustness of the application / database.
Thanks for your information about fill factors and page splits - very useful.
I've run CPU and IO checks already (using SQL Server queries as well as Performance Monitor counters) and there is no stress being shown. I've also checked the average length of the processor and device queues and they're both 0 so no process is waiting on resources.
There don't appear to be any memory issues. I just observed that the memory steadily climbed, but not at an alarming rate. The more alarming thing WAS the increase in CPU, but since I've discovered my method of measuring CPU (perfmon) was flawed, I don't think I need to worry about that.
I'll look into the MAXDOP. We use Kodo to generate queries for us so fine-tuning them for specific databases is a little problematic. I'll give it a try manually though.
Cheers, Max
February 2, 2007 at 11:00 am
Couple more comments/questions.
When you made the PKey nonclustered, did you make one of the alternate indexes clustered? If not, that may actually worsen the situation and cause fragmentation. The issue with the clustered index wasn't so much that it was clustered, but that the data being inserted didn't follow the order of the index columns.
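In DDL terms, that arrangement looks something like this (hypothetical names; the point is that the clustered index moves to a column matching the insert order):

-- PK enforced through a nonclustered index...
ALTER TABLE dbo.Events
ADD CONSTRAINT PK_Events PRIMARY KEY NONCLUSTERED (EventId)
-- ...while the table is physically ordered by the insert-order column
CREATE CLUSTERED INDEX IX_Events_InsertedAt ON dbo.Events (InsertedAt)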
Regarding the inserts themselves are they simple inserts like this:
INSERT INTO Table (Columns)
VALUES (Constants)
Or are they stored proc calls? Or are they INSERT INTO ... SELECT?
If they are more than just simple inserts, can you capture query execution plans early in the process and compare them to an execution plan later, when things have started to degrade?
Many successive inserts can cause index statistics to become out of date, and if your inserts involve SELECTs with joins to other tables, the query optimizer may be selecting an expensive plan.
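If statistics are the issue, refreshing them mid-load is a quick test (a sketch; the table name is a placeholder):

-- recompute statistics from every row rather than a sample
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN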
February 2, 2007 at 11:44 am
I have reverted the index situation back to how it was originally. The things I tried were:
1. PK nonclustered
2. no indexes except for PKs
3. various fillfactor sizes
None of these changes had any effect on performance.
The SQLs themselves are not stored procs. We stay away from anything that is database dependent, e.g. stored procs, triggers, etc. The SQLs are all of the simple variety, e.g. INSERT INTO table VALUES (val1, val2).
You mentioned a very interesting point. In simple terms, our application is saving events to the database. Each event of the type I am loading involves 7 SQLs (2 SELECTs, 1 UPDATE and 4 INSERTs). The UPDATE is not against an application table; it's the sequence counter.

When I compared Oracle to SQL Server performance, SQL Server is on average BETTER than Oracle on 4 of the queries, and Oracle is faster on the other 3. Where SQL Server is faster, it's only marginally faster, but where it's slower, it's a LOT slower - almost 3x slower. The queries that happen to be slower by the largest margin are SELECTs, not the INSERTs. I should quickly add that this latest test was a 15-minute run, which is long enough to observe the performance drop-off but not long enough to see it drop off very significantly. I've just run another test for a lot longer, so now I have a 1GB trace file to work through (the last one was only 50MB).

So what I seem to be seeing is that as the data load progresses, it's actually doing just fine except for the one SELECT SQL that is running much slower on SQL Server. I don't know why that is. It could be locking, but the lock check queries that I have run didn't appear to show anything to worry about.
Cheers, Max
February 2, 2007 at 12:41 pm
>>actually doing just fine except for the 1 SELECT SQL that is running much slower on SQL Server
Is this SELECT hitting 1 or more of the tables that are also being inserted to during this process ?
If so, I would suspect the issue I mentioned above, with statistics getting out of date. Or possibly it's an index that is highly selective and usable for an index seek when the data volumes are small, but which the optimizer chooses not to use as the data distribution changes.
Either way, capturing the execution plan of this SELECT at the start, and comparing it to the plan later when it degrades is going to provide useful data on what's causing the slowdown.
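One simple way to capture the plan at both points (SET STATISTICS PROFILE works on both SQL 2000 and 2005; run the problem SELECT in between):

SET STATISTICS PROFILE ON
-- run the problem SELECT here, once early in the load and again later
SET STATISTICS PROFILE OFF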
February 2, 2007 at 1:09 pm
I'll see what I can dig up. Thanks.
February 2, 2007 at 2:16 pm
I think I've nailed the problem (although I haven't fixed it).
In short, we have SELECTs that are taking longer and longer to run. It's like the index isn't being used (even though it should be). Perhaps query hints are in order. I'll see what I can do.
EDIT: In other words, what you said, "Either way, capturing the execution plan of this SELECT at the start, and comparing it to the plan later when it degrades is going to provide useful data on what's causing the slowdown."
February 2, 2007 at 4:29 pm
When I run this query:
I see this:
Full Scans/sec 597397
As I run this, the number increases. I have only one database instance in use, so I know that all the activity is related to my work. Is there a way to determine which table is being accessed via a full table scan? I have two tables that are SELECTed from regularly. One has one row, the other is growing. I can assume it's the one-row table that is being scanned, but I would like to be 100% sure.
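The counter itself can be read from T-SQL along these lines (a sketch; it's an instance-wide counter, which is why I can't tie it to a specific table):

SELECT cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Access Methods%'
  AND counter_name = 'Full Scans/sec'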
Thanks, Max
PS - running this workload through DTA resulted in no recommendations.
February 5, 2007 at 1:00 am
Perhaps query hints are in order. I'll see what I can do.
Recommended practice is to never add query hints unless you know exactly what you're doing and why you're adding them.
Have you checked the execution plans of the queries? Those will show you which tables are being scanned.
It could be that the indexes are too fragmented to be useful. It could be (quite likely) that they are not selective enough. It could be that the indexes don't satisfy the queries.
Could you post the queries that you have (especially the SELECT ones) and the design of the tables in question (including their indexes)? That'll be a great help in fixing this problem. Without them, we're just shooting in the dark.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 5, 2007 at 7:50 am
I have manually checked the execution plans of the queries by running them in Management Studio, and according to the output I get, the SELECT should be using the index. The only reason I was suggesting hints at all is in case the SELECT against table1 was doing a full table scan, since the first time it executes, table1 is empty. I don't know how often SQL Server reconsiders execution plans.
The SELECT is:
SELECT t0.col1,
t0.col2,
t0.col3,
t0.col4,
t0.col5,
t0.col6,
t0.col7,
t0.col8
FROM dbo.table1 t0
WHERE (t0.col8 IN (?))
col8 has a UNIQUE NONCLUSTERED INDEX on it.
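(In DDL terms, the index is essentially the following, using the anonymized names:)

CREATE UNIQUE NONCLUSTERED INDEX IX_table1_col8
ON dbo.table1 (col8)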
The SELECT is working against one of the tables that is also being INSERTed into but I haven't noticed any blocking locks or deadlocks.
The INSERT statements look like this:
INSERT INTO dbo.table1
(col1, col2, col3, col4, col5, col6, col7, col8)
VALUES
(?, ?, ?, ?, ?, ?, ?, ?)
The INSERT above only runs if the record does not already exist. A pre-select is run rather than executing the INSERT and capturing the unique constraint violation.
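The pattern, using the same anonymized names, is logically equivalent to this sketch (the application actually sends the SELECT and INSERT as separate parameterized statements):

IF NOT EXISTS (SELECT * FROM dbo.table1 WHERE col8 = @val8)
    INSERT INTO dbo.table1 (col1, col2, col3, col4, col5, col6, col7, col8)
    VALUES (@val1, @val2, @val3, @val4, @val5, @val6, @val7, @val8)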
INSERT INTO dbo.table2 (col1, col2, col3, col4, col5, col6, col7, col8, col9, col10)
VALUES
(?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
The above INSERT is executed for every event.
INSERT INTO dbo.table3
(col1, col2, col3, col4, col5) VALUES
(?, ?, ?, ?, ?)
The above INSERT is executed for every event.
INSERT INTO dbo.table4
(col1, col2, col3, col4, col5, col6)
VALUES (?, ?, ?, ?, ?, ?)
The above INSERT is executed for about 1/12 of the events.
Even though I have changed the table and column names, the structure remains true. I've mentioned all the indexes that I think are relevant. The SELECT is the only query that is taking increasingly longer to run, and it only accesses the table using one column.
From my analysis the problem is definitely that one SELECT statement. Every other query that is being run takes the same time to execute; only the SELECT changes. I've been running the same test over the weekend, so I am about to gather the timings for the last few events.
Cheers, Max
February 5, 2007 at 9:46 am
I don't know that changing the fill factor will help; I'm also concerned about the full table scans where you expect index seeks. But I wondered: did you rebuild the indexes after changing the fill factor? The fill factor only applies when an index is created or rebuilt; changing it has no effect on normal operation.
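On SQL 2005 the rebuild looks like this (a sketch using the anonymized table name; DBCC DBREINDEX does the same job on SQL 2000):

-- the fill factor takes effect only at build/rebuild time
ALTER INDEX ALL ON dbo.table1 REBUILD WITH (FILLFACTOR = 90)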
February 5, 2007 at 11:09 pm
SELECT t0.col1,
t0.col2,
t0.col3,
t0.col4,
t0.col5,
t0.col6,
t0.col7,
t0.col8
FROM dbo.table1 t0
WHERE (t0.col8 IN (?))
That SELECT will table scan if the number of records returned exceeds about 1% of the table. I can't tell whether the IN is a single value (and if it is, you should be using =) or multiple values. SQL does this because bookmark lookups on a significant portion of the table are very expensive. Since the nonclustered index is only on col8, SQL would have to do the search on it, then do a bookmark lookup to either the clustered index or the heap to retrieve col1 through col7. If a large number of bookmark lookups have to be done, it can be cheaper to simply scan the clustered index/heap to retrieve the records.
Do you need all 8 columns returned? If not, remove the ones you don't and consider adding the other columns as included columns in the index.
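For example, if only a couple of the columns are actually needed, a covering index along these lines would let the SELECT be answered from the index alone (INCLUDE requires SQL 2005; col1 here is just an illustration):

CREATE UNIQUE NONCLUSTERED INDEX IX_table1_col8_covering
ON dbo.table1 (col8)
INCLUDE (col1)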
What data type is col8? What is the data type of the parameter?
Execution plans are discarded when the statistics of the underlying table change. You do have auto-update statistics on?
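(One way to check - 'YourDatabase' is a placeholder:)

SELECT DATABASEPROPERTYEX('YourDatabase', 'IsAutoUpdateStatistics')
-- 1 = on, 0 = off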
If you want to force recompiles on every execution (not something I'd recommend, but worth trying), then add OPTION (RECOMPILE) to the end of your query.
You can try capturing the execution plans from SQL Profiler. I can't recall the event offhand, but there is one that will give the execution plan either when it's first cached or when it's discarded from the cache. That way you'll be able to see exactly which exec plans were actually used by the queries.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability