August 24, 2007 at 9:37 am
tempdb is a good unexplored avenue here, actually. MS recommends one file per CPU for tempdb, preferably on separate I/O systems. Fragmentation could be an issue here too, since most people never touch tempdb sizing/growth.
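For what it's worth, a rough sketch of what that might look like on a 4-CPU box (file names, paths, and sizes below are placeholders, not recommendations for your hardware):

```sql
-- Pre-size the existing tempdb data file so it doesn't autogrow constantly.
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, SIZE = 1024MB, FILEGROWTH = 256MB);

-- Add one additional data file per CPU, ideally on separate spindles.
-- Path and file name are hypothetical.
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2, FILENAME = 'E:\tempdb\tempdev2.ndf',
          SIZE = 1024MB, FILEGROWTH = 256MB);
-- ...repeat for tempdev3, tempdev4 on a 4-CPU box.
```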
Again, this is mostly overall performance affecting stuff.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
August 24, 2007 at 10:16 pm
I don't know if this could be it or not, but have you checked for corruption in the database? There are a couple of checks you can run with DBCC. Also, do you have a test platform (hardware) and/or database that you could test on, such as blowing away the tables, then rebuilding and repopulating them (BCP might be your friend for this part)?
I'm thinking possible corruption because of the sudden nature of the performance hit. I had a DB once that regularly became corrupted until the vendor of the app finally found the problem and fixed it (actually, I think it was a hardware configuration error causing our problem: the people who set up the server had write caching turned on - not good). And the corruption caused odd things to happen, like slow queries.
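The DBCC checks I have in mind are along these lines (the database name is a placeholder; syntax is SQL 2000-era):

```sql
-- Full logical and physical integrity check.
DBCC CHECKDB ('YourDatabase') WITH NO_INFOMSGS;

-- Narrower checks, if CHECKDB takes too long on a big database:
DBCC CHECKALLOC ('YourDatabase');   -- allocation structures
DBCC CHECKCATALOG ('YourDatabase'); -- system catalog consistency
```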
Good luck,
Kenney
August 26, 2007 at 12:18 pm
Kenney - was your problem on a SQL Server 6.5 or earlier installation, or on a box without battery backup for the server, disk subsystem, and RAID controller cache? My understanding is that MS has recommended write-back cache for optimal I/O performance for quite some time, on systems with appropriate battery backup.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
August 26, 2007 at 5:36 pm
It was SQL 2000 in a power-controlled data center. The RAID card had a battery for the cache but, if I recall, it was suspected of being defective (I'm not sure, since I wasn't fully involved). Since then I've heard a couple of times not to use write cache on a DB server (maybe because something like this can happen, even if it is rare).
Regardless, I still think it's a good idea to check for corruption due to the sudden onset of the problem.
Thanks,
Kenney
August 27, 2007 at 10:52 am
We have six production databases on that server and run DBCC CheckDB each Wednesday evening on them, with no errors, so I don't see how the database could be corrupted. Plus, the slow response only happens in the one query in the one database. Maybe a rusty spot on the disk? A little 00 sandpaper might do the trick.
August 27, 2007 at 11:11 am
Dave, I absolutely LOVE your Signature Phrase!!!!
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
August 27, 2007 at 3:00 pm
Dave,
You mention that you believe that your index strategy is sound, but nowhere throughout this thread do I see that you’ve verified that your execution plan is optimal. Have you attempted to use SQL Profiler to capture the execution plan for your query? It is important that you use Profiler to do this and not just rely on a QA execution plan from your test DB as they could be drastically different due to the data distribution. Also, I don’t see where you’ve answered SQLGuru’s questions on index fragmentation. Have you run DBCC SHOWCONTIG? Are you regularly rebuilding indexes on the tables in the query?
August 28, 2007 at 10:43 am
John, all good suggestions. On that execution plan, bad news: there's no test DB. We are poor like churchmice, and only have the production database. The indexes on both tables are rebuilt every weekend when we get data updates from the mother ship. I ran SHOWCONTIG again just now.
DBCC SHOWCONTIG scanning 'INVOICE' table...
Table: 'INVOICE' (754162244); index ID: 1, database ID: 6
TABLE level scan performed.
- Pages Scanned................................: 696658
- Extents Scanned..............................: 87342
- Extent Switches..............................: 87341
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 99.70% [87083:87342]
- Logical Scan Fragmentation ..................: 0.85%
- Extent Scan Fragmentation ...................: 1.42%
- Avg. Bytes Free per Page.....................: 132.3
- Avg. Page Density (full).....................: 98.36%
DBCC SHOWCONTIG scanning 'TRANSAC' table...
Table: 'TRANSAC' (281607612); index ID: 1, database ID: 6
TABLE level scan performed.
- Pages Scanned................................: 5037149
- Extents Scanned..............................: 631518
- Extent Switches..............................: 631517
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 99.70% [629644:631518]
- Logical Scan Fragmentation ..................: 0.15%
- Extent Scan Fragmentation ...................: 0.42%
- Avg. Bytes Free per Page.....................: 297.1
- Avg. Page Density (full).....................: 96.33%
Do these numbers look reasonable to you? Thanks for your help.
August 28, 2007 at 10:49 am
They do look good, but I still think it is extremely important to capture the execution plan using Profiler.
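If setting up a Profiler trace against production is awkward, one alternative is to capture the actual plan directly from Query Analyzer while connected to the production database:

```sql
-- Returns the actual (post-execution) plan as a resultset,
-- including real row counts, alongside the query results.
SET STATISTICS PROFILE ON;
GO
-- ...run the problem query here...
GO
SET STATISTICS PROFILE OFF;
GO
```

Since this runs against the real data, it avoids the data-distribution mismatch you'd get from a plan captured on a test box.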
September 5, 2007 at 10:45 am
I ran just this portion of the original query; it selects data that is then inserted into a table. It runs in 3 min. 24 sec.
SELECT i.MEDICAL_GROUP
, t.POST_PD
, i.PROV
, SUM(t.PAY_AMT) AS GL_ENTRY
FROM SMFIDXDM.dbo.INVOICE AS i INNER JOIN
SMFIDXDM.dbo.TRANSAC AS t ON i.GROUP_ZZ = t.GROUP_ZZ
AND i.PATIENT = t.PATIENT
AND i.INVOICE = t.INVOICE
WHERE
((t.PAY_CODE_NUM IN ('600', '17')) AND (t.POST_PD >= 10701) AND (i.MEDICAL_GROUP IN (27)))
or
((t.PAY_CODE_NUM IN ('600', '17')) AND (t.POST_PD >= 10601) AND (i.MEDICAL_GROUP IN (1)) AND (NOT (i.LOC IN (125, 169, 113, 172, 222, 220, 453, 475))) AND (i.DIV IN (22, 7, 10, 16, 18, 9, 12, 17, 65, 35, 6)))
GROUP BY i.MEDICAL_GROUP, i.PROV, t.POST_PD
HAVING (SUM(t.PAY_AMT) <> 0)
Here is the execution plan. I don't understand why it says Clustered Index Seek on Transac table and Clustered Index Scan on Invoice table when neither table has a clustered index??? The primary keys are nonclustered.
Execution Tree
--------------
Parallelism(Gather Streams)
|--Filter(WHERE:([Expr1002]<>0.00))
     |--Compute Scalar(DEFINE:([Expr1002]=If ([globalagg1004]=0) then NULL else [globalagg1006]))
          |--Hash Match(Aggregate, HASH:([i].[MEDICAL_GROUP], [i].[PROV], [t].[POST_PD]), RESIDUAL:([i].[MEDICAL_GROUP]=[i].[MEDICAL_GROUP] AND [i].[PROV]=[i].[PROV] AND [t].[POST_PD]=[t].[POST_PD]), DEFINE:([globalagg1004]=SUM([partialagg1003]), [globalagg1006]=SUM([partialagg1005])))
               |--Parallelism(Repartition Streams, PARTITION COLUMNS:([i].[MEDICAL_GROUP], [i].[PROV], [t].[POST_PD]))
                    |--Hash Match(Partial Aggregate, HASH:([i].[MEDICAL_GROUP], [i].[PROV], [t].[POST_PD]), RESIDUAL:([i].[MEDICAL_GROUP]=[i].[MEDICAL_GROUP] AND [i].[PROV]=[i].[PROV] AND [t].[POST_PD]=[t].[POST_PD]), DEFINE:([partialagg1003]=COUNT_BIG([t].[PAY_AMT]), [partialagg1005]=SUM([t].[PAY_AMT])))
                         |--Hash Match(Inner Join, HASH:([t].[GROUP_ZZ], [t].[PATIENT], [t].[INVOICE])=([i].[GROUP_ZZ], [i].[PATIENT], [i].[INVOICE]), RESIDUAL:([t].[GROUP_ZZ]=[i].[GROUP_ZZ] AND [t].[PATIENT]=[i].[PATIENT] AND [t].[INVOICE]=[i].[INVOICE] AND (([t].[POST_PD]>=10701 AND [i].[MEDICAL_GROUP]=27) OR ([t].[POST_PD]>=10601 AND [i].[MEDICAL_GROUP]=1 AND [i].[LOC]<>475 AND [i].[LOC]<>453 AND [i].[LOC]<>220 AND [i].[LOC]<>222 AND [i].[LOC]<>172 AND [i].[LOC]<>113 AND [i].[LOC]<>169 AND [i].[LOC]<>125 AND ([i].[DIV]=6 OR [i].[DIV]=35 OR [i].[DIV]=65 OR [i].[DIV]=17 OR [i].[DIV]=12 OR [i].[DIV]=9 OR [i].[DIV]=18 OR [i].[DIV]=16 OR [i].[DIV]=10 OR [i].[DIV]=7 OR [i].[DIV]=22)))))
                              |--Bitmap(HASH:([t].[GROUP_ZZ], [t].[PATIENT], [t].[INVOICE]), DEFINE:([Bitmap1007]))
                              |    |--Parallelism(Repartition Streams, PARTITION COLUMNS:([t].[GROUP_ZZ], [t].[PATIENT], [t].[INVOICE]))
                              |         |--Clustered Index Seek(OBJECT:([SMFIDXDM].[dbo].[TRANSAC].[TRANS_POST_PD] AS [t]), SEEK:([t].[POST_PD] >= 10601), WHERE:([t].[PAY_CODE_NUM]=17 OR [t].[PAY_CODE_NUM]=600) ORDERED FORWARD)
                              |--Parallelism(Repartition Streams, PARTITION COLUMNS:([i].[GROUP_ZZ], [i].[PATIENT], [i].[INVOICE]), WHERE:(PROBE([Bitmap1007])=TRUE))
                                   |--Filter(WHERE:([i].[MEDICAL_GROUP]=27 OR ([i].[MEDICAL_GROUP]=1 AND [i].[LOC]<>475 AND [i].[LOC]<>453 AND [i].[LOC]<>220 AND [i].[LOC]<>222 AND [i].[LOC]<>172 AND [i].[LOC]<>113 AND [i].[LOC]<>169 AND [i].[LOC]<>125 AND ([i].[DIV]=6 OR [i].[DIV]=35 OR [i].[DIV]=65 OR [i].[DIV]=17 OR [i].[DIV]=12 OR [i].[DIV]=9 OR [i].[DIV]=18 OR [i].[DIV]=16 OR [i].[DIV]=10 OR [i].[DIV]=7 OR [i].[DIV]=22))))
                                        |--Clustered Index Scan(OBJECT:([SMFIDXDM].[dbo].[INVOICE].[INV_CRE_PD] AS [i]))
September 5, 2007 at 11:35 am
As t.PAY_CODE_NUM seems to be an int, it may be worth replacing t.PAY_CODE_NUM IN ('600', '17') with t.PAY_CODE_NUM IN (600, 17) in order to stop the implicit conversion. (Might help with covering indexes.)
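Applied to the query above, the WHERE clause would become something like:

```sql
-- Integer literals match the column's datatype, so no implicit
-- conversion is needed on each row.
WHERE (t.PAY_CODE_NUM IN (600, 17) AND t.POST_PD >= 10701
       AND i.MEDICAL_GROUP IN (27))
   OR (t.PAY_CODE_NUM IN (600, 17) AND t.POST_PD >= 10601
       AND i.MEDICAL_GROUP IN (1)
       AND i.LOC NOT IN (125, 169, 113, 172, 222, 220, 453, 475)
       AND i.DIV IN (22, 7, 10, 16, 18, 9, 12, 17, 65, 35, 6))
```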
Parallelism can sometimes cause problems so it may also be worth trying OPTION(MAXDOP 1).
September 5, 2007 at 1:46 pm
1) Yep, you should ALWAYS use the CORRECT DATATYPE in TSQL. It really can make an INCREDIBLE difference in performance!!
2) In addition to MAXDOP 1, if you have 4+ CPU cores (not that hyperthreaded crap), consider simply upping your 'cost threshold for parallelism' to 10 or 20 or so. 5 is WAY too low a default for that setting, IMHO.
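For reference, that's a server-wide setting changed with sp_configure (the value 20 below is just an example starting point, not a tuned recommendation):

```sql
-- 'cost threshold for parallelism' is an advanced option; default is 5.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 20;
RECONFIGURE;
```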
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
September 6, 2007 at 10:00 am
Thanks Ken and Guru. You're right about the datatype. This is legacy code and has been that way for quite a while, and I never noticed the mismatch. The server has 4 CPUs. What is the syntax for MAXDOP in the query? BOL doesn't give an example. Thanks.
September 6, 2007 at 10:45 am
In the BOL index, type "OPTION clause" (without the double quotes).
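In short, the hint goes at the very end of the statement, after all other clauses. For your query it would look like this (WHERE conditions elided for brevity):

```sql
SELECT i.MEDICAL_GROUP, t.POST_PD, i.PROV, SUM(t.PAY_AMT) AS GL_ENTRY
FROM SMFIDXDM.dbo.INVOICE AS i
INNER JOIN SMFIDXDM.dbo.TRANSAC AS t
    ON  i.GROUP_ZZ = t.GROUP_ZZ
    AND i.PATIENT  = t.PATIENT
    AND i.INVOICE  = t.INVOICE
WHERE ...
GROUP BY i.MEDICAL_GROUP, i.PROV, t.POST_PD
HAVING SUM(t.PAY_AMT) <> 0
OPTION (MAXDOP 1)  -- query hint comes last, after GROUP BY/HAVING/ORDER BY
```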
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service