March 8, 2012 at 10:59 am
I have a query that ran in 3 seconds on a 2008 server, then we moved the database and everything to a different 2008R2 server (backup, restore) and now the query takes 2 hours and 15 minutes to run. I'm trying to figure out what could cause the change in performance. The servers have similar configurations (4 processors, 32GB RAM, etc.). The only difference is that the original server was physical and the new server is VM. The new VM server has been running quite a while and has no other issues. Any suggestions on what to look for or what could be happening are very welcome.
Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/
March 8, 2012 at 11:23 am
Please add more info about the particular query.
March 8, 2012 at 11:45 am
SQLJocky (3/8/2012)
I have a query that ran in 3 seconds on a 2008 server, then we moved the database and everything to a different 2008R2 server (backup, restore) and now the query takes 2 hours and 15 minutes to run. I'm trying to figure out what could cause the change in performance. The servers have similar configurations (4 processors, 32GB RAM, etc.). The only difference is that the original server was physical and the new server is VM. The new VM server has been running quite a while and has no other issues. Any suggestions on what to look for or what could be happening are very welcome.
I'm assuming you've done the standard, check the execution plans, update statistics, etc?
If so, and you're relatively sure your problem is VM particular and not that the database just needs some upkeep, make sure your data is being stored in RAW data partitions. I've seen that seriously hose up a VM system. Otherwise, you need to get in there with perfmon and find your bottleneck.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
March 8, 2012 at 1:17 pm
Aneesh Soman (3/8/2012)
Please add more info about the particular query.
SELECT mbm1.BMWHS
,mbm1.BPROD AS BPROD_L1
,iimx.IPROD AS BCHLD
,iimx.IDESC
,iimx.IITYP
,iimx.IUMS
,(cmf.Ingredients+cmf.Packaging) AS COST
,((ISNULL(mbm1.BQREQ,1))/ISNULL(cic1.ICBTCH,1))
*((ISNULL(mbm2.BQREQ,1))/ISNULL(cic2.ICBTCH,1))
*((ISNULL(mbm3.BQREQ,1))/ISNULL(cic3.ICBTCH,1))
*((ISNULL(mbm4.BQREQ,1))/ISNULL(cic4.ICBTCH,1))
*((ISNULL(mbm5.BQREQ,1))/ISNULL(cic5.ICBTCH,1))
*((ISNULL(mbm6.BQREQ,1))/ISNULL(cic6.ICBTCH,1))
AS QTYREQ_NOSCRAP
,((ISNULL(mbm1.BQREQ,1)*ISNULL(mbm1.BMSCP,1))/ISNULL(cic1.ICBTCH,1))
*((ISNULL(mbm2.BQREQ,1)*ISNULL(mbm2.BMSCP,1))/ISNULL(cic2.ICBTCH,1))
*((ISNULL(mbm3.BQREQ,1)*ISNULL(mbm3.BMSCP,1))/ISNULL(cic3.ICBTCH,1))
*((ISNULL(mbm4.BQREQ,1)*ISNULL(mbm4.BMSCP,1))/ISNULL(cic4.ICBTCH,1))
*((ISNULL(mbm5.BQREQ,1)*ISNULL(mbm5.BMSCP,1))/ISNULL(cic5.ICBTCH,1))
*((ISNULL(mbm6.BQREQ,1)*ISNULL(mbm6.BMSCP,1))/ISNULL(cic6.ICBTCH,1))
AS QTYREQ
,((ISNULL(mbm1.BQREQ,1)*ISNULL(mbm1.BMSCP,1))/ISNULL(cic1.ICBTCH,1))
*((ISNULL(mbm2.BQREQ,1)*ISNULL(mbm2.BMSCP,1))/ISNULL(cic2.ICBTCH,1))
*((ISNULL(mbm3.BQREQ,1)*ISNULL(mbm3.BMSCP,1))/ISNULL(cic3.ICBTCH,1))
*((ISNULL(mbm4.BQREQ,1)*ISNULL(mbm4.BMSCP,1))/ISNULL(cic4.ICBTCH,1))
*((ISNULL(mbm5.BQREQ,1)*ISNULL(mbm5.BMSCP,1))/ISNULL(cic5.ICBTCH,1))
*((ISNULL(mbm6.BQREQ,1)*ISNULL(mbm6.BMSCP,1))/ISNULL(cic6.ICBTCH,1))
*(cmf.Ingredients+cmf.Packaging) AS EXTCOST
FROM (SELECT * FROM stg.Table1 mbm WHERE mbm.BMBOMM=' ' AND mbm.BDDIS =99999999 AND mbm.BPROD = '14324' AND mbm.BMWHS ='5AF') mbm1
LEFT JOIN (SELECT * FROM stg.Table1 mbm WHERE mbm.BMBOMM=' ' AND mbm.BDDIS =99999999) mbm2
ON mbm1.BCHLD = mbm2.BPROD AND mbm1.BMWHS = mbm2.BMWHS
LEFT JOIN (SELECT * FROM stg.Table1 mbm WHERE mbm.BMBOMM=' ' AND mbm.BDDIS =99999999) mbm3
ON mbm2.BCHLD = mbm3.BPROD AND mbm2.BMWHS = mbm3.BMWHS
LEFT JOIN (SELECT * FROM stg.Table1 mbm WHERE mbm.BMBOMM=' ' AND mbm.BDDIS =99999999) mbm4
ON mbm3.BCHLD = mbm4.BPROD AND mbm3.BMWHS = mbm4.BMWHS
LEFT JOIN (SELECT * FROM stg.Table1 mbm WHERE mbm.BMBOMM=' ' AND mbm.BDDIS =99999999) mbm5
ON mbm4.BCHLD = mbm5.BPROD AND mbm4.BMWHS = mbm5.BMWHS
LEFT JOIN (SELECT * FROM stg.Table1 mbm WHERE mbm.BMBOMM=' ' AND mbm.BDDIS =99999999) mbm6
ON mbm5.BCHLD = mbm6.BPROD AND mbm5.BMWHS = mbm6.BMWHS
LEFT JOIN stg.Table2 cic1
ON mbm1.BMWHS=cic1.ICFAC AND mbm1.BPROD=cic1.ICPROD
LEFT JOIN stg.Table2 cic2
ON mbm2.BMWHS=cic2.ICFAC AND mbm2.BPROD=cic2.ICPROD
LEFT JOIN stg.Table2 cic3
ON mbm3.BMWHS=cic3.ICFAC AND mbm3.BPROD=cic3.ICPROD
LEFT JOIN stg.Table2 cic4
ON mbm4.BMWHS=cic4.ICFAC AND mbm4.BPROD=cic4.ICPROD
LEFT JOIN stg.Table2 cic5
ON mbm5.BMWHS=cic5.ICFAC AND mbm5.BPROD=cic5.ICPROD
LEFT JOIN stg.Table2 cic6
ON mbm6.BMWHS=cic6.ICFAC AND mbm6.BPROD=cic6.ICPROD
JOIN stg.Table3 iimx
ON COALESCE(mbm6.BCHLD,mbm5.BCHLD,mbm4.BCHLD,mbm3.BCHLD,mbm2.BCHLD,mbm1.BCHLD) = iimx.IPROD
JOIN (SELECT cmf.CFPROD, iim.IDESC, iim.IITYP, iim.ICLAS, iic.ICDES, iim.IUMS, iim.IWGHT, cmf.CFFAC, ZMF.MFDESC, cmf.CFCSET
,SUM(CASE WHEN CFCBKT = 1 THEN CFTLVL + CFPLVL ELSE 0 END) AS Ingredients
,SUM(CASE WHEN CFCBKT = 2 THEN CFTLVL + CFPLVL ELSE 0 END) AS Packaging
,SUM(CASE WHEN CFCBKT = 3 THEN CFTLVL + CFPLVL ELSE 0 END) AS Resale
,SUM(CASE WHEN CFCBKT = 4 THEN CFTLVL + CFPLVL ELSE 0 END) AS OutsideProc
,SUM(CASE WHEN CFCBKT = 10 THEN CFTLVL + CFPLVL ELSE 0 END) AS DirectLabor
,SUM(CASE WHEN CFCBKT = 20 THEN CFTLVL + CFPLVL ELSE 0 END) AS IndirectLabor
,SUM(CASE WHEN CFCBKT = 21 THEN CFTLVL + CFPLVL ELSE 0 END) AS VariableOH
,SUM(CASE WHEN CFCBKT = 30 THEN CFTLVL + CFPLVL ELSE 0 END) AS FixedOH
,SUM(CASE WHEN CFCBKT = 31 THEN CFTLVL + CFPLVL ELSE 0 END) AS RMPlantWhseStorage
,SUM(CASE WHEN CFCBKT = 32 THEN CFTLVL + CFPLVL ELSE 0 END) AS InterplantFreight
,SUM(CASE WHEN CFCBKT NOT IN (0, 1, 2, 3, 4, 10, 20, 21, 30, 31, 32)
THEN CFTLVL + CFPLVL ELSE 0 END) AS AllOther
,SUM(CASE WHEN CFCBKT = 0 THEN CFTLVL + CFPLVL ELSE 0 END) AS Total
,SUM(CASE WHEN CFCBKT = 1 THEN CFTLVL ELSE 0END) AS TL_Ingredients
,SUM(CASE WHEN CFCBKT = 2 THEN CFTLVL ELSE 0END) AS TL_Packaging
,SUM(CASE WHEN CFCBKT = 3 THEN CFTLVL ELSE 0END) AS TL_Resale
,SUM(CASE WHEN CFCBKT = 4 THEN CFTLVL ELSE 0END) AS TL_OutsideProc
,SUM(CASE WHEN CFCBKT = 10 THEN CFTLVL ELSE 0 END) AS TL_DirectLabor
,SUM(CASE WHEN CFCBKT = 20 THEN CFTLVL ELSE 0 END) AS TL_IndirectLabor
,SUM(CASE WHEN CFCBKT = 21 THEN CFTLVL ELSE 0 END) AS TL_VariableOH
,SUM(CASE WHEN CFCBKT = 30 THEN CFTLVL ELSE 0 END) AS TL_FixedOH
,SUM(CASE WHEN CFCBKT = 31 THEN CFTLVL ELSE 0 END) AS TL_RMPlantWhseStorage
,SUM(CASE WHEN CFCBKT = 32 THEN CFTLVL ELSE 0 END) AS TL_InterplantFreight
,SUM(CASE WHEN CFCBKT NOT IN (0, 1, 2, 3, 4, 10, 20, 21, 30, 31, 32)
THEN CFTLVL ELSE 0END) AS TL_AllOther
,SUM(CASE WHEN CFCBKT = 0 THEN CFTLVL ELSE 0END) AS TL_Total
FROM stg.Table4 AS cmf
JOIN stg.Table3 AS iim ON cmf.CFPROD = iim.IPROD
JOIN stg.Table5 AS iic ON iim.ICLAS = iic.ICLAS
JOIN stg.Table6 zmf ON cmf.CFFAC = ZMF.MFFACL
GROUP BY cmf.CFPROD, cmf.CFCSET, iim.IDESC, iim.IITYP, iim.ICLAS, iic.ICDES
,iim.IUMS, cmf.CFFAC, ZMF.MFDESC, iim.IWGHT) cmf
ON mbm1.BMWHS = cmf.CFFAC
AND COALESCE(mbm6.BCHLD,mbm5.BCHLD,mbm4.BCHLD,mbm3.BCHLD,mbm2.BCHLD,mbm1.BCHLD) = cmf.CFPROD
AND CFCSET = 3
The bottom two joins are the issue I've found. If you comment out either of them at once the query will run in seconds. If they are both in there then it runs for hours. I've updated stats and the indexes are fairly good.
Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/
March 8, 2012 at 1:29 pm
Can you please check in another SQL window with Actual Execution Plan on, the time taken and plan for running a normal select * on the bottom 2 tables, join on each bottom table with the top table. Also find out whether the I/O or the CPU is the culprit for these from the Profiler. If the storage environment has got changed while migrating it may cause some issue. Also you can try RECOMPILE Hint in case it is using some invalid plan. I am not sure whether the backup and Restore can cause the Inner Hash Join to be used by optimizer, may be cleaning the cache may help a bit.
Your query looks complex though.:)
March 9, 2012 at 4:16 am
Are you able to post Execution Plans?
March 11, 2012 at 2:37 pm
Here is the current execution plan
Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/
March 11, 2012 at 8:41 pm
I've seen it happen where indexes / exec plans got "stale" after backup/restore; SQL totally ignoring existance of indexes, even though they're there and "should" be used.
I'd start with updating statistics, if this doesn't help ( honestly don't remember if it was enough in my case) - rebuild indexes will definitely do the trick.
March 12, 2012 at 6:15 am
rafael lenartowicz (3/11/2012)
I've seen it happen where indexes / exec plans got "stale" after backup/restore; SQL totally ignoring existance of indexes, even though they're there and "should" be used.I'd start with updating statistics, if this doesn't help ( honestly don't remember if it was enough in my case) - rebuild indexes will definitely do the trick.
I did update stats and rebuild the indexes...didn't help. Also I tried running it again last night and it filled up the tempdb.
Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/
March 12, 2012 at 8:23 am
First glance at the execution plan made me jumped on my chair. The estimated rows to be returned are 1,527,070,000,000 and the estimated rows size is 195,147,877 MB :w00t: Is it what you really want????? (You mentioned that statistics are uptodate) Are you sure the data is the same? What is the execution plan on the other server?
March 12, 2012 at 10:01 am
SQLJocky (3/8/2012)
I have a query that ran in 3 seconds on a 2008 server, then we moved the database and everything to a different 2008R2 server (backup, restore) and now the query takes 2 hours and 15 minutes to run. I'm trying to figure out what could cause the change in performance. The servers have similar configurations (4 processors, 32GB RAM, etc.). The only difference is that the original server was physical and the new server is VM. The new VM server has been running quite a while and has no other issues. Any suggestions on what to look for or what could be happening are very welcome.
1) You moved from SQL 2008 to SQL 2008R2. There are query optimizer differences that could be responsible for this regression. Is your 2008R2 server patched up to latest SP/CU?
2) That is a VERY complex query. It is not at all surprising that the optimizer didn't get things right. Best would be to refactor it to have one or more intermediate sets to keep the optimizer from "going out to lunch".
3) I see you updated stats with full scan and it didn't help. Bummer.
4) did you do a wait stats and IO stall analysis during this execution? VMs can be HORRIBLE performers when improperly provisioned/setup - which is the vast majority of the time at clients I visit. :w00t:
I think the complexity of this query goes well beyond what you can reasonably expect to address on a forum staffed by people who volunteer their time. Get a performance tuning professional on board to help you is my recommendation.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
March 12, 2012 at 3:31 pm
SQLJocky,
I know exactly what is wrong with your query. We had the similar issue before.
If you look at the plan you'll see a nested loops without a JOIN predicate. In this particular case it doesn't mean that some JOINs are missing. What really happens is the optimizer decides to do CROSS JOIN instead of JOIN with conditions and apply a filter only a few operations later. That's why performance is so bad and you see these billions of estimated rows.
Why is it doing so? If we take a closer look at Filter operator in the plan we'll see this horrible predicate:
CONVERT_IMPLICIT(nvarchar(35),
CASE WHEN [Expr1040] IS NOT NULL THEN [Expr1040] ELSE
CASE WHEN [Expr1032] IS NOT NULL THEN [Expr1032] ELSE
CASE WHEN [Expr1024] IS NOT NULL THEN [Expr1024] ELSE
CASE WHEN [Expr1016] IS NOT NULL THEN [Expr1016] ELSE
CASE WHEN [Expr1008] IS NOT NULL THEN [Expr1008] ELSE
[BSNA_STG].[stg].[V83BPCSF_MBM].[BCHLD] as [mbm].[BCHLD]
END
END
END
END
END,0)
=
[BSNA_STG].[stg].[V83BPCSF_CMF].[CFPROD] as [cmf].[CFPROD]
AND
CASE WHEN [Expr1040] IS NOT NULL THEN [Expr1040] ELSE
CASE WHEN [Expr1032] IS NOT NULL THEN [Expr1032] ELSE
CASE WHEN [Expr1024] IS NOT NULL THEN [Expr1024] ELSE
CASE WHEN [Expr1016] IS NOT NULL THEN [Expr1016] ELSE
CASE WHEN [Expr1008] IS NOT NULL THEN [Expr1008] ELSE
[BSNA_STG].[stg].[V83BPCSF_MBM].[BCHLD] as [mbm].[BCHLD]
END
END
END
END
END
=
[BSNA_STG].[stg].[V83BPCSF_IIM].[IPROD] as [iimx].[IPROD]
it was generated based on the following join conditions from the query:
COALESCE(mbm6.BCHLD,mbm5.BCHLD,mbm4.BCHLD,mbm3.BCHLD,mbm2.BCHLD,mbm1.BCHLD) = iimx.IPROD
COALESCE(mbm6.BCHLD,mbm5.BCHLD,mbm4.BCHLD,mbm3.BCHLD,mbm2.BCHLD,mbm1.BCHLD) = cmf.CFPROD
The optimizer simply cannot do a good estimation and proper join based on this stuff. So my suggestion would be to get rid of this join by re-writing the query/adding temp tables etc.
Conclusion: never put functions on the columns which are used in JOIN conditions. Sometimes it even worse than functions on the columns in a WHERE clause.
March 13, 2012 at 6:35 pm
I figured out that it was actually being caused by indexing.
On the old server, there were several of the tables that were heaps with non-clustered indexes. I went through and optimized for the execution plan, not realizing that it was actually detrimental to SQL Optimizer. After adding clustered indexes and optimizing, it was far far worse and almost seems like the optimizer was doing a cartesian join.
I dropped the tables and recreated from the old server with the same indexes and it runs perfectly fine. I would love it if someone could point me to why that happened so I can be educated.
Thansk!
Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/
March 13, 2012 at 7:14 pm
How do you expect someone to explain it to you if we don't know what exactly have you changed and how the execution plan looks now? In general, by changing indexes you are changing the costs and since the optimizer is cost-based therefore the plan is changing. I have pointed out already what is the root cause of your issue but looks like you are happy with your temporary solution. Why temporary? Because it may be just the matter of time when the query will start to perform poorly again.
March 13, 2012 at 7:49 pm
Alexander Suprun (3/13/2012)
How do you expect someone to explain it to you if we don't know what exactly have you changed and how the execution plan looks now? In general, by changing indexes you are changing the costs and since the optimizer is cost-based therefore the plan is changing. I have pointed out already what is the root cause of your issue but looks like you are happy with your temporary solution. Why temporary? Because it may be just the matter of time when the query will start to perform poorly again.
Well my request for explanation is more about why it is sometimes detrimental to add a clustered index to a table. I was in a Microsoft Optimization and Tuning class where we briefly touched on it but didn't go into any detail. Fortunately, it runs fine now that I realized what caused it to break. Unfortunately, it is very difficult to determine how to rewrite this inherited query and still keep the intended logic and result set.
Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply