April 9, 2014 at 10:02 am
Hi All ,
We are runing select statement which read the data , which spawns many threads all into CXPACKET wait with either pagelatchio_EX and _SH , Please suggest to improve the performance by checking the plan.
SELECT
B.I_CV,
B.I_AMBEST_CO_NUM,
B.N_POL,
B.D_DRIVER_AGE,
B.D_ENTITY_ID,
B.STG_CV_Subject_Row_ID,
A.D_TRNS_EFF AS BIZ_DT,
A.I_SUBM,
A.C_LOB,
B.N_SUBJ,
B.C_REL_POLHLD,
B.M_LST,
B.M_FST,
B.M_MID,
B.M_SFX,
B.D_BRTH,
B.N_SSN_MASKED,
B.C_GEND,
B.N_DRV_LIC_NUM_MASKED,
B.C_ST_DRV_LIC_ST,
B.C_DRV_REL_INSRD,
B.M_UNPRSD,
B.F_NM_TYP,
B.Subject_Checksum,
A.D_POL_TRM_BGN,
A.D_POL_EXPIR,
row_number() over (partition by b.Subject_Checksum order by b.STG_CV_Subject_Row_ID desc) as MAX_ROW_ID_CHK_FLAG
FROM DEP_STG.STG_CV_AutoPolicyDetails A
INNER JOIN DEP_STG.STG_CV_SubjectDetails B
ON A.I_CV=B.I_CV AND A.N_POL=B.N_POL
order by a.I_AMBEST_CO_NUM asc, a.N_POL asc, b.D_ENTITY_ID asc, a.H_TRNS desc
________________________________________
This email is intended solely for the recipient. It may contain privileged, proprietary or confidential information or material. If you are not the intended recipient, please delete this email and any attachments and notify the sender of the error.
Cheers,
Got an idea..share it !!
DBA_Vishal
April 9, 2014 at 10:31 am
Assuming your row estimates are vaguely accurate and you're selecting 821 million rows without any predicates (WHERE clause), sorting it twice (once for row_number and once for the order by clause) and joining it to another 392 million row table (requiring a hash join), the only real way to speed it up would be throwing hardware at it.
1) What hardware spec is this running on?
2) Do you actually need all rows in the final results, or could you filter earlier? Where are these 821 million rows going?
3) Please post the Actual Execution plan (as opposed to the Estimated one you've provided)
April 9, 2014 at 10:52 am
Table definitions, index definitions and actual execution plan please.
The root cause it that your query is pulling a hell of a lot of data. Is it all necessary?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 9, 2014 at 12:54 pm
With no WHERE clause and lots of data AND 2 sorts, your only hope for this from a query execution perspective that I can think of is to a) set up your tables to use a MERGE join instead of a HASH join or b) remove one of the sorts. Outside of that you are likely spilling to tempdb for both the HASH join and the sorts so having lots of RAM and faster IO (especially for tempdb) - i.e. better hardware - is only option.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
April 9, 2014 at 10:12 pm
This is warehouse environment, we are pulling the subset and then will move this to warehouse table. I did capture this execution plan using dm_exec_sql_text and dm_exec_query_plan.
we have a SAN box with main three LUNS one for STG, one for DW and one for TEMP DB with RAID 10 configuration. Memory 768 GB. We recently set the maxdop to 8 and added 8 datafiles to temp db , and have 32 cpu.
Let me know if any best practices I need to follow up?
Cheers,
Got an idea..share it !!
DBA_Vishal
April 10, 2014 at 2:58 am
There may be some marginal gains to be had by some optimising indexes etc. to support the sorts and joins (possibly at the expense of an OLTP application's performance), but we can't suggest much as you haven't given any table structure/index definitions/actual execution plan.
Given what you're using this for, I'd be focussing my time on reducing the data set to only changed rows rather than these marginal benefits.
E.g. using a last modified time column or if there isn't one, requesting/creating one or using CDC to identify changes and making sure indexes cover the predicate. That's where the massive performance improvements will come from.
April 10, 2014 at 10:28 am
There are umpteen things that need to be done/set to properly configure SQL Server. Same for your server itself, and the SAN and san connectivity too. There are also a ton of things that you need to monitor to set up a baseline for performance and then to compare current performance to past to see if something is out-of-whack.
These things cannot truly be effectively covered in a few forum posts.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply