May 8, 2018 at 10:06 am
INSERT INTO #tt
SELECT DISTINCT dr.terr_grp_id,dr.terry_cd,dl.deal_line,dl.revs_no FROM deal_right dr
inner join deal_line dl on dr.deal_id=dl.deal_id and dr.revs_no=dl.revs_no
and dr.deal_line_no=dl.deal_line_no
inner join #deal dt on dt.deal_id=dl.deal_id and dt.revs_no=dl.revs_no
where dl.title_status_cd NOT IN('NS','REJ','W')
The above SQL is running for more than 20 mints.
#deal table will have around 19k records
deal_right will have around 40M records
deal_line is having around 8M records
Please help me in fine tuning the above SQL.
Please let me know if you need more info, so that I can post it.
May 8, 2018 at 12:15 pm
You will need to give more details for the best answer. Look at what Indexes are on each table and provide that info. Table definitions, and some sample data would also help.
With that being said you are joining to #deal but you are not returning any rows from that table, based on your sample code I would say this is not needed.
Here is my stab in the dark at it. This is just making the 'deal_line' a sub query, hoping that the WHERE will drop the number of rows needed to join.INSERT INTO #tt
SELECT dr.terr_grp_id, dr.terry_cd, DL2.deal_line, DL2.revs_no
FROM deal_right AS dr
INNER JOIN (SELECT dl.deal_line, dl.revs_no, dl.deal_id, dl.deal_line_no
FROM deal_line AS dl
WHERE dl.title_status_cd NOT IN ('NS','REJ','W')
GROUP BY dl.deal_line, dl.revs_no, dl.deal_id, dl.deal_line_no) AS DL2
ON dr.deal_id = DL2.deal_id
AND dr.revs_no = DL2.revs_no
AND dr.deal_line_no = DL2.deal_line_no
GROUP BY dr.terr_grp_id, dr.terry_cd, DL2.deal_line, DL2.revs_no
;
-------------------------------------------------------------
we travel not to escape life but for life not to escape us
Don't fear failure, fear regret.
May 9, 2018 at 4:34 am
Best of all, post the actual execution plan so we can understand how the optimizer is resolving your query and how well the statistics match your data.
One immediate point, why a DISTINCT operator? That's an aggregation operation and is frequently costly. Most data, if correctly structured with enforced referential constraints, shouldn't need it. That's the first thing I'd look to eliminate from the query to enhance performance.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 9, 2018 at 6:08 am
Dear Experts,
Index scan is happening for the below SQL. Is there a way to change it to index seek . Also attached the execution plan. avt_deal_restriction is having around 12M records and # lines have .6M records.
The final output is around 7500 after grouping. Please help me.
INSERT #terr_grp(terr_grp_id,deal_id,revs_no)
SELECT
a.terr_grp_id
,A.deal_id,
A.revs_no
FROM dbo.avt_deal_restriction AS a
INNER JOIN #lines dt
ON a.deal_id = dt.deal_id
AND a.revs_no = dt.revs_no
and a.deal_line_no=dt.deal_line_no
WHERE NOT EXISTS (SELECT
1
FROM #terr_grp
WHERE #terr_grp.terr_grp_id = a.terr_grp_id)
group by a.terr_grp_id
,A.deal_id,
A.revs_no
May 9, 2018 at 7:28 am
archana.mandiriappan - Wednesday, May 9, 2018 6:08 AMDear Experts,
Index scan is happening for the below SQL. Is there a way to change it to index seek . Also attached the execution plan. avt_deal_restriction is having around 12M records and # lines have .6M records.
The final output is around 7500 after grouping. Please help me.
INSERT #terr_grp(terr_grp_id,deal_id,revs_no)
SELECT
a.terr_grp_id
,A.deal_id,
A.revs_no
FROM dbo.avt_deal_restriction AS a
INNER JOIN #lines dt
ON a.deal_id = dt.deal_id
AND a.revs_no = dt.revs_no
and a.deal_line_no=dt.deal_line_no
WHERE NOT EXISTS (SELECT
1
FROM #terr_grp
WHERE #terr_grp.terr_grp_id = a.terr_grp_id)
group by a.terr_grp_id
,A.deal_id,
A.revs_no
Wow, looks a lot different than original post, but OK.
1. Is there an index on terr_grp_id on the avt_deal table? Can you add one?
2. Have you tested without joining to the #lines? Is it faster? # rows returned? You may want to add an index to this temp table for the fields you are joining on.
3. Maybe try and eliminate the rows not needed from the avt_deal table before joining to #lines. (maybe the SQL is smart enough to do that, but i would test it)
4. How many rows on the #terr table? any indexes?
-------------------------------------------------------------
we travel not to escape life but for life not to escape us
Don't fear failure, fear regret.
May 9, 2018 at 8:37 am
So you're moving 3.3 million rows filtered down after the JOINs, etc., from 120 million rows. In short, you're moving close to 3% of the data. That's actually not usually selective enough to see a seek operation being useful in most cases. Selectivity to make a seek useful is usually down around .1% or less (although, sometimes, it's different, varies based on the volume, the query, etc., etc.). It's down to the optimizer choosing to use sequential I/O (scan) versus random I/O (seek). When dealing with large data volumes, the scan can work better. You are absolutely in a position of large data volumes.
The way the query is written, there is no filtering on the 120 million row table. You're counting on the aggregation of the other table on what is effectively just a join, as was already stated, to do the elimination. I agree with @below86. Try it using a JOIN.
Also, be sure the statistics are up to date on everything. That #terr_grp table only seems to have a single row.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 9, 2018 at 2:03 pm
As always, clustering key is the most critical thing.
Do you (almost) always specify the deal_id when reading the dbo.avt_deal_restriction table? If so, cluster that table on ( deal_id, revs_no, deal_line_no ) (or put deal_line_no first, depending on how you query the table and how many rows of each type there are).
Cluster the #lines table the same way: ( deal_id, revs_no, deal_line_no ).
That should give you a MERGE join, which is fastest when it's proper to use.
For #terr_grp, for just an exists, a nonclus index on terr_grp_id is fine.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
May 9, 2018 at 7:44 pm
You're also using a GROUP BY as a replacement for DISTINCT. That usually means that the criteria is insufficient for the job.
Also, what you posted is an estimated execution plan, which could be a bit different from the actual execution plan. For performance problems, we need a bit more. Please see the article at the second link in my signature line below under "Helpful Links".
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply