August 12, 2013 at 1:59 am
Hello,
I've inherited a database from a co-worker and have now got to a stage with it where I'm trying to resolve some performance issues. One particular view seems to be causing a bottleneck for a number of processes and I would appreciate some help or advice on how to make it more efficient.
The view in question looks at a multiple tables, but I've identified that the problem lies with one particular section which I reproduce below:
SELECTCASE
WHEN jt.emp_num IS NOT NULL
THEN jt.emp_num
ELSE
(SELECT TOP 1 jtx.emp_num
FROM jobtran jtx(NOLOCK)
WHERE jtx.job = jt.job
ANDjtx.suffix = jt.suffix
AND jtx.oper_num >= jt.oper_num
ANDjtx.emp_num IS NOT NULL
AND jtx.trans_num < jt.trans_num
ORDER BY jtx.oper_num, jtx.trans_num DESC)
ENDAS 'emp_num'
FROMjobtran jt (NOLOCK)
WHEREjt.trans_class = 'J'
ANDISNULL(jt.posted,0) = 1
As you can see, the aim of this is to use the emp_num for a record should it exist otherwise try to find the emp_num from a later oper_num for the same job. The sub query is apparently where the problem is at. The execution plan says that the Top N Sort operation is costing 96% of the batch, so I guess this is the problem but how do I make it more efficient?
The dbo.jobtran table is large (2.5 million rows) and this is probably the root of the issue, but I hope there is a more efficient way to identify the appropriate emp_num as this currently takes 53 seconds to complete.
I've attached the execution plan for this statement and the dbo.jobtran definition in the hope that someone can offer some advice on speeding this puppy up.
If you require more detail then please let me know.
Thanks,
Mike
August 12, 2013 at 4:58 am
The main query filters on trans_class and posted, the subquery doesn't, which would make using ROW_NUMBER quite tricky.
I'd go for a new (covering) index to support the subquery, something like
job, suffix, oper_num, trans_num DESC INCLUDE emp_num.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 12, 2013 at 5:16 am
Adding the suggested index does half the time.
Presumably an index was suggested as there is no other obvious way to achieve the same result with greater efficiency?
August 12, 2013 at 5:32 am
Can you post the actual plan please Mike? There may well be scope for improving the index.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 12, 2013 at 6:08 am
Not sure which actual plan you wanted (pre or post) so I've attached both.
August 12, 2013 at 6:28 am
What other columns do you need from this table?
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 12, 2013 at 6:36 am
The main query (from which this was the identified 'slow' bit) uses; emp_num, trans_type, a_hrs, qty_moved, wc and CreateDate in the returned fields; job, suffix, wc, oper_num and trans_num in the JOINs and trans_class, posted in the WHERE clause.
If it would be helpful I can return the whole statement, I was trying to focus on the 'slow' bit in my initial post.
August 12, 2013 at 6:46 am
mike.dinnis (8/12/2013)
The main query (from which this was the identified 'slow' bit) uses; emp_num, trans_type, a_hrs, qty_moved, wc and CreateDate in the returned fields; job, suffix, wc, oper_num and trans_num in the JOINs and trans_class, posted in the WHERE clause.If it would be helpful I can return the whole statement, I was trying to focus on the 'slow' bit in my initial post.
It's worth a look.
It may well be worth investigating ROW_NUMBER because a large proportion of the output rows have a value from the subquery and you're having to scan the whole table (clustered index) in any case. If there’s a suitable index to get around the sort usually required by ROW_NUMBER then you could be on to a winner. It would be a whole lot easier for us if you could mock up some sample data. A single set based on job & suffix would do.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 12, 2013 at 6:55 am
Here's the full statement:
SELECTjt.trans_num,
CASE
WHEN jt.emp_num IS NOT NULL
THEN jt.emp_num
ELSE
(SELECT TOP 1 jtx.emp_num
FROM jobtran jtx(NOLOCK)
WHERE jtx.job = jt.job
ANDjtx.suffix = jt.suffix
AND jtx.oper_num >= jt.oper_num
ANDjtx.emp_num IS NOT NULL
AND jtx.trans_num < jt.trans_num
ORDER BY jtx.oper_num, jtx.trans_num DESC)
ENDAS 'emp_num',
CASE
WHENjt.trans_type = 'M' AND js3.sched_drv = 'L'-- M = Move
THENjt.a_hrs
ELSE0
ENDAS 'rework_hrs',
CASE
WHEN jt.trans_type IN ('R','C') AND js3.sched_drv = 'L'-- R = Run, C = Machine
THEN jt.a_hrs
ELSE 0
ENDAS 'run_hrs',
CASE
WHEN jt.trans_type = 'S' AND js3.sched_drv = 'L'-- S = Setup
THEN jt.a_hrs
ELSE 0
ENDAS 'setup_hrs',
CASE -- All std time captured on Move*, as this is where quantity is recorded
-- where Move trans types don't exist, qty is recorded on other trans types (e.g C)
WHEN ISNULL(jt.qty_moved,0) <> 0
AND jt2.trans_num IS NULL
THEN CASE
WHENi.cost_type = 'S'-- Standard Costed
THEN CASE
WHEN js.sched_drv = 'L' THEN js.setup_hrs + (js.run_mch_hrs * jt.qty_moved) + (js.run_lbr_hrs * jt.qty_moved)
ELSE 0
END
ELSECASE-- Actual Costed
WHEN js2.sched_drv = 'L' THEN js2.setup_hrs + (js2.run_mch_hrs * jt.qty_moved) + (js2.run_lbr_hrs * jt.qty_moved)
ELSE 0
END
END
WHEN ISNULL(jt.qty_moved,0) <> 0
THEN CASE
WHENi.cost_type = 'S'-- Standard Costed
THEN CASE
WHEN js.sched_drv = 'L' THEN(js.run_mch_hrs * jt.qty_moved) + (js.run_lbr_hrs * jt.qty_moved)
ELSE 0
END
ELSECASE-- Actual Costed
WHEN js2.sched_drv = 'L' THEN (js2.run_mch_hrs * jt.qty_moved) + (js2.run_lbr_hrs * jt.qty_moved)-- Actual Costed
ELSE 0
END
END
ELSE 0
END AS 'std_hours',
trans_date,
jt.wc,
i.Uf_renItemProjectAS 'project_no',
ISNULL(wc.Uf_renWCSalvage,0)AS 'salvage_wc',
jt.CreateDate
FROMjobtran jt (NOLOCK)
JOINjob j (NOLOCK)
ONj.job = jt.job
ANDj.suffix = jt.suffix
JOINitem i (NOLOCK)
ONi.item = j.item
JOINwc (NOLOCK)
ONwc.wc = jt.wc
LEFT JOINjrt_sch js (NOLOCK)
ONjs.job = i.job
ANDjs.suffix = 1
ANDjs.oper_num = jt.oper_num
LEFT JOINjrt_sch js2 (NOLOCK)
ONjs2.job = i.job
ANDjs2.suffix = 0
ANDjs2.oper_num = jt.oper_num
LEFT JOINjrt_sch js3 (NOLOCK)
ONjs3.job = jt.job
ANDjs3.suffix = jt.suffix
ANDjs3.oper_num = jt.oper_num
LEFT JOIN(SELECT MIN(trans_num) AS 'trans_num', -- Set up time only backflushed for first booking,
job,-- so check that no previous booking exists for op.
suffix,
oper_num
FROM jobtran (NOLOCK)
GROUP BY job, suffix, oper_num) jt2
ON jt2.job = jt.job
AND jt2.suffix = jt.suffix
AND jt2.oper_num = jt.oper_num
AND jt2.trans_num < jt.trans_num
WHEREjt.trans_class = 'J'
ANDISNULL(jt.posted,0) = 1-- Only show posted transactions
GO
The more I look at this code the more irregularities I spot. Oh, the joys of inheriting code!
I'll see what I can do about generating some sample data for the table.
August 12, 2013 at 7:08 am
I'd focus on the one table for now, which gives you this query:
SELECT
jt.trans_num,
emp_num = CASE
WHEN jt.emp_num IS NOT NULL
THEN jt.emp_num
ELSE
(SELECT TOP 1
jtx.emp_num
FROM jobtran jtx(NOLOCK)
WHERE jtx.job = jt.job
ANDjtx.suffix = jt.suffix
AND jtx.oper_num >= jt.oper_num
ANDjtx.emp_num IS NOT NULL
AND jtx.trans_num < jt.trans_num
ORDER BY jtx.oper_num, jtx.trans_num DESC)
END
FROM jobtran jt (NOLOCK)
LEFT JOIN (
SELECT
trans_num = MIN(trans_num), -- Set up time only backflushed for first booking,
job,-- so check that no previous booking exists for op.
suffix,
oper_num
FROM jobtran (NOLOCK)
GROUP BY job, suffix, oper_num
) jt2
ON jt2.job = jt.job
AND jt2.suffix = jt.suffix
AND jt2.oper_num = jt.oper_num
AND jt2.trans_num < jt.trans_num
WHERE jt.trans_class = 'J'
ANDISNULL(jt.posted,0) = 1-- Only show posted transactions
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 12, 2013 at 7:15 am
Here's some doctored INSERTS to populate the jobtran table. It covers two jobs which I think is sufficient for analysing the problem. Is this suitable?
August 12, 2013 at 7:17 am
Don't know where my last post went so I apologise if this is posted twice.
I've attached a set of INSERTS for the jobtran table representing some doctored data. I hope this is sufficient for analysing the performance. Let me know if you need more.
August 12, 2013 at 7:17 am
This, I think, would make a reasonable test query:
SELECT
jt.job,jt.suffix, jt.oper_num, jt.trans_num, jt.emp_num,
x.oper_num, x.trans_num, x.emp_num,
jt2.trans_num
FROM jobtran jt (NOLOCK)
OUTER APPLY (
SELECT TOP 1
jtx.emp_num, jtx.oper_num, jtx.trans_num
FROM jobtran jtx(NOLOCK)
WHERE jtx.job = jt.job
ANDjtx.suffix = jt.suffix
AND jtx.oper_num >= jt.oper_num
ANDjtx.emp_num IS NOT NULL
AND jtx.trans_num < jt.trans_num
ORDER BY jtx.oper_num, jtx.trans_num DESC
) x
OUTER APPLY ( -- Set up time only backflushed for first booking,
SELECT -- so check that no previous booking exists for op.
trans_num = MIN(trans_num)
FROM jobtran j (NOLOCK)
WHERE j.job = jt.job
AND j.suffix = jt.suffix
AND j.oper_num = jt.oper_num
) jt2
WHERE jt.trans_class = 'J'
ANDISNULL(jt.posted,0) = 1 -- Only show posted transactions
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 12, 2013 at 7:37 am
ChrisM@Work (8/12/2013)
I'd focus on the one table for now, which gives you this query:
SELECT
jt.trans_num,
emp_num = CASE
WHEN jt.emp_num IS NOT NULL
THEN jt.emp_num
ELSE
(SELECT TOP 1
jtx.emp_num
FROM jobtran jtx(NOLOCK)
WHERE jtx.job = jt.job
ANDjtx.suffix = jt.suffix
AND jtx.oper_num >= jt.oper_num
ANDjtx.emp_num IS NOT NULL
AND jtx.trans_num < jt.trans_num
ORDER BY jtx.oper_num, jtx.trans_num DESC)
END
FROM jobtran jt (NOLOCK)
LEFT JOIN (
SELECT
trans_num = MIN(trans_num), -- Set up time only backflushed for first booking,
job,-- so check that no previous booking exists for op.
suffix,
oper_num
FROM jobtran (NOLOCK)
GROUP BY job, suffix, oper_num
) jt2
ON jt2.job = jt.job
AND jt2.suffix = jt.suffix
AND jt2.oper_num = jt.oper_num
AND jt2.trans_num < jt.trans_num
WHERE jt.trans_class = 'J'
ANDISNULL(jt.posted,0) = 1-- Only show posted transactions
Why did you include the LEFT JOIN in this statement? It doesn't appear to be used anywhere.
Other than that the code is the same I posted in my original post.
Just curious as to how to diagnose these things for myself in the future.
August 12, 2013 at 7:40 am
It is referenced:
CASE -- All std time captured on Move*, as this is where quantity is recorded
-- where Move trans types don't exist, qty is recorded on other trans types (e.g C)
WHEN ISNULL(jt.qty_moved,0) <> 0
AND jt2.trans_num IS NULL
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 15 posts - 1 through 15 (of 20 total)
You must be logged in to reply to this topic. Login to reply