November 26, 2015 at 3:57 am
Hi frnds,
I have one DFT which is having straight source to destination mapping.In Source I am using a T- SQL Which is using Left outer Join.As Source query output 20 millions + rows hence I an using batch wise data load to avoid buffer issue. The issue is DFT is taking very much time to load single batch.
I have done below changes to improve the performance
1. Used column names instead of "*" in Select list.
2. Used Fast Load option in Oledb destination
3. connection manager's packet size is set to 32767
4. Used batch wise data load
I want a solution where i can improve the DFT performance.
November 26, 2015 at 4:08 am
Have you tested the individual components of the flow to see where the bottleneck is? How long does it take to run the query locally? How long does it take to copy a comparable amount of data across the network between the source and destination servers? And how long does a simple INSERT (with the same amount of data) into the destination table take?
John
November 26, 2015 at 4:14 am
John Mitchell-245523 (11/26/2015)
Have you tested the individual components of the flow to see where the bottleneck is? How long does it take to run the query locally? How long does it take to copy a comparable amount of data across the network between the source and destination servers? And how long does a simple INSERT (with the same amount of data) into the destination table take?John
This is good advice :smooooth:
If the query runs slowly outside of SSIS, it won't magically speed up when included in a package.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
November 26, 2015 at 7:18 am
Hi,
The source query is taking very much time to produce the data.
below is a sample query used in source component.
tbl_documents table is having 700 rows and tbl_documentdetails table is having 1.6 millions+ rows.
I have tried using inner join instead of exists for DocumentCodeList table it is giving me same perfomanace issue.
SELECT
a.*
,b.value1
,b.value2
FROM
tbl_documents AS a
LEFT OUTER JOIN
dbo.tbl_documentdetails b WITH(NOLOCK)
ON
a.invoiceid=b.invoiceid
AND
a.orderId=b.orderid
AND
a.billid=b.billid
WHERE
EXISTS
(
SELECT
c.invoice
FROM
DocumentCodeList c
WHERE
a.invoiceid = c.invoiceid
AND
b.srnumber BETWEEN 1 AND 100
)
November 26, 2015 at 7:25 am
Re1 (11/26/2015)
Hi,The source query is taking very much time to produce the data.
below is a sample query used in source component.
tbl_documents table is having 700 rows and tbl_documentdetails table is having 1.6 millions+ rows.
I have tried using inner join instead of exists for DocumentCodeList table it is giving me same perfomanace issue.
SELECT
a.*
,b.value1
,b.value2
FROM
tbl_documents AS a
LEFT OUTER JOIN
dbo.tbl_documentdetails b WITH(NOLOCK)
ON
a.invoiceid=b.invoiceid
AND
a.orderId=b.orderid
AND
a.billid=b.billid
WHERE
EXISTS
(
SELECT
c.invoice
FROM
DocumentCodeList c
WHERE
a.invoiceid = c.invoiceid
AND
b.srnumber BETWEEN 1 AND 100
)
Have you tried getting an execution plan and optimising the query?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
November 26, 2015 at 7:35 am
Yes I have used execution plan and added missing index also.
The query is taking to much time in Nested Loops(Left outer join).
November 27, 2015 at 1:50 am
The percentages in a plan are not times, they're estimated cost (emphasis *estimated*)
Can you post the plan and the table and index definitions?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 27, 2015 at 7:28 am
does the bit below effectively make the left join an inner join?
b.srnumber BETWEEN 1 AND 100
Also, for anyone else who struggles to read unformatted sql.
SELECT
a.*
, b.value1
, b.value2
FROM
tbl_documents AS a
LEFT OUTER JOIN dbo.tbl_documentdetails AS b WITH
( NOLOCK
) ON a.invoiceid = b.invoiceid
AND a.orderId = b.orderid
AND a.billid = b.billid
WHERE EXISTS
(
SELECT
c.invoice
FROM
DocumentCodeList AS c
WHERE a.invoiceid = c.invoiceid
AND b.srnumber BETWEEN 1 AND 100
);
November 27, 2015 at 4:41 pm
It looks like that Left Join is essentially an inner join... you are filtering the results back down AFTER you are first putting together the bigger recordset. Make this
tbl_documents AS a
INNER JOIN dbo.tbl_documentdetails AS b --WITH(NOLOCK)
ON a.invoiceid = b.invoiceid
AND a.orderId = b.orderid
AND a.billid = b.billid
AND b.srnumber BETWEEN 1 AND 100
Having Clustered indexes on the ID columns are a good bet as well.
----------------------------------------------------
November 27, 2015 at 9:14 pm
Thanks for Ur reply MMartin1
I have tried with inner join also, still the same performance issues I am facing.
I have added clustered index on invoiceid,orderid,billid column on tbl_documentdetails table and added clustered index on srnumber coumn on DocumentCodeList table.
November 30, 2015 at 1:16 pm
Try to get the actual query plan attached here. A couple of people have mentioned that before, plus your table definitions. That will go further in helping us to help you.
----------------------------------------------------
December 2, 2015 at 1:17 am
/* try this - it may not make a difference but it might help you find the bottleneck */
;with cte as
(
select a.*
from tbl_documents AS a
inner join DocumentCodeList AS c
WHERE a.invoiceid = c.invoiceid
)
select cte.*,
, b.value1
, b.value2
from cte
LEFT OUTER JOIN dbo.tbl_documentdetails AS b with (nolock)
on a.invoiceid = b.invoiceid
AND a.orderId = b.orderid
AND a.billid = b.billid
where b.srnumber >= 1 AND b.srnumber <= 100
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply