August 29, 2011 at 12:34 pm
Hi,
My developer gave me a query as usual and told me that it runs for 3-4 hours sometimes and usually it runs for 1 hour:-D
This query uses a linked server to fetch some remote data(300k rows) and then processes it locally with some local tables. The remote execution looks very slow probably due to poor network between two cities.
His query's from clause is something like:
FROM
[Remote server].[DB].[DBO].[Table1] T1
Left join
[Remote server].[DB].[DBO].[Table2] T2
on T1.id=T2.id
Left join
[Local Server].[DB].[DBO].[T3] T3
on T3.batchno=T2.batchno
where
blah blah....
Does this situation counts under the famous RBRAR(Row by row..)
Is the processing of joins done by processing each row from linked server.
I plan to tweak the above code in a new way. How about importing those tables all at once using linked server(takes a minute) and then on this temp table perform my joins and index this temp table properly. After displaying the data, the temp table can be deleted.
Please share your valuable thoughts here.
TY
Chandan
August 29, 2011 at 12:45 pm
Try breaking that into two queries. Join the first two tables but insert the result into a temp table. Then join the temp table to the other local table and see if that won't improve it.
Another possibility you might consider replication of the two tables so they will always be local.
The probability of survival is inversely proportional to the angle of arrival.
August 29, 2011 at 12:49 pm
It's not RBAR (Row By Agonizing Row), it's just ugly.
Take this piece and run it on the foreign server, dump to a local temp with only the columns you need, and then continue processing:
FROM
[Remote server].[DB].[DBO].[Table1] T1
Left join
[Remote server].[DB].[DBO].[Table2] T2
on T1.id=T2.id
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
August 29, 2011 at 12:52 pm
Agreed.
Create/Change your stored proc to execute a stored proc on the remote server. The remote server stored proc will first truncate a local table, then insert the results of a query into the table.
Then on your local server, you can join against the table that the results were inserted into on the remote server.
August 29, 2011 at 1:10 pm
Hi,
Thanks for your kind attention.I am not a developer so sorry for asking stupid questions.
My query is a follows:
SELECT 11, tmp.AdjustedArrivalDate, sbp.info_Date, datediff(hh, tmp.AdjustedArrivalDate, sbp.info_Date) 'ActualTAT', 32,
case
when dateadd(hh, 32, tmp.AdjustedArrivalDate) > DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, sbp.info_Date, 'True', 32), sbp.info_Date)
then 1
when
dateadd(hh, 32, tmp.AdjustedArrivalDate) < DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, sbp.info_Date, 'True', 32), sbp.info_Date)
then 0
else -1
end as 'MetOrMissed', [dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, sbp.info_Date, 'True', 32) 'OffsetDays'
FROM
[Server_A].[DB1].[DBO].[SmartTuition_OCR_Document] OCR
Left join
[Server_A].[DB1].[DBO].[smartTuition_Info_ScanBatchProcessing] sbp
on OCR.scanBatchNum = sbp.scanBatchNum
Left join
[DB3].[DBO].[MailTray_Batch_Xref] mx on OCR.mailBatch = mx.BatchID
Left join [DB4].[DBO].[T#TMP#HolidayLogicResults#AllSLAs] tmp
on mx.MTID = tmp.UniqueIdentifier
where tmp.ArrivalDate between dateadd(day, datediff(day, 0, getdate()) - 120, 0) and
dateadd(day, datediff(day, 0, getdate()), 0)
and sbp.info_Date is not null and tmp.clientid = 4 and tmp.SLAID = 11 and tmp.SLATypeTable = 'T#DEV#SLACalculation#ElapsedTime' and tmp.HolidayLogicType = 'MailOpening' AND sbp.Processed = 1
Here the DB1 is on a remote server and DB3 and DB4 are on local server.
My idea was to fetch the required columns from DB1 tables and store them in two different temp tables locally. Then perform the join on these two temp tables and two local tables along with creating indexes on temp tables.
From your posts, I could make out that you guys want me to process the join between two remote tables on the remote server itself and fetch that on my local db engine. But in that case also, I am using linked server, so please explain me which way should i go.
Regards
Chandan
PS: Replication is not being considered as of now due to my inability to convince them to go for this:-)
August 29, 2011 at 1:24 pm
Short of using replication, you have no option but to use a linked server - your data exists on a remote location, so you need to fetch it somehow. I mean, you could probably do something weird like, export the data to a file from your remote server, and then read the file on the local one, but it's not worth it.
The reason why we're suggesting you do the work on the remote server is because it will be much faster than having to do a cross-server join. Your result set on the remote server will probably be much smaller, and querying it shouldn't be too problematic.
What you can do is create a temp table on the local server, and then read the values from the temp table in the remote server into your local one.
So, something like this:
On the remote server, create a table to store the results of your remote server query. Then create this procedure:
CREATE PROCEDURE [usp_SomeRemoteProcedureName]
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
TRUNCATE TABLE LocalTable
INSERT INTO LocalTable (info_date, mailBatch)
SELECT
sbp.info_Date,
OCR.mailBatch
FROM [Server_A].[DB1].[DBO].[SmartTuition_OCR_Document] OCR
Left join [Server_A].[DB1].[DBO].[smartTuition_Info_ScanBatchProcessing] sbp on OCR.scanBatchNum = sbp.scanBatchNum
and sbp.info_Date is not null
AND sbp.Processed = 1
END
GO
Then, on your local server, create this procedure:
CREATE PROCEDURE [usp_SomeProcName]
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
EXEC [Server_A].[DB1].[DBO].[usp_SomeRemoteProcedureName]
CREATE TABLE #LocalTable
(
ID INT IDENTITY PRIMARY KEY,
info_date SMALLDATETIME,
mailBatch INT
)
INSERT INTO #LocalTable (info_date, mailBatch)
SELECT
sbp.info_Date,
OCR.mailBatch
FROM [Server_A].[DB1].[DBO].LocalTable
SELECT
11,
tmp.AdjustedArrivalDate,
#LocalTable.info_Date,
datediff(hh, tmp.AdjustedArrivalDate, #LocalTable.info_Date) 'ActualTAT',
32,
(
case
when dateadd(hh, 32, tmp.AdjustedArrivalDate) > DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32), #LocalTable.info_Date)
then 1
when dateadd(hh, 32, tmp.AdjustedArrivalDate) < DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32), #LocalTable.info_Date)
then 0
else
-1
end
) as 'MetOrMissed',
[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32) 'OffsetDays'
FROM #LocalTable
Left join [DB3].[DBO].[MailTray_Batch_Xref] mx on #LocalTable.mailBatch = mx.BatchID
Left join [DB4].[DBO].[T#TMP#HolidayLogicResults#AllSLAs] tmp on mx.MTID = tmp.UniqueIdentifier
wheretmp.ArrivalDate between dateadd(day, datediff(day, 0, getdate()) - 120, 0)
and dateadd(day, datediff(day, 0, getdate()), 0)
and tmp.clientid = 4
and tmp.SLAID = 11
and tmp.SLATypeTable = 'T#DEV#SLACalculation#ElapsedTime'
and tmp.HolidayLogicType = 'MailOpening'
END
GO
Couldn't test this since I don't have any table definitions or anything, but play around with it, you should be able to get it to work.
August 29, 2011 at 1:34 pm
kramaswamy (8/29/2011)
Short of using replication, you have no option but to use a linked server - your data exists on a remote location, so you need to fetch it somehow. I mean, you could probably do something weird like, export the data to a file from your remote server, and then read the file on the local one, but it's not worth it.The reason why we're suggesting you do the work on the remote server is because it will be much faster than having to do a cross-server join. Your result set on the remote server will probably be much smaller, and querying it shouldn't be too problematic.
What you can do is create a temp table on the local server, and then read the values from the temp table in the remote server into your local one.
So, something like this:
On the remote server, create a table to store the results of your remote server query. Then create this procedure:
CREATE PROCEDURE [usp_SomeRemoteProcedureName]
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
TRUNCATE TABLE LocalTable
INSERT INTO LocalTable (info_date, mailBatch)
SELECT
sbp.info_Date,
OCR.mailBatch
FROM [Server_A].[DB1].[DBO].[SmartTuition_OCR_Document] OCR
Left join [Server_A].[DB1].[DBO].[smartTuition_Info_ScanBatchProcessing] sbp on OCR.scanBatchNum = sbp.scanBatchNum
and sbp.info_Date is not null
AND sbp.Processed = 1
END
GO
Then, on your local server, create this procedure:
CREATE PROCEDURE [usp_SomeProcName]
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
EXEC [Server_A].[DB1].[DBO].[usp_SomeRemoteProcedureName]
CREATE TABLE #LocalTable
(
ID INT IDENTITY PRIMARY KEY,
info_date SMALLDATETIME,
mailBatch INT
)
INSERT INTO #LocalTable (info_date, mailBatch)
SELECT
sbp.info_Date,
OCR.mailBatch
FROM [Server_A].[DB1].[DBO].LocalTable
SELECT
11,
tmp.AdjustedArrivalDate,
#LocalTable.info_Date,
datediff(hh, tmp.AdjustedArrivalDate, #LocalTable.info_Date) 'ActualTAT',
32,
(
case
when dateadd(hh, 32, tmp.AdjustedArrivalDate) > DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32), #LocalTable.info_Date)
then 1
when dateadd(hh, 32, tmp.AdjustedArrivalDate) < DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32), #LocalTable.info_Date)
then 0
else
-1
end
) as 'MetOrMissed',
[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32) 'OffsetDays'
FROM #LocalTable
Left join [DB3].[DBO].[MailTray_Batch_Xref] mx on #LocalTable.mailBatch = mx.BatchID
Left join [DB4].[DBO].[T#TMP#HolidayLogicResults#AllSLAs] tmp on mx.MTID = tmp.UniqueIdentifier
wheretmp.ArrivalDate between dateadd(day, datediff(day, 0, getdate()) - 120, 0)
and dateadd(day, datediff(day, 0, getdate()), 0)
and tmp.clientid = 4
and tmp.SLAID = 11
and tmp.SLATypeTable = 'T#DEV#SLACalculation#ElapsedTime'
and tmp.HolidayLogicType = 'MailOpening'
END
GO
Couldn't test this since I don't have any table definitions or anything, but play around with it, you should be able to get it to work.
Thank you for this wonderful explanation. Will post the changes and results tomorrow.
Regards
chandan
August 29, 2011 at 1:46 pm
Actually looking over your queries again, I noticed something.
For your queries against the remote server, you have a LEFT JOIN against smartTuition_Info_ScanBatchProcessing.
However, later on in the query, you are saying: "and sbp.info_Date is not null"
You could just remove that query clause, and change the join to be an INNER JOIN instead, unless I'm misunderstanding your queries. In that case your procedure on the remote server would change to:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [usp_SomeRemoteProcedureName]
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
TRUNCATE TABLE LocalTable
INSERT INTO LocalTable (info_date, mailBatch)
SELECT
sbp.info_Date,
OCR.mailBatch
FROM [Server_A].[DB1].[DBO].[SmartTuition_OCR_Document] OCR
JOIN [Server_A].[DB1].[DBO].[smartTuition_Info_ScanBatchProcessing] sbp on OCR.scanBatchNum = sbp.scanBatchNum AND sbp.Processed = 1
END
GO
August 29, 2011 at 1:49 pm
Obviosuly, everything is SQL seems to be on a case-by-case basis. But, anytime you are using linked servers, you never know what the optimizer will do. I'm not sure if this is the case here, but when joining to linked servers, the optimizer will sometimes bring the entire table locally (tempdb) and then do it's joins. So, if you are dealing with large tables, linked servers can have major performance impacts. A lot of people see performance issues with linked servers and they aren't aware of the impact of doing cross-server joins, so just FYI..
Hopefully, the sugestions provided by others will help you. If not, you might need to replicate or ETL your data to the local server.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply