Is it RBRAR???

  • Hi,

    My developer gave me a query as usual and told me that it runs for 3-4 hours sometimes and usually it runs for 1 hour:-D

    This query uses a linked server to fetch some remote data(300k rows) and then processes it locally with some local tables. The remote execution looks very slow probably due to poor network between two cities.

    His query's from clause is something like:

    FROM

    [Remote server].[DB].[DBO].[Table1] T1

    Left join

    [Remote server].[DB].[DBO].[Table2] T2

    on T1.id=T2.id

    Left join

    [Local Server].[DB].[DBO].[T3] T3

    on T3.batchno=T2.batchno

    where

    blah blah....

    Does this situation counts under the famous RBRAR(Row by row..)

    Is the processing of joins done by processing each row from linked server.

    I plan to tweak the above code in a new way. How about importing those tables all at once using linked server(takes a minute) and then on this temp table perform my joins and index this temp table properly. After displaying the data, the temp table can be deleted.

    Please share your valuable thoughts here.

    TY

    Chandan

  • Try breaking that into two queries. Join the first two tables but insert the result into a temp table. Then join the temp table to the other local table and see if that won't improve it.

    Another possibility you might consider replication of the two tables so they will always be local.

    The probability of survival is inversely proportional to the angle of arrival.

  • It's not RBAR (Row By Agonizing Row), it's just ugly.

    Take this piece and run it on the foreign server, dump to a local temp with only the columns you need, and then continue processing:

    FROM

    [Remote server].[DB].[DBO].[Table1] T1

    Left join

    [Remote server].[DB].[DBO].[Table2] T2

    on T1.id=T2.id


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Agreed.

    Create/Change your stored proc to execute a stored proc on the remote server. The remote server stored proc will first truncate a local table, then insert the results of a query into the table.

    Then on your local server, you can join against the table that the results were inserted into on the remote server.

  • Hi,

    Thanks for your kind attention.I am not a developer so sorry for asking stupid questions.

    My query is a follows:

    SELECT 11, tmp.AdjustedArrivalDate, sbp.info_Date, datediff(hh, tmp.AdjustedArrivalDate, sbp.info_Date) 'ActualTAT', 32,

    case

    when dateadd(hh, 32, tmp.AdjustedArrivalDate) > DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, sbp.info_Date, 'True', 32), sbp.info_Date)

    then 1

    when

    dateadd(hh, 32, tmp.AdjustedArrivalDate) < DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, sbp.info_Date, 'True', 32), sbp.info_Date)

    then 0

    else -1

    end as 'MetOrMissed', [dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, sbp.info_Date, 'True', 32) 'OffsetDays'

    FROM

    [Server_A].[DB1].[DBO].[SmartTuition_OCR_Document] OCR

    Left join

    [Server_A].[DB1].[DBO].[smartTuition_Info_ScanBatchProcessing] sbp

    on OCR.scanBatchNum = sbp.scanBatchNum

    Left join

    [DB3].[DBO].[MailTray_Batch_Xref] mx on OCR.mailBatch = mx.BatchID

    Left join [DB4].[DBO].[T#TMP#HolidayLogicResults#AllSLAs] tmp

    on mx.MTID = tmp.UniqueIdentifier

    where tmp.ArrivalDate between dateadd(day, datediff(day, 0, getdate()) - 120, 0) and

    dateadd(day, datediff(day, 0, getdate()), 0)

    and sbp.info_Date is not null and tmp.clientid = 4 and tmp.SLAID = 11 and tmp.SLATypeTable = 'T#DEV#SLACalculation#ElapsedTime' and tmp.HolidayLogicType = 'MailOpening' AND sbp.Processed = 1

    Here the DB1 is on a remote server and DB3 and DB4 are on local server.

    My idea was to fetch the required columns from DB1 tables and store them in two different temp tables locally. Then perform the join on these two temp tables and two local tables along with creating indexes on temp tables.

    From your posts, I could make out that you guys want me to process the join between two remote tables on the remote server itself and fetch that on my local db engine. But in that case also, I am using linked server, so please explain me which way should i go.

    Regards

    Chandan

    PS: Replication is not being considered as of now due to my inability to convince them to go for this:-)

  • Short of using replication, you have no option but to use a linked server - your data exists on a remote location, so you need to fetch it somehow. I mean, you could probably do something weird like, export the data to a file from your remote server, and then read the file on the local one, but it's not worth it.

    The reason why we're suggesting you do the work on the remote server is because it will be much faster than having to do a cross-server join. Your result set on the remote server will probably be much smaller, and querying it shouldn't be too problematic.

    What you can do is create a temp table on the local server, and then read the values from the temp table in the remote server into your local one.

    So, something like this:

    On the remote server, create a table to store the results of your remote server query. Then create this procedure:

    CREATE PROCEDURE [usp_SomeRemoteProcedureName]

    AS

    BEGIN

    -- SET NOCOUNT ON added to prevent extra result sets from

    -- interfering with SELECT statements.

    SET NOCOUNT ON;

    TRUNCATE TABLE LocalTable

    INSERT INTO LocalTable (info_date, mailBatch)

    SELECT

    sbp.info_Date,

    OCR.mailBatch

    FROM [Server_A].[DB1].[DBO].[SmartTuition_OCR_Document] OCR

    Left join [Server_A].[DB1].[DBO].[smartTuition_Info_ScanBatchProcessing] sbp on OCR.scanBatchNum = sbp.scanBatchNum

    and sbp.info_Date is not null

    AND sbp.Processed = 1

    END

    GO

    Then, on your local server, create this procedure:

    CREATE PROCEDURE [usp_SomeProcName]

    AS

    BEGIN

    -- SET NOCOUNT ON added to prevent extra result sets from

    -- interfering with SELECT statements.

    SET NOCOUNT ON;

    EXEC [Server_A].[DB1].[DBO].[usp_SomeRemoteProcedureName]

    CREATE TABLE #LocalTable

    (

    ID INT IDENTITY PRIMARY KEY,

    info_date SMALLDATETIME,

    mailBatch INT

    )

    INSERT INTO #LocalTable (info_date, mailBatch)

    SELECT

    sbp.info_Date,

    OCR.mailBatch

    FROM [Server_A].[DB1].[DBO].LocalTable

    SELECT

    11,

    tmp.AdjustedArrivalDate,

    #LocalTable.info_Date,

    datediff(hh, tmp.AdjustedArrivalDate, #LocalTable.info_Date) 'ActualTAT',

    32,

    (

    case

    when dateadd(hh, 32, tmp.AdjustedArrivalDate) > DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32), #LocalTable.info_Date)

    then 1

    when dateadd(hh, 32, tmp.AdjustedArrivalDate) < DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32), #LocalTable.info_Date)

    then 0

    else

    -1

    end

    ) as 'MetOrMissed',

    [dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32) 'OffsetDays'

    FROM #LocalTable

    Left join [DB3].[DBO].[MailTray_Batch_Xref] mx on #LocalTable.mailBatch = mx.BatchID

    Left join [DB4].[DBO].[T#TMP#HolidayLogicResults#AllSLAs] tmp on mx.MTID = tmp.UniqueIdentifier

    wheretmp.ArrivalDate between dateadd(day, datediff(day, 0, getdate()) - 120, 0)

    and dateadd(day, datediff(day, 0, getdate()), 0)

    and tmp.clientid = 4

    and tmp.SLAID = 11

    and tmp.SLATypeTable = 'T#DEV#SLACalculation#ElapsedTime'

    and tmp.HolidayLogicType = 'MailOpening'

    END

    GO

    Couldn't test this since I don't have any table definitions or anything, but play around with it, you should be able to get it to work.

  • kramaswamy (8/29/2011)


    Short of using replication, you have no option but to use a linked server - your data exists on a remote location, so you need to fetch it somehow. I mean, you could probably do something weird like, export the data to a file from your remote server, and then read the file on the local one, but it's not worth it.

    The reason why we're suggesting you do the work on the remote server is because it will be much faster than having to do a cross-server join. Your result set on the remote server will probably be much smaller, and querying it shouldn't be too problematic.

    What you can do is create a temp table on the local server, and then read the values from the temp table in the remote server into your local one.

    So, something like this:

    On the remote server, create a table to store the results of your remote server query. Then create this procedure:

    CREATE PROCEDURE [usp_SomeRemoteProcedureName]

    AS

    BEGIN

    -- SET NOCOUNT ON added to prevent extra result sets from

    -- interfering with SELECT statements.

    SET NOCOUNT ON;

    TRUNCATE TABLE LocalTable

    INSERT INTO LocalTable (info_date, mailBatch)

    SELECT

    sbp.info_Date,

    OCR.mailBatch

    FROM [Server_A].[DB1].[DBO].[SmartTuition_OCR_Document] OCR

    Left join [Server_A].[DB1].[DBO].[smartTuition_Info_ScanBatchProcessing] sbp on OCR.scanBatchNum = sbp.scanBatchNum

    and sbp.info_Date is not null

    AND sbp.Processed = 1

    END

    GO

    Then, on your local server, create this procedure:

    CREATE PROCEDURE [usp_SomeProcName]

    AS

    BEGIN

    -- SET NOCOUNT ON added to prevent extra result sets from

    -- interfering with SELECT statements.

    SET NOCOUNT ON;

    EXEC [Server_A].[DB1].[DBO].[usp_SomeRemoteProcedureName]

    CREATE TABLE #LocalTable

    (

    ID INT IDENTITY PRIMARY KEY,

    info_date SMALLDATETIME,

    mailBatch INT

    )

    INSERT INTO #LocalTable (info_date, mailBatch)

    SELECT

    sbp.info_Date,

    OCR.mailBatch

    FROM [Server_A].[DB1].[DBO].LocalTable

    SELECT

    11,

    tmp.AdjustedArrivalDate,

    #LocalTable.info_Date,

    datediff(hh, tmp.AdjustedArrivalDate, #LocalTable.info_Date) 'ActualTAT',

    32,

    (

    case

    when dateadd(hh, 32, tmp.AdjustedArrivalDate) > DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32), #LocalTable.info_Date)

    then 1

    when dateadd(hh, 32, tmp.AdjustedArrivalDate) < DATEADD(DAY, -[dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32), #LocalTable.info_Date)

    then 0

    else

    -1

    end

    ) as 'MetOrMissed',

    [dbo].[CalculateOffsetDaysSlaTat](13, tmp.AdjustedArrivalDate, #LocalTable.info_Date, 'True', 32) 'OffsetDays'

    FROM #LocalTable

    Left join [DB3].[DBO].[MailTray_Batch_Xref] mx on #LocalTable.mailBatch = mx.BatchID

    Left join [DB4].[DBO].[T#TMP#HolidayLogicResults#AllSLAs] tmp on mx.MTID = tmp.UniqueIdentifier

    wheretmp.ArrivalDate between dateadd(day, datediff(day, 0, getdate()) - 120, 0)

    and dateadd(day, datediff(day, 0, getdate()), 0)

    and tmp.clientid = 4

    and tmp.SLAID = 11

    and tmp.SLATypeTable = 'T#DEV#SLACalculation#ElapsedTime'

    and tmp.HolidayLogicType = 'MailOpening'

    END

    GO

    Couldn't test this since I don't have any table definitions or anything, but play around with it, you should be able to get it to work.

    Thank you for this wonderful explanation. Will post the changes and results tomorrow.

    Regards

    chandan

  • Actually looking over your queries again, I noticed something.

    For your queries against the remote server, you have a LEFT JOIN against smartTuition_Info_ScanBatchProcessing.

    However, later on in the query, you are saying: "and sbp.info_Date is not null"

    You could just remove that query clause, and change the join to be an INNER JOIN instead, unless I'm misunderstanding your queries. In that case your procedure on the remote server would change to:

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE PROCEDURE [usp_SomeRemoteProcedureName]

    AS

    BEGIN

    -- SET NOCOUNT ON added to prevent extra result sets from

    -- interfering with SELECT statements.

    SET NOCOUNT ON;

    TRUNCATE TABLE LocalTable

    INSERT INTO LocalTable (info_date, mailBatch)

    SELECT

    sbp.info_Date,

    OCR.mailBatch

    FROM [Server_A].[DB1].[DBO].[SmartTuition_OCR_Document] OCR

    JOIN [Server_A].[DB1].[DBO].[smartTuition_Info_ScanBatchProcessing] sbp on OCR.scanBatchNum = sbp.scanBatchNum AND sbp.Processed = 1

    END

    GO

  • Obviosuly, everything is SQL seems to be on a case-by-case basis. But, anytime you are using linked servers, you never know what the optimizer will do. I'm not sure if this is the case here, but when joining to linked servers, the optimizer will sometimes bring the entire table locally (tempdb) and then do it's joins. So, if you are dealing with large tables, linked servers can have major performance impacts. A lot of people see performance issues with linked servers and they aren't aware of the impact of doing cross-server joins, so just FYI..

    Hopefully, the sugestions provided by others will help you. If not, you might need to replicate or ETL your data to the local server.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply