Duplicate database, same view, significant performance difference

  • Hi,

    I have a troubling problem with the following scenario: a view takes a few seconds to execute on one table but takes minutes to execute against a duplicate table in a duplicate database (with less data).

    We have a database SSISAdmin which is used for custom logging of SSIS events, errors, and statistics.

    There is a copy of the database in the same sql server instance called SSISAdmin_Dev (actually the production one waws copied from the development one).

    Databases:

    SSISAdmin (production database)

    SSISAdmin_Dev (development database)

    In each database we have exactly the same tables, views and indexes.

    There are two tables in each database:

    SSIS_ExecutionLog

    SSIS_ExecutionLog_History

    The first table contains the most recent execution information for a given 'package group' (simply, a package and its children log entries under the same name). Any execution of a package group first flushes the previous entries into the history table before adding its own entries. Each individual package logs an entry in the database when it starts, and when it ends (both records have the same Execution ID). In this way a kind of hierarchy is formed.

    We have a numder of views as follows:

    vw_F_ExecutionLog

    vw_F_ExecutionLogPackageHierarchy

    vw_F_ExecutionLogMerge

    The first view simply combines the records from both tables via a UNION ALL.

    The second view operates on the first view using a recursive CTE to add hierarchy level information to each record.

    The third view operates on the second and attempts to combine the start and end records into single records per execution.

    As I mentioned, the same tables and views are duplicated in each database.

    Executing the views against the dev database (SSISAdmin_Dev) has the following results:

    View#1 takes 2 seconds, returns 100,000 rows

    View#2 takes 6 seconds, returns 100,000 rows

    View#3 takes 8 seconds, returns about 64,000 rows

    Executing the views against the prod database (SSISAdmin) has the following results:

    View#1 takes 0 seconds, returns 40,000 rows (prod hasn't been active for as long as dev).

    View#2 takes 1 second, returns 40,000 rows

    View#3 takes 4 minutes and 22 seconds, returns about 22,000 rows

    That's the problem: There's no reason for it to take more than a few seconds (that I can see).

    One thing we tried was adding a maintenance plan for the two databases, just set to rebuild indexes and update statistics, but after running the plan View 3 took a bit longer on the prod database and about half an hour on the dev database!

    After that we tried this in SSMS:

    SSIS_ExecutionLog_History --> Indexes --> Right-click --> Rebuild All

    This had the effect of returning View 3 execution times to the prior 8 sec for dev and 4 mins for prod.

    Since it involves the same tables, with the same indexes, with the same data (slightly different volumes), with same views... I don't know what it could be.

    I'm attaching the table definitions, index definitions, view definitions, execution plan for view 3 against dev, and execution plan for view 3 against prod, (what else am I missing?) in a zip file.

  • Looking at the two execution plans there are two key lookups on each plan that take approx 36% each.

    In the prod one the number of executions is 1677 and in dev 39344.

    Which for me I would look at comparing index fragmetation between the two databases on the key:

    SSIS_Execution_log..PK_SSIS_ExecutionLog_History.

    Possibly rebuild the index on dev and update the statistics.

    MCITP SQL 2005, MCSA SQL 2012

  • I am missing an important piece of info: are those two databases on the same iron and the same Windows?

  • From what I saw, the execution plan is totally different between the production and development. The dev is doing a CL seek where as the prod is doing a non cl seek with Key look up.

    After the update of statistics and recreating the index did you recreate the view in production? Can you try that please?

    -Roy

  • Thanks for the replies,

    Both databases are on the same instance which is on a single virtual server hosted by a 3rd party, so I can't be 100% sure that both are on the same iron, but it would be unlikely right? Each database has its own mdf, and each table is stored in its primary filegroup. Both files are in the same folder on the D drive.

    I'll try the other suggestions when I get into work in an hour or so.

    Thanks,

    Sam

  • I've just tried the following on production:

    1. Drop all indexes and statistics on SSIS_ExecutionLog and SSIS_ExecutionLog_History tables.

    2. Execute view [vw_F_ExecutionLogMerge]: returned 22000 rows in 5 minutes 50 seconds. Execution plan shows table scans across the board.

    3. Update statistics on SSIS_ExecutionLog and SSIS_ExecutionLog_History tables.

    Command:

    USE [SSISAdmin]

    GO

    UPDATE STATISTICS [dbo].[SSIS_ExecutionLog_History]

    WITH FULLSCAN

    GO

    USE [SSISAdmin]

    GO

    UPDATE STATISTICS [dbo].[SSIS_ExecutionLog]

    WITH FULLSCAN

    GO

    Results in creation of statistics:

    dbo.SSIS_ExecutionLog

    -> Statistics

    --> _WA_Sys_00000001_398D8EEE

    --> _WA_Sys_00000002_398D8EEE

    dbo.SSIS_ExecutionLog_History

    -> Statistics

    --> _WA_Sys_00000001_3B75D760

    --> _WA_Sys_00000002_3B75D760

    4. Execute view [vw_F_ExecutionLogMerge]: returned 22000 rows in 5 minutes 52 seconds. Execution plan shows table scans across the board.

    5. Drop statistics

    6. Drop the 3 views

    7. Recreate indexes

    Results in creation of indexes and statistics:

    dbo.SSIS_ExecutionLog

    -> Indexes

    --->ExecutionID_Index

    --->HierarchyKeys

    --->LogStatus

    --->PK_SSIS_ExecutionLog_1

    -> Statistics

    --->ExecutionID_Index

    --->HierarchyKeys

    --->LogStatus

    --->PK_SSIS_ExecutionLog_1

    dbo.SSIS_ExecutionLog_History

    -> Indexes

    --->ExecutionID_Index

    --->HierarchyKeys

    --->LogStatus

    --->PK_SSIS_ExecutionLog_History

    -> Statistics

    --->ExecutionID_Index

    --->HierarchyKeys

    --->LogStatus

    --->PK_SSIS_ExecutionLog_History

    8. Execute view [vw_F_ExecutionLogMerge]: returned 22000 rows in 4 minutes 24 seconds. Execution plan shows Index seek on HierarchyKeys and Key lookups against the PK, across the board.

    9. Drop the 3 views

    8. Update statistics - same command as above

    No new statistics created

    9.Recreate the 3 views

    10. Execute view [vw_F_ExecutionLogMerge]: returned 22000 rows in 4 minutes 39 seconds. Execution plan is the same as step 8.

  • In other news, applying steps 1 and 2 above to the development database results in 64,000 rows returned after 9 seconds. It also creates a couple of statistics against each database but I didn't notice it doing that on production until after I ran the update statistics commands.

    So if the unindexed, unstatisticsed performance is still so vastly difference it must be something else?

  • In a further test on production,

    changing the Join in View#3 from a LEFT OUTER JOIN to an INNER JOIN results in 14983 rows returning in 3 seconds.

    In addition the execution plan changes to use a Merge Join right before the final Select. Previously it was doing a Nested Loop.

    So I think that's the difference, but I don't understand why.

    Why would execution plan vary between the two databases?

    Could it be something in the data?

    Is there a query hint to force it?

  • I've made another change, this time to View#1 (a standard restricted self join to eliminate 'duplicates').

    Prior to this change it was possible to have what are effectively duplicate records on one side of the join in View#3.

    In theory, now it should always be either 1 or 0 matches in the join.

    The good news is that View#3 now executes in less than 10 seconds in production. The execution plan is now automatically using the Merge Join rather than Nested Loop.

    Duplicates are present in both production and development so I don't see why this would adversely affect production and not dev.

    Just as a note:

    The 'duplicates' are caused when an SSIS execution calls the OnError event multiple times for a single error. Resulting in an execution entry, with a slightly different timestamp but otherwise the same, for each call (Unfortunately, it's too late to change this behaviour).

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply