Why use cdc.fn_cdc_get_all_changes_?

  • If you look in the cdc.fn_get_all_changes_<capture instance> functions, you'll see something like:

    create function [cdc].[fn_cdc_get_all_changes_dbo_myCaptureInstance]
    (@from_lsn binary(10),
    @to_lsn binary(10),
    @row_filter_option nvarchar(30)
    )
    returns table
    return

    select NULL as __$start_lsn,
    NULL as __$seqval,
    NULL as __$operation,
    NULL as __$update_mask, NULL as [myColumn1], NULL as [myColumn2], NULL as [etc]
    where ( [sys].[fn_cdc_check_parameters]( N'dbo_myCaptureInstance', @from_lsn, @to_lsn, lower(rtrim(ltrim(@row_filter_option))), 0) = 0)

    union all

    select t.__$start_lsn as __$start_lsn,
    t.__$seqval as __$seqval,
    t.__$operation as __$operation,
    t.__$update_mask as __$update_mask, t.[myColumn1], t.[myColumn2], t.[etc]
    from [cdc].[dbo_myCaptureInstance_CT] t with (nolock)
    where (lower(rtrim(ltrim(@row_filter_option))) = 'all')
    and ( [sys].[fn_cdc_check_parameters]( N'dbo_myCaptureInstance', @from_lsn, @to_lsn, lower(rtrim(ltrim(@row_filter_option))), 0) = 1)
    and (t.__$operation = 1 or t.__$operation = 2 or t.__$operation = 4)
    and (t.__$start_lsn <= @to_lsn)
    and (t.__$start_lsn >= @from_lsn)

    union all

    select t.__$start_lsn as __$start_lsn,
    t.__$seqval as __$seqval,
    t.__$operation as __$operation,
    t.__$update_mask as __$update_mask, t.[myColumn1], t.[myColumn2], t.[etc]
    from [cdc].[dbo_myCaptureInstance_CT] t with (nolock)
    where (lower(rtrim(ltrim(@row_filter_option))) = 'all update old')
    and ( [sys].[fn_cdc_check_parameters]( N'dbo_myCaptureInstance', @from_lsn, @to_lsn, lower(rtrim(ltrim(@row_filter_option))), 0) = 1)
    and (t.__$operation = 1 or t.__$operation = 2 or t.__$operation = 4 or
    t.__$operation = 3 )
    and (t.__$start_lsn <= @to_lsn)
    and (t.__$start_lsn >= @from_lsn)

    There is a lot of stuff in there that I don't necessarily want to execute on every row in my resultset, and parts of the UNION I know I will not be accessing. Microsoft recommends accessing CDC information using these functions as opposed to querying the underlying CDC tables directly:

    "We recommend that you do not query the system tables directly. Instead, execute the cdc.fn_cdc_get_all_changes_<capture_instance> and cdc.fn_cdc_get_net_changes_<capture_instance> functions."

    When using these table-valued function, the system function sys.fn_cdc_check_parameters is executed for every row returned even though its input parameters are static. In turn, these functions executes several other authenticating functions such as: sys.fn_cdc_select_accesssys.fn_cd_is_range_valid, sys.fn_cdc_get_role, sys.fn_cdc_get_source & sys.fn_cdc_get_column_list.

    When there are a lot of changes recorded, these [subsequent] function can represent a lot of additional wait time for the query.

    I have a stored procedure which joins a table (User) to the cdc function. A simplified version currently looks [something] like this:

    SELECT
    U.UserID,
    CU.__$update_mask
    FROM
    [User] U
    INNER JOIN cdc.fn_cdc_get_all_changes_dbo_user(@from_lsn, @to_lsn, N'all') CU ON (CU.UserID = U.UserID)
    WHERE
    CU.__$operation = 4 /* Updates Operation */

    My question is: Other than the fact the MS recommends against it, why shouldn't I join to the cdc.<capture_instance>_CT tables directly in my stored procedure which will allow my to avoid all the additional costs of the sys.fn_cdc_check_parameters function on every row? Something like:

    1. SELECT
      U.UserID,
      T.__$update_mask
      FROM
      [User] U
      INNER JOIN [cdc].[dbo_User_CT] T WITH (NOLOCK)
      WHERE
      T.__$operation = 4 /* Updates Operation */
      AND T.__$start_lsn BETWEEN @from_lsn AND @to_lsn

       

     

    As a side, question: While investigating this issue I was looking at the source for master.sys.fn_cdc_check_parameters. In it, there is a query that looks like this:

    SELECT *  
    FROM cdc.[fn_cdc_get_net_changes_...](0X00, 0X01, 'all')

    What does the ellipse mean (...)? I've scene this sort of syntax when specifying any database, any schema, as a string, with OBJECT_ID, but not when specifying a specific table/function.

  • Thanks for posting your issue and hopefully someone will answer soon.

    This is an automated bump to increase visibility of your question.

  • A follow-up to my original post:

    After some more testing, I was able to confirm the SQL optimizer is smart enough not to evaluate a function with static input parameters more than once, working with the underlying tables ([cdc].[dbo_myCaptureInstance_CT]) instead of the table-values function ([cdc].[fn_cdc_get_all_changes_dbo_myCaptureInstance]) has no performance advantage.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply