Batch Processing in SSIS

  • Post the DDL for the source table including the key and index definitions.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Hi,

    Here is the DDL of the table.

    CREATE TABLE [dbo].[t_adr_rpt_doc](

    [rpt_id] [bigint] NULL,

    [rpt_doc_tx] [varbinary](max) NULL,

    [lst_upd_ts] [datetime] NOT NULL,

    [lst_upd_user_id] [char](8) NOT NULL

    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    Index:

    CREATE CLUSTERED INDEX [idx_adr_rpt_doc_01] ON [dbo].[t_adr_rpt_doc]

    (

    [rpt_id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    Thanks

  • You mention in your first post " process 500 records." I'd lile to know more what you mean by process.Are you transforming the data in the foreach loop? There is not enough information on here to give a sure answer to your question. Is all the data on one server? Does the source data have primary keys? Will the data in the destination ever need updating from the source (in this case the isTransferred bit field would not server your purpose)? So you see where I am going with this? We need to know what you have in detail and what is the business objective you are trying to accomplish. In the interim I can offer a suggestion to use a staging table to move the raw data and then have a MERGE command used in a stored procedure. Why only 500 rows at a time?

    ----------------------------------------------------

  • preethi.kaliyamoorthy (5/14/2015)


    Hi,

    Here is the DDL of the table.

    CREATE TABLE [dbo].[t_adr_rpt_doc](

    [rpt_id] [bigint] NULL,

    [rpt_doc_tx] [varbinary](max) NULL,

    [lst_upd_ts] [datetime] NOT NULL,

    [lst_upd_user_id] [char](8) NOT NULL

    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    Index:

    CREATE CLUSTERED INDEX [idx_adr_rpt_doc_01] ON [dbo].[t_adr_rpt_doc]

    (

    [rpt_id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    Thanks

    Unfortunately there are no unique keys defined and the best candidate based on the names and types is rpt_id, but that accepts NULL.

    What do these return?

    select count(*) from [dbo].[t_adr_rpt_doc] where rpt_id is not null;

    select count(*) from [dbo].[t_adr_rpt_doc] where rpt_id is null;

    select count(distinct rpt_id) from [dbo].[t_adr_rpt_doc];

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Hi Orlando,

    rpt_id doesnt contain any NULL value.

    select count(*) from [dbo].[t_adr_rpt_doc] where rpt_id is not null;

    count:396004

    select count(*) from [dbo].[t_adr_rpt_doc] where rpt_id is null;

    count:0

    select count(distinct rpt_id) from [dbo].[t_adr_rpt_doc];

    count:396004

    Thanks

  • The database will not enforce it but your data supports the ideas of using rpt_id as a unique key.

    So, just keep track of the rpt_id values you have already processed and skip those rows reach time you run your import job.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Hi MMartin1,

    The first post is not by me.

    I will explain it in detail. I have this t_adr_rpt_doc table source table which has no primary key but a index on rpt_id. This table contains BLOB data in the column rpt_doc_tx.

    I need to load this table's data to another sql server table. Both the tables are in different servers. My dataflow task has a source and destination. Thats it. It is only loading some 40,000 records and after that getting buffer error. I searched in google and tried to increase DefaultBufferSize and decrease DefaultBufferMaxRows. But I got following error everytime:

    "Information: The buffer manager detected that the system was low on virtual memory, but was unable to swap out any buffers. 4 buffers were considered and 4 were locked. Either not enough memory is available to the pipeline because not enough is installed, other processes are using it, or too many buffers are locked. "

    And I came across this post. Since some 40,000 records are getting transferred I thought I could do it in a ForLoop and process 20,000 rows everytime. But I shouldnt delete the records from my source, since this has to happen everyday i.e loading from source to destination table. So only I asked how to mark the rows as transferred.

    If you have any idea, please help me. Ask me if you want more details.

    Thanks

Viewing 7 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply