Bulk Copy - Problems with Data Merging

  • I am using BCP to import data from text files into a database table. The files contain quite distinct data but it seems at times that the data is being merged from both files into the database table. However,it seems as though this only happens when two files are being processed simultaneously. Is it possible that this is causing the merging? If a BCP process is running and then another is instantiated, is it possible that the merge is taking place due to this behavior? We are using BCP due to the volume of data and the speed but I starting to reconsider using DTS? Is DTS transactional? For example, would a DTS process run to completion before starting another? Any suggestions, ideas or comments would be greatly appreciated. I will provide further information should it be necessary.

  • Use a staging table while loading.

    If you are using bcp from command prompt it would be difficult to make sure when one process is running just not to start another.

    If you use Bulk Copy just check while not exists (Select * from StageTable) to intiate Bulk Copy so that you will not have another load when one load is running.

    Normally DTS needs the same loaction for an Input file (There are workarounds to have differenet files). So if DTS is initiated more than once before completion of the first same file will be loaded again and may cause merging. Or use same thing If not exists from stage table in DTS.

     

    Regards,
    gova

  • Thanks govinn. I am using a staging table to store the data from the bulk copy, and then I use the following to insert the data into the production table:

    CREATE procedure rt_ais_sp

    as

    SET ANSI_NULLS ON

    SET NOCOUNT OFF

    declare @utc varchar(10)

    declare @mid varchar(10)

    declare @callsign varchar(6)

    declare @tlat varchar(8)

    declare @tlng varchar(9)

    declare @vid varchar(6)

    exec delete_rt_ais

    declare @row varchar(400)

    declare e_cursor cursor for select data from rt_aisstaging

                            

    open e_cursor

    fetch next from e_cursor into @row

    WHILE(@@FETCH_STATUS <> -1)

    BEGIN

     IF (@@FETCH_STATUS <> -2)

      BEGIN

      

      if len(@row) <= 16

      BEGIN

      set @mid=substring(@row,1,10)

      set @callsign=substring(@row,11,6)

      -- Commented out on June 21, 2005 to diagnose issues with rt_ais update

      insert into rt_ais(mid,callsign)

      values (@mid,@callsign)

      END

      else

      BEGIN

      set @utc=substring(@row,5,10)

      set @vid=substring(@row,1,4)

      set @tlat=substring(@row,15,8)

      set @tlng=substring(@row,23,9)

          

      insert into rt_ais

      values (@mid,@callsign,@utc,@vid,@tlat,@tlng)

      

      --deletes real time ais data from rt_ais table if mid has changed on current update - June 17, 2005 12:44PM

      delete from rt_ais where callsign=@callsign and mid<>@mid

      

      END

      

      delete from rt_ais

      where mid=@mid and (vid is null or vid='')

       

      fetch next from e_cursor into @row

      END

    end

    close e_cursor

    deallocate e_cursor

    GO

    Could you expand more on how I could use "the check while not exists" that you are referring to above? Thanks.

  • To avaid loading one file until you clear or process the data in staging table you can use following statement. This will not be possible if you use bcp from command prompt. I assume after processing you don't need data. So in stored procedure add delete or truncate rt_aisstaging as last line.

    IF NOT EXISTS (SELECT * FROM rt_aisstaging)

    BEGIN

    BULK INSERT 'database_name'. 'owner'. { rt_aisstaging FROM 'data_file' }

        [ WITH

            (

                [ BATCHSIZE [ = batch_size ] ]

                [ [ , ] CHECK_CONSTRAINTS ]

                [ [ , ] CODEPAGE [ = 'ACP' | 'OEM' | 'RAW' | 'code_page' ] ]

                [ [ , ] DATAFILETYPE [ =

                    { 'char' | 'native'| 'widechar' | 'widenative' } ] ]

                [ [ , ] FIELDTERMINATOR [ = 'field_terminator' ] ]

                [ [ , ] FIRSTROW [ = first_row ] ]

                [ [ , ] FIRE_TRIGGERS ]

                [ [ , ] FORMATFILE = 'format_file_path' ]

                [ [ , ] KEEPIDENTITY ]

                [ [ , ] KEEPNULLS ]

                [ [ , ] KILOBYTES_PER_BATCH [ = kilobytes_per_batch ] ]

                [ [ , ] LASTROW [ = last_row ] ]

                [ [ , ] MAXERRORS [ = max_errors ] ]

                [ [ , ] ORDER ( { column [ ASC | DESC ] } [ ,...n ] ) ]

                [ [ , ] ROWS_PER_BATCH [ = rows_per_batch ] ]

                [ [ , ] ROWTERMINATOR [ = 'row_terminator' ] ]

                [ [ , ] TABLOCK ]

            )

        ]

    END

    Regards,
    gova

  • HI,

    My problem is a little different. I had this "Force Protocol Enable" checked. after that, the BCP "Query out" seems to be failing. But BCP "OUT" works. I tried on my local server, but the reverse happens irrespective of the option in "Force protocol enable". I am not sure if its is because of this but dont know the exact cause. This is a bit urgent and any help in this regards is appreciated. thanks so much...

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply