dealing with Bulk load data conversion error

  • Hi 
     we have process that load formatted log files into table

     file size 200-300 MB and contain 3000000-4000000 rows

     some of the rows  in each file not correctly formatted and failing load operation 

     I wanted load all VALID rows and sent notification to team if some of rows not loaded 

    create procedure loadmyfiles
    as
    DECLARE @FileToLoad         varchar(100)
    DECLARE  @ErrorLogPath varchar(50)

    DECLARE @sql            varchar(2000)
    DECLARE @UniqueErrorLogFile varchar(50)    

    set @FileToLoad = 'D:\servervisits\FILE_TO_LOAD\servervisits_logs.csv'
    set @ErrorLogPath = 'D:\servervisits\FILE_TO_LOAD_ERRORS'

    --path and name of the errorlog file.Records rejected during Bulk Insert will be saved in this error log file                                                                    
    SET    @UniqueErrorLogFile = @ErrorLogPath +'\'+REPLACE(REPLACE(REPLACE(CONVERT(varchar,GETDATE(), 20),'-',''),':',''),' ','') +'.log'
        
    SELECT @sql = '
        Truncate table [proxy].[servervisits_logs_tmp]

        BULK INSERT [proxy].[servervisits_logs_tmp] FROM ''' + @FileToLoad + '''
                WITH (    FIRSTROW = 1,
                    MAXERRORS = 10,
                    FIELDTERMINATOR = ''^'', ROWTERMINATOR = ''0x0a'',
                    ERRORFILE =''' +@UniqueErrorLogFile + '''
                    ) '

    --EXECUTE(@SQL)

    begin try
        EXECUTE(@SQL)
    end try
    begin catch
    EXEC msdb.dbo.sp_send_dbmail @recipients='team@mydomain.com',
      @subject = 'not all records loaded ',
      @body = 'not all records loaded '

    end catch

    if I run it with  try catch  and some records not loaded , procedure run successfully and does not go to catch block

    if I run procedure  WITHOUT   try catch  and some records not loaded , procedure  load valid rows but return error
    Msg 4864, Level 16, State 1, Line 4
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1855228, column 5 (blocked).

    since I am placing this procedure to run daily by SQL agent   I need somehow avoid  "job failure" notification and simply dispatch message to dev team saying file is loaded but some records not formatted

  • load the file onto a table with a single column (varchar(max) - or whatever size is enough to hold a single record)
    and then do parsing and validation in t-sql - that way you can process the good ones and send the others for cleaning.

  • frederico_fonseca - Friday, November 16, 2018 11:15 AM

    load the file onto a table with a single column (varchar(max) - or whatever size is enough to hold a single record)
    and then do parsing and validation in t-sql - that way you can process the good ones and send the others for cleaning.

    "load the file onto a table with a single column"
    I want to keep this as last option

    Looking more for solution of dealing with output of EXECUTE(@SQL)
    file size might easily grow over the time, number of files generated daily too
    and ERRORFILE create 2 files one with not inserted rows   and another with error details for each row

  • use the try catch block.
    after the try catch do another bulk insert this time of the error file - if any record is loaded you had errors on the original bulk insert and you can send and email. Including, if required/feasible, the rows in error.

  • frederico_fonseca - Friday, November 16, 2018 12:07 PM

    use the try catch block.
    after the try catch do another bulk insert this time of the error file - if any record is loaded you had errors on the original bulk insert and you can send and email. Including, if required/feasible, the rows in error.

    perfect ! it is very elegant solution

  • I do use the error sequestration functionality of BULK INSERT but I hate the cryptic messages it stores.  I've never understood why the messages can't be more like the ones you see on the screen when running the code from things like SSMS.

    To overcome that problem (and I know a lot of people that think xp_CmdShell is a security issue because they don't know how to use it without it actually being a security issue), I call xp_CmdShell to call SQLCMD to run the BULK INSERT command.  I capture the OUTPUT from the xp_CmdShell call and parse that for things like rowcounts, execution time, and those wonderfully human-friendly error messages.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply