'R' script to remove non-matching characters

  • I'm trying to produce an 'R' script to remove non-matching characters from the input:

    CREATE TABLE [mydata] ([data] [varchar](25))

    INSERT INTO [mydata] VALUES
    ('Lose the question mark?')
    GO

    EXEC [sys].[sp_execute_external_script]
    @language=N'R',
    @script=N'
    outData <- subset(inData, sub("[^\x21,\x23-\x26,\x28-\x2b,\x2d-\x3e,\x40-\x5b,\x5d-\x5f,\x61-\x7e]", "", data))',
    @input_data_1 = N'SELECT [data] FROM [mydata]',
    @input_data_1_name = N'inData',
    @output_data_1_name = N'outData'
    WITH
    RESULT SETS (AS OBJECT [mydata]);
    GO

    DROP TABLE [mydata]
    GO

    However the script is returning:

    (1 row affected)
    Msg 39004, Level 16, State 20, Line 6
    A 'R' script error occurred during execution of 'sp_execute_external_script' with HRESULT 0x80004004.
    Msg 39019, Level 16, State 2, Line 6
    An external script error occurred:
    Error in subset.data.frame(inData, sub("[^!,#-&,(-+,-->,@-[,]-_,a-~]", :
    'subset' must be logical
    Calls: source ... withVisible -> eval -> eval -> subset -> subset.data.frame

    Error in execution. Check the output for more information.
    Error in eval(ei, envir) :
    Error in execution. Check the output for more information.
    Calls: runScriptFile -> source -> withVisible -> eval -> eval -> .Call
    Execution halted

    Note that the REGEX string works fine in my SSIS Script Component but I need this functionality abstracted into the T-SQL command.

    Any ideas how to get this functionality?

    • This topic was modified 3 years, 7 months ago by  Pete Bishop. Reason: Further detail added
  • Thanks for posting your issue and hopefully someone will answer soon.

    This is an automated bump to increase visibility of your question.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply