Use cursor to run dynamic SQL to find special characters used in database

  • I need to find all uses of special characters in a database. I used the following code to do this:

    USE dbName

    GO

    IF OBJECT_ID('tempdb.dbo.#Results') IS NOT NULL DROP TABLE #Results

    GO

    CREATE TABLE #Results (

    TableName varchar(100) NULL

    ,ColumnName varchar(100) NULL

    ,PrimaryKeyColumn varchar(100) NULL

    ,ColumnValue varchar(8000) NULL

    ,PrimaryKeyValue varchar(8000) NULL

    ) ON [PRIMARY]

    GO

    DECLARE @CheckTables table

    (TableName varchar(100)

    ,ColumnName varchar(100)

    ,PrimaryKeyColumn varchar(100) )

    INSERT INTO @CheckTables

    SELECT t.name AS TableName, c.name AS ColumnName, B.COLUMN_NAME AS PrimaryKeyColumn

    FROM sys.columns c

    INNER JOIN sys.tables t ON c.object_id = t.object_id

    INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS A ON t.name = A.TABLE_NAME

    INNER JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE B ON A.CONSTRAINT_NAME = B.CONSTRAINT_NAME

    WHERE c.system_type_id IN (35,167,231) -- text, varchar and nvarchar data types

    --AND t.[name] IN ('Table1', 'Table2', 'Table3')

    AND A.CONSTRAINT_TYPE = 'PRIMARY KEY'

    ORDER BY t.name, c.name

    DECLARE db_cursor CURSOR FOR

    SELECT TableName

    ,ColumnName

    ,PrimaryKeyColumn

    FROM @CheckTables

    DECLARE @TableName varchar(100),

    @ColumnName varchar(100),

    @PrimaryKeyColumn varchar(100),

    @sql nvarchar(1000);

    OPEN db_cursor

    FETCH NEXT FROM db_cursor INTO @TableName,@ColumnName,@PrimaryKeyColumn

    WHILE @@FETCH_STATUS = 0

    BEGIN

    BEGIN

    SET NOCOUNT ON;

    SELECT @sql = N'INSERT INTO #Results SELECT ''' + @TableName + ''', ''' + @ColumnName + ''', ''' + @PrimaryKeyColumn + ''', ' + @ColumnName + ', ' + @PrimaryKeyColumn + ' FROM ' + @TableName + ' WHERE ' + @ColumnName + ' LIKE ' + '''%[^ a-zA-Z0-9]%'''

    EXEC sp_executesql @sql

    END

    -- Fetch the next item

    FETCH NEXT FROM db_cursor INTO @TableName,@ColumnName,@PrimaryKeyColumn

    END

    CLOSE db_cursor

    DEALLOCATE db_cursor

    SELECT * FROM #Results ORDER BY TableName, ColumnName

    DROP TABLE #Results

    GO

    This will check all tables in the database, but if you want to check specific tables you can uncomment the line in the where clause and specify tables to be checked. The query will return any text fields that have any characters other than letters, numbers or spaces.

    This code works fine for me because all the tables in my database have single column primary keys. However I know how much Jeff Moden hates cursors or RBAR queries, so my question is could this have been done by any method other than using a cursor?

  • Nah. A cursor here is just fine. It's what they were meant to do. You could improve it a bit by turning it into a firehose cursor, though.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Cursoring through the tables is fine, but you should process all columns in a single table in one pass (scan), not a separate pass for each column.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (4/17/2015)


    Cursoring through the tables is fine, but you should process all columns in a single table in one pass (scan), not a separate pass for each column.

    Agreed. iTVFs rule here.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks for the advice Jeff and Scott. I did not know what a firehose cursor was. From what I have read I think I just need to add FAST_FORWARD in the declaration statement, i.e. DECLARE db_cursor CURSOR FAST_FORWARD FOR...

    In terms of the Inline Table-Valued Function, I am not sure what that would look like. I will do some more reading on that.

    I also had to add square brackets around the table and column names when querying all tables in the database to prevent errors from tables that have reserved words such as BREAK.

  • Doesn't have to be an iTVF, you can still use dynamic SQL, just process all columns in a given table in one statement, not multiple.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Hi Scott. What would the SQL statement look like to process all the columns for a table in one statement? The database I am querying has 683 tables containing 5812 text or varchar type columns. The number of columns to check for each table ranges from 1 to 343.

    My query worked for 98% of the tables then just near the end it failed because one text field had more than 8000 characters. I realised I don't actually need to check notes type fields (datatype TEXT) so I removed that from the query, so it now only checks varchar and nvarchar types and runs succesfully. I excluded the drop statement for the temporary table so I can query the results. It took just under 8 minutes to run for the entire database and returned over 4 million rows. The code currently is the following:

    USE DBName

    GO

    IF OBJECT_ID('tempdb.dbo.#Results') IS NOT NULL DROP TABLE #Results

    GO

    CREATE TABLE #Results (

    TableName varchar(100) NULL

    ,ColumnName varchar(100) NULL

    ,PrimaryKeyColumn varchar(100) NULL

    ,ColumnValue varchar(8000) NULL

    ,PrimaryKeyValue varchar(8000) NULL

    ) ON [PRIMARY]

    GO

    DECLARE @CheckTables table

    (TableName varchar(100)

    ,ColumnName varchar(100)

    ,PrimaryKeyColumn varchar(100) )

    INSERT INTO @CheckTables

    SELECT t.name AS TableName, c.name AS ColumnName, B.COLUMN_NAME AS PrimaryKeyColumn

    FROM sys.columns c

    INNER JOIN sys.tables t ON c.object_id = t.object_id

    INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS A ON t.name = A.TABLE_NAME

    INNER JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE B ON A.CONSTRAINT_NAME = B.CONSTRAINT_NAME

    WHERE c.system_type_id IN (167,231) -- varchar and nvarchar data types

    --AND t.[name] IN ('Table1', 'Table2', 'Table3','Table4')

    AND A.CONSTRAINT_TYPE = 'PRIMARY KEY'

    ORDER BY t.name, c.name

    DECLARE db_cursor CURSOR FAST_FORWARD FOR

    SELECT TableName

    ,ColumnName

    ,PrimaryKeyColumn

    FROM @CheckTables

    DECLARE @TableName varchar(100),

    @ColumnName varchar(100),

    @PrimaryKeyColumn varchar(100),

    @sql nvarchar(4000);

    OPEN db_cursor

    FETCH NEXT FROM db_cursor INTO @TableName,@ColumnName,@PrimaryKeyColumn

    WHILE @@FETCH_STATUS = 0

    BEGIN

    BEGIN

    -- SET NOCOUNT ON added to prevent extra result sets from

    -- interfering with SELECT statements.

    SET NOCOUNT ON;

    SELECT @sql = N'INSERT INTO #Results SELECT ''' + @TableName + ''', ''' + @ColumnName + ''', ''' + @PrimaryKeyColumn + ''', [' + @ColumnName + '], [' + @PrimaryKeyColumn + '] FROM [' + @TableName + '] WHERE [' + @ColumnName + '] LIKE ' + '''%[^ a-zA-Z0-9]%'''

    EXEC sp_executesql @sql

    END

    -- Fetch the next item

    FETCH NEXT FROM db_cursor INTO @TableName,@ColumnName,@PrimaryKeyColumn

    END

    CLOSE db_cursor

    DEALLOCATE db_cursor

    SELECT * FROM #Results ORDER BY TableName, ColumnName

    --DROP TABLE #Results

    GO

  • If the code ran in just 8 minutes, and you like it and results, it's logical for you to leave it as is.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Scott,

    I'm curious as well as to what the SQL would be to handle all columns at once.

    Been thinking about it -- not too hard -- and coming up with a bunch of ORs...

  • I used CASE rather than ORs ;-).

    SET NOCOUNT ON

    --Temp table DROP/CREATE, in alpha order--------------------------------------------------------------------------------

    IF OBJECT_ID('tempdb.dbo.#CheckTables') IS NOT NULL DROP TABLE #CheckTables

    IF OBJECT_ID('tempdb.dbo.#Results') IS NOT NULL DROP TABLE #Results

    CREATE TABLE #CheckTables (

    TableName varchar(100)

    ,ColumnName varchar(100)

    ,PrimaryKeyColumn varchar(100)

    ,PRIMARY KEY (TableName, ColumnName)

    )

    CREATE TABLE #Results (

    TableName varchar(100) NULL

    ,ColumnName varchar(100) NULL

    ,PrimaryKeyColumn varchar(100) NULL

    ,ColumnValue varchar(8000) NULL

    ,PrimaryKeyValue varchar(8000) NULL

    ) ON [PRIMARY]

    --Script-Control Variables----------------------------------------------------------------------------------------------

    DECLARE @PrintSql bit

    DECLARE @ExecSql bit

    DECLARE @PrintDebugInfo smallint

    SET @PrintSql = 0

    SET @ExecSql = 1

    SET @PrintDebugInfo = 0

    --Main Code-------------------------------------------------------------------------------------------------------------

    DECLARE @TableName varchar(100)

    DECLARE @ColumnName varchar(100)

    DECLARE @PrimaryKeyColumn varchar(100)

    DECLARE @TableName_Previous varchar(100)

    DECLARE @PrimaryKeyColumn_Previous varchar(100)

    DECLARE @sql varchar(max)

    DECLARE @Sql2 varchar(max)

    DECLARE @Sql3 varchar(max)

    INSERT INTO #CheckTables

    SELECT ca1.TableName, c.name AS ColumnName, INDEX_COL(ca1.TableName, i_pk.index_id, 1) AS PrimaryKeyColumn

    FROM sys.columns c

    CROSS APPLY (

    SELECT OBJECT_NAME(c.object_id) AS TableName

    ) AS ca1

    INNER JOIN sys.indexes i_pk ON

    i_pk.object_id = c.object_id AND

    i_pk.is_primary_key = 1

    WHERE

    c.system_type_id IN (167,231) -- varchar and nvarchar data types

    --AND ca1.TableName IN ('Table1', 'Table2', 'Table3','Table4')

    ORDER BY ca1.TableName, c.name

    IF @PrintDebugInfo > 0

    SELECT * FROM #CheckTables

    DECLARE table_cursor CURSOR LOCAL FAST_FORWARD FOR

    SELECT TableName, ColumnName, PrimaryKeyColumn

    FROM #CheckTables

    ORDER BY TableName, ColumnName

    OPEN table_cursor

    SET @TableName_Previous = '' --special value to skip processing the first time thru

    WHILE 1 = 1

    BEGIN

    --non-matching value to allow the previous table to complete processing before exiting

    SET @TableName = '~~'

    FETCH NEXT FROM table_cursor INTO @TableName, @ColumnName, @PrimaryKeyColumn

    IF @PrintDebugInfo >= 50

    PRINT 'Table=' + @TableName + '; Column=' + @ColumnName + '; PrimaryKeyColumn=' + @PrimaryKeyColumn + '.'

    IF @TableName <> @TableName_Previous

    BEGIN

    IF @PrintDebugInfo >= 10

    PRINT 'Entering table name change logic.'

    IF @TableName_Previous > ''

    BEGIN

    SET @sql =

    'INSERT INTO #Results ( TableName, ColumnName, PrimaryKeyColumn, ColumnValue, PrimaryKeyValue ) ' +

    'SELECT ' +

    '''$table$'' AS TableName, ' +

    'ct.ColumnName, ' +

    '''$PrimaryKey$'' AS PrimaryKeyColumn, ' +

    'CASE ct.ColumnName ' + @Sql2 + ' ELSE ''<no_match?>'' END AS ColumnValue, ' +

    'main.[$PrimaryKey$] AS PrimaryKeyValue ' +

    'FROM [$table$] AS main ' +

    'INNER JOIN #CheckTables ct ON ct.TableName = ''$table$'' ' +

    'WHERE 1 = CASE ct.ColumnName ' + @Sql3 + ' ELSE 0 END '

    SET @sql = REPLACE(REPLACE(@Sql,

    '$table$', @TableName_Previous),

    '$PrimaryKey$', @PrimaryKeyColumn_Previous)

    IF @PrintSql = 1

    PRINT @sql

    IF @ExecSql = 1

    EXEC(@Sql)

    END --IF

    SET @TableName_Previous = @TableName

    SET @PrimaryKeyColumn_Previous = @PrimaryKeyColumn

    SET @Sql2 = ''

    SET @Sql3 = ''

    END --IF

    IF @@FETCH_STATUS = -1

    BREAK

    IF @@FETCH_STATUS = -2

    CONTINUE

    SELECT @Sql2 = @Sql2 + REPLACE(' WHEN ''$column$'' THEN LEFT([$column$], 8000)', '$column$', @ColumnName)

    SELECT @Sql3 = @Sql3 + REPLACE(' WHEN ''$column$'' THEN

    CASE WHEN main.[$column$] LIKE ''%[^ a-zA-Z0-9]%'' THEN 1 ELSE 0 END', '$column$', @ColumnName)

    END --WHILE

    DEALLOCATE table_cursor

    SET NOCOUNT OFF

    SELECT *

    FROM #Results

    ORDER BY TableName, ColumnName

    DROP TABLE #CheckTables

    DROP TABLE #Results

    --End:Main Code---------------------------------------------------------------------------------------------------------

    GO

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Thank you Scott. I had to replace TableName with TblName becuase some audit tables had columns with that name, resulting in ambiguous column name errors. The query then ran successfully, although it took longer to run than my original query (10 min 52 sec vs. 7 min 41 sec).

  • ScottPletcher (4/20/2015)


    Doesn't have to be an iTVF, you can still use dynamic SQL, just process all columns in a given table in one statement, not multiple.

    True Dat! 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I prefer not to drop the results of the #Results temporary table at the end of the script (run it manually when finished with the data). Then you can query it and produce more user-friendly results for the end users. For example, copy all the primary key values for one table to a text editor, replace the carriage returns with commas, put brackets on each end and then you have the values to use for a WHERE clause (i.e. WHERE <PrimaryKeyColumn> IN ...

    You can then include other data from the same or related tables.

  • Of course you can build your query using the #Results table itself in a subquery or EXISTS clause rather than using a text editor. Just saying it maybe useful to keep for a while rather than deleting the table.

  • Thanks Scott,

    Very useful.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply