November 14, 2016 at 1:57 am
Hi,
How can I find all the rows of a database that have a particular pattern: for example: (INC000007615432) or with other words:
Find all columns in a Database, where Column-Properties:
* Len (15) = char (15)
* Starts with "INC"
* After INC following 12 num digits
I have over 100 tables with many columns, and i can not take a look in every table. No Documentation, Columns have diffrent Names 🙁
Regards
Nicole 😉
November 14, 2016 at 2:06 am
info 58414 (11/14/2016)
Hi,How can I find all the rows of a database that have a particular pattern: for example: (INC000007615432) or with other words:
Find all columns in a Database, where Column-Properties:
* Len (15) = char (15)
* Starts with "INC"
* After INC following 12 num digits
I have over 100 tables with many columns, and i can not take a look in every table. No Documentation, Columns have diffrent Names 🙁
Regards
Nicole 😉
Quick questions, do all the columns you are looking for have the same data type? Is the data consistent within those columns, that is will all rows in those columns be of the same pattern?
😎
November 14, 2016 at 2:14 am
Yes, the Column-Value has the same Pattern,
No, the Column Datatype can be diffrent
November 14, 2016 at 2:17 am
Nicole
Those aren't properties of the column: they're properties of the individual values in the columns. Are you saying that you want to find all columns that have at least one value that satisfies those conditions?
John
November 14, 2016 at 2:45 am
No, all Columns have the same Pattern:
Example: INC000001234567
* Len (15) = char (15)
* Starts with "INC"
* After INC following 12 num digits
November 14, 2016 at 3:12 am
OK, this query will look at all columns with data length of 15 or more and show the number of values with the requested pattern. Beware that the INFORMATION_SCHEMA views have been known not to report SCHEMA_NAME correctly; if that's a problem for you, use sys.columns instead. Please remove the spaces that I had to put in in order to get this post through my company's web sweeper.
DEC LARE @SQL nvarchar(max)
SELECT @SQL = STUFF(
(SELECT
'UNI ON ALL SELECT ''' + TABLE_SCHEMA + ''', ''' + TABLE_NAME + ''', ''' + COLUMN_NAME + ''', COUNT(*) FROM ' + TABLE_SCHEMA + '.' + TABLE_NAME
+ ' WHERE ' + COLUMN_NAME + ' LIKE ''INC[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'' '
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE LIKE '%char'
OR DATA_TYPE = 'text'
AND COALESCE(CHARACTER_MAXIMUM_LENGTH,100) >= 16
OR CHARACTER_MAXIMUM_LENGTH = -1 -- "max" data types
FOR XML PATH (''))
,1
,10 -- remove the first UN ION ALL
,''
)
EXEC sys.sp_executesql @statement = @SQL
John
November 14, 2016 at 4:10 am
John Mitchell-245523 (11/14/2016)
OK, this query will look at all columns with data length of 15 or more and show the number of values with the requested pattern. Beware that the INFORMATION_SCHEMA views have been known not to report SCHEMA_NAME correctly; if that's a problem for you, use sys.columns instead. Please remove the spaces that I had to put in in order to get this post through my company's web sweeper.
DEC LARE @SQL nvarchar(max)
SELECT @SQL = STUFF(
(SELECT
'UNI ON ALL SELECT ''' + TABLE_SCHEMA + ''', ''' + TABLE_NAME + ''', ''' + COLUMN_NAME + ''', COUNT(*) FROM ' + TABLE_SCHEMA + '.' + TABLE_NAME
+ ' WHERE ' + COLUMN_NAME + ' LIKE ''INC[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'' '
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE LIKE '%char'
OR DATA_TYPE = 'text'
AND COALESCE(CHARACTER_MAXIMUM_LENGTH,100) >= 16
OR CHARACTER_MAXIMUM_LENGTH = -1 -- "max" data types
FOR XML PATH (''))
,1
,10 -- remove the first UN ION ALL
,''
)
EXEC sys.sp_executesql @statement = @SQL
John
Careful, this query will fail if there are any XML columns in the database.
😎
November 14, 2016 at 4:13 am
Here is a query I often use, adjusted to this requirements
😎
USE TEEST;
GO
SET NOCOUNT ON;
DECLARE @PATTERN NVARCHAR(MAX) = N'INC[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]';
DECLARE @SEARCH_TEMPLATE NVARCHAR(MAX) = N'
UNION ALL
SELECT
''{{@SCHEMA_NAME}}.{{@TABLE_NAME}}'' AS TABLE_NAME
,''{{@COLUMN_NAME}}'' AS COLUMN_NAME
,''{{@DATA_TYPE}}({{@MAX_LENGTH}})'' AS DATA_TYPE
WHERE EXISTS (
SELECT *
FROM {{@SCHEMA_NAME}}.{{@TABLE_NAME}} X
WHERE X.{{@COLUMN_NAME}} LIKE ''{{@PATTERN}}''
)
';
DECLARE @SEARCH_SQL NVARCHAR(MAX) =
STUFF((
SELECT
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(@SEARCH_TEMPLATE,N'{{@SCHEMA_NAME}}',QUOTENAME(OBJECT_SCHEMA_NAME(STAB.object_id)))
,N'{{@TABLE_NAME}}' ,QUOTENAME(STAB.name))
,N'{{@COLUMN_NAME}}',QUOTENAME(SCOL.name))
,N'{{@DATA_TYPE}}' ,STYP.name)
,N'{{@PATTERN}}' ,@PATTERN)
,N'{{@MAX_LENGTH}}' ,CASE
WHEN SCOL.max_length > 0 THEN CONVERT(NVARCHAR(12),SCOL.max_length,0)
ELSE N'MAX'
END)
FROM sys.tables STAB
INNER JOIN sys.columns SCOL
INNER JOIN sys.types STYP
ON SCOL.system_type_id = STYP.system_type_id
ON STAB.object_id = SCOL.object_id
WHERE STAB.type_desc = N'USER_TABLE'
AND STYP.name LIKE N'%CHAR'
AND SCOL.max_length NOT BETWEEN 0 AND 15
FOR XML PATH(''),TYPE).value('(./text())[1]','NVARCHAR(MAX)'),1,11,N'');
EXEC (@SEARCH_SQL);
November 14, 2016 at 4:24 am
Eirikur Eiriksson (11/14/2016)
John Mitchell-245523 (11/14/2016)
OK, this query will look at all columns with data length of 15 or more and show the number of values with the requested pattern. Beware that the INFORMATION_SCHEMA views have been known not to report SCHEMA_NAME correctly; if that's a problem for you, use sys.columns instead. Please remove the spaces that I had to put in in order to get this post through my company's web sweeper.
DEC LARE @SQL nvarchar(max)
SELECT @SQL = STUFF(
(SELECT
'UNI ON ALL SELECT ''' + TABLE_SCHEMA + ''', ''' + TABLE_NAME + ''', ''' + COLUMN_NAME + ''', COUNT(*) FROM ' + TABLE_SCHEMA + '.' + TABLE_NAME
+ ' WHERE ' + COLUMN_NAME + ' LIKE ''INC[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'' '
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE LIKE '%char'
OR DATA_TYPE = 'text'
AND COALESCE(CHARACTER_MAXIMUM_LENGTH,100) >= 16
OR CHARACTER_MAXIMUM_LENGTH = -1 -- "max" data types
FOR XML PATH (''))
,1
,10 -- remove the first UN ION ALL
,''
)
EXEC sys.sp_executesql @statement = @SQL
John
Careful, this query will fail if there are any XML columns in the database.
😎
Surely not? It's only looking at %char and text columns. The worst that could happen is that there's an XML column that has plain (non-XML) data in it and that column gets overlooked, isn't it?
John
November 14, 2016 at 4:29 am
John Mitchell-245523 (11/14/2016)
Eirikur Eiriksson (11/14/2016)
John Mitchell-245523 (11/14/2016)
OK, this query will look at all columns with data length of 15 or more and show the number of values with the requested pattern. Beware that the INFORMATION_SCHEMA views have been known not to report SCHEMA_NAME correctly; if that's a problem for you, use sys.columns instead. Please remove the spaces that I had to put in in order to get this post through my company's web sweeper.
DEC LARE @SQL nvarchar(max)
SELECT @SQL = STUFF(
(SELECT
'UNI ON ALL SELECT ''' + TABLE_SCHEMA + ''', ''' + TABLE_NAME + ''', ''' + COLUMN_NAME + ''', COUNT(*) FROM ' + TABLE_SCHEMA + '.' + TABLE_NAME
+ ' WHERE ' + COLUMN_NAME + ' LIKE ''INC[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'' '
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE LIKE '%char'
OR DATA_TYPE = 'text'
AND COALESCE(CHARACTER_MAXIMUM_LENGTH,100) >= 16
OR CHARACTER_MAXIMUM_LENGTH = -1 -- "max" data types
FOR XML PATH (''))
,1
,10 -- remove the first UN ION ALL
,''
)
EXEC sys.sp_executesql @statement = @SQL
John
Careful, this query will fail if there are any XML columns in the database.
😎
Surely not? It's only looking at %char and text columns. The worst that could happen is that there's an XML column that has plain (non-XML) data in it and that column gets overlooked, isn't it?
John
It will crap out as XML is incompatible with the LEN function. Easily fixed by adding parenthesis to the length portion of the where clause.
😎
WHERE DATA_TYPE LIKE '%char'
OR DATA_TYPE = 'text'
AND (COALESCE(CHARACTER_MAXIMUM_LENGTH,100) >= 16
OR CHARACTER_MAXIMUM_LENGTH = -1)
November 14, 2016 at 4:39 am
Oh, I see, yes - having my ANDs and ORs in a muddle caused XML columns to be included. Good spot!
John
November 14, 2016 at 4:44 am
Another thing to pay attention to is the cost of using COUNT, it implies that the whole table must be scanned and aggregated, much heavier on the database than using EXISTS, here are the aggregated stats from a small test database.
😎
Table Scan Count Logical Reads Physical Reads Read-Ahead Reads LOB Logical Reads LOB Physical Reads LOB Read-Ahead Reads
COUNT 280 508,963 0 0 45,000 0 0
EXISTS 196 68,415 0 0 45,000 0 0
November 14, 2016 at 4:56 am
Yes, although I assume this is a one-off and therefore performance might not be too important. Also, using EXISTS might not meet the requirement. Since we know that all values in the column will have the prescribed format, we want to be able to identify those columns that have only one or two such values. Of course, the foolproof solution would compare the total rows in the table with the number of rows with the prescribed format (and it gets even more complicated when you consider NULLs), but I implicitly left that as an exercise for the reader!
John
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply