December 1, 2008 at 11:08 pm
Comments posted to this topic are about the item Finding Values With Numerous Columns
---
Timothy A Wiseman
SQL Blog: http://timothyawiseman.wordpress.com/
December 2, 2008 at 2:13 am
Good one ...
December 2, 2008 at 2:25 am
If you are happy to forgo index performance benefits, or you are doing a LIKE comparison anyway, you can create a computed column that concatenates the other columns and then search the computed column.
ALTER TABLE phonebook ADD
    phoneall AS ',' + isnull(phone1, '') + ',' +
        isnull(phone2, '') + ',' +
        isnull(workphone, '') + ',' +
        isnull(cellphone, '') + ','
GO
-- Run the search in a separate batch so the new column is visible
SELECT
    *
FROM
    phonebook
WHERE
    phoneall like '%,234-5678,%'
December 2, 2008 at 2:54 am
This is an interesting article. What is most important is that it explains WHY there can be a problem, for anyone who has never been in a position of trying unsuccessfully to curb developers who possess just a little knowledge of Database Design but who are completely unaware of just how little.
As far as I can see, the solution that Timothy shows will work well in cases where the database has a lot of small spreadsheet-like tables (Timothy seems to have suffered from this), but it will not work for monster tables unless it has some very clever indexing.
There is a different solution that works very well, which I've had to use myself many times, though it is not a good idea for a rapidly changing table. It is, however, very fast where there is lots of text to search in several columns, which would otherwise need a '%xxxx%' WHERE clause (i.e. unindexable!). This is to use an 'inversion' table.
This technique is usually called the 'Inverted' or 'Inversion' index technique (see http://en.wikipedia.org/wiki/Index_(search_engine) for a full discussion).
Basically, you produce a table that contains, uniquely, a list of every word in the columns you want to index. You maintain a many-to-many linking table that links each row in the denormalized table with the words in the unique table that it contains. This gives you, instantly, the rows containing the words in the 'search string' that you want to search for, even in tables that are several million rows long. You can refine your search from that subset, and Timothy's method should work fine for that. I've never tried to make this into a generic solution, as I'm not that brave, nor unfortunate enough to have more than one travesty of a denormalized table like the ones Timothy describes, within a single database!
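As a rough illustration of the shape described above (not the solution promised for the later article), the two extra tables and a lookup might look like the sketch below. All table and column names (Notes, Word, WordOccurrence) are hypothetical, and the step that splits the text into words and populates the tables is left out.

```sql
-- Hypothetical names throughout: Notes is the wide table being searched,
-- Word and WordOccurrence form the inversion index.
CREATE TABLE Notes (
    NoteId   INT           NOT NULL PRIMARY KEY,
    Summary  VARCHAR(200)  NULL,
    Detail   VARCHAR(2000) NULL
);

-- One row per distinct word found in the indexed columns
CREATE TABLE Word (
    WordId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Word   VARCHAR(100)      NOT NULL UNIQUE
);

-- Many-to-many link: which rows contain which words
CREATE TABLE WordOccurrence (
    WordId INT NOT NULL REFERENCES Word (WordId),
    NoteId INT NOT NULL REFERENCES Notes (NoteId),
    PRIMARY KEY (WordId, NoteId)
);

-- Rows containing every word of the search string 'failed backup':
-- an indexed lookup on Word replaces the unindexable LIKE '%...%' scan
SELECT n.*
FROM Notes AS n
WHERE n.NoteId IN (
    SELECT o.NoteId
    FROM WordOccurrence AS o
    JOIN Word AS w ON w.WordId = o.WordId
    WHERE w.Word IN ('failed', 'backup')
    GROUP BY o.NoteId
    HAVING COUNT(DISTINCT w.WordId) = 2   -- both words must be present
);
```

The cost is in maintenance: every insert or update to the indexed columns has to refresh the link table, which is why this suits slowly changing data.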
If anyone is interested, I'll pop the solution into a blog post, but it is too long for a forum post, I reckon..
Best wishes,
Phil Factor
December 2, 2008 at 5:24 am
... or a nice article on SSC, Phil. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
December 2, 2008 at 6:11 am
Thanks for the article as this does come up, especially when dealing with others' legacy databases. I would be interested in seeing this expanded to include text fields, as so often a varchar is used to capture field data, then it overspills into a catch-all notes field.
December 2, 2008 at 6:31 am
Jeff Moden (12/2/2008)
... or a nice article on SSC, Phil. 😉
Speaking for the vast silent majority, I'd like to read Phil's article. 😀
Paul DB
December 2, 2008 at 6:41 am
... or a nice article on SSC, Phil. [Wink]
Speaking for the vast silent majority, I'd like to read Phil's article. [BigGrin]
OK. It's a deal. I needed a good excuse to publish it! 🙂
Best wishes,
Phil Factor
December 2, 2008 at 7:11 am
Just want to point out that ISNUMERIC and ISDATE are not reliable
SELECT ISNUMERIC('12d5'),ISDATE('2000')
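A minimal sketch of one stricter check, assuming the intent is "digits only": a negated character-class LIKE rejects anything that is not 0-9. The @Value variable name is borrowed from the procedure under discussion, and the DigitsOnly test is a swapped-in technique, not part of the original article.

```sql
DECLARE @Value NVARCHAR(4000);
SET @Value = '12d5';

SELECT
    ISNUMERIC(@Value) AS IsNumericSays,   -- 1, because 'd' is read as a float exponent
    CASE
        WHEN @Value <> '' AND @Value NOT LIKE '%[^0-9]%' THEN 1
        ELSE 0
    END AS DigitsOnly;                    -- 0, the string contains a non-digit
```

This deliberately rejects signs, decimal points and currency symbols, so it only fits columns that hold whole non-negative numbers.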
Failing to plan is Planning to fail
December 2, 2008 at 7:55 am
This doesn't work for a LIKE condition, but for equalities in a small number of fields, you can use
WHERE '411-555-1212' IN (HomePhone, WorkPhone, CellPhone, AltPhone)
I've found it useful when looking for a foreign key of a Person Id in several fields like fkLoanOfficerId, fkApprovingOfficerId, fkVerifiedBy etc.
You can't let the perfect be the enemy of the good. In some cases it's more expedient to work with the errors of the past, than to attempt a total rewrite of the structure.
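For anyone who has not seen the value-first IN list before, here is a small sketch comparing it with the equivalent OR chain. The @Contacts table variable and its rows are made up for illustration.

```sql
-- Hypothetical table and sample rows for illustration
DECLARE @Contacts TABLE (
    ContactId INT,
    HomePhone VARCHAR(20),
    WorkPhone VARCHAR(20),
    CellPhone VARCHAR(20),
    AltPhone  VARCHAR(20)
);
INSERT INTO @Contacts VALUES (1, '411-555-1212', NULL, NULL, NULL);
INSERT INTO @Contacts VALUES (2, NULL, '411-555-9999', NULL, NULL);

-- Value-first IN list: the literal is compared against each column in turn
SELECT ContactId
FROM @Contacts
WHERE '411-555-1212' IN (HomePhone, WorkPhone, CellPhone, AltPhone);

-- Equivalent OR chain, which grows with every extra column
SELECT ContactId
FROM @Contacts
WHERE HomePhone = '411-555-1212'
   OR WorkPhone = '411-555-1212'
   OR CellPhone = '411-555-1212'
   OR AltPhone  = '411-555-1212';
```

Both queries return only ContactId 1; NULL columns simply fail to match rather than causing an error.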
December 2, 2008 at 8:17 am
Phil Factor (12/2/2008)
... or a nice article on SSC, Phil. [Wink]
Speaking for the vast silent majority, I'd like to read Phil's article. [BigGrin]
OK. It's a deal. I needed a good excuse to publish it! 🙂
Phil, is it coming in Jan '09?
December 2, 2008 at 9:11 am
Madhivanan and Jason bring up some good points. Formatting! Then if you like LIKE you'll love this. Try a search argument with an embedded RegEx. Search for '*e*' and stand by for a boatload of rows.
I'm pushing to store phone numbers as BigInt. I don't have to dial the dashes, so why should I have to store them? Oh, and three fields too: country code, area code, phone number. OK, I know that there are letters on the phone, but the phone system couldn't care less.
I have customers that use ISO 8601 dates (20081202 for today). Goes in a char(8). On one hand, no messy times. 🙂 A ship date is a ship date. On the other, try doing calculations by week. 🙁 I plan on finding a real good reason for these folks to migrate from 2000 to 2008. The date type has me all a-twitter.
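To make the by-week pain concrete, here is a small sketch of what a char(8) ship date needs before any date arithmetic can happen. The @ShipDate variable is hypothetical; style 112 is the yyyymmdd format for CONVERT.

```sql
DECLARE @ShipDate CHAR(8);
SET @ShipDate = '20081202';

SELECT
    -- Every calculation starts with a conversion out of char(8)
    CONVERT(DATETIME, @ShipDate, 112) AS ShipDateTime,
    -- Week number depends on the session's DATEFIRST setting
    DATEPART(week, CONVERT(DATETIME, @ShipDate, 112)) AS ShipWeek,
    DATEADD(week, 1, CONVERT(DATETIME, @ShipDate, 112)) AS OneWeekLater;
```

Wrapping the column in CONVERT inside a WHERE clause also stops an index on that column from being used for seeks, which is one more argument for a real date type.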
ATB
Charles Kincaid
December 2, 2008 at 9:39 am
I liked the code, so I tweaked it a bit into a SELECT-only script and added column filters,
so you can search only in selected columns (if you know the names ahead of time).
It's funny that we are trying to simulate Full-Text Search.
The sample values set in the script below are for the AdventureWorks2008 DB.
/*
CREATE PROCEDURE [dbo].[FindValue]
@TableName NVARCHAR(128), /* Must be a valid table or view name,
must not be quoted or contain a schema*/
@Value NVARCHAR(4000), /*May contain wildcards*/
@schema NVARCHAR(128) = 'dbo' /*May be left out*/
AS
Sample Execution
Exec FindValue
@TableName = 'spt_monitor',
@Value = '8',
@schema = 'dbo'
*/
/*
If given a string, it finds all rows where any char, varchar, or Unicode
equivalent (nchar, nvarchar) column matches that string in the selected
table or view. Note that this only works on objects which have entries
in information_schema.columns, which excludes certain system objects. If
given a numeric value, it checks those text types for a match as well
as numeric types. If given a possible date, it also checks date types.
The string that is being searched for may contain wildcard characters such as %.
This will NOT search text, ntext, xml, or user defined fields. This may
return a row more than once if the search string is found in more than one
column in that row.
*/
DECLARE
@TableName NVARCHAR(128), /* Must be a valid table or view name,
must not be quoted or contain a schema*/
@Value NVARCHAR(4000), /*May contain wildcards*/
@schema NVARCHAR(128) /*May be left out*/
,@ColumnNames NVARCHAR(4000)-- list of columns to search for, can be * for ALL
SET @schema = 'Person'
SET @TableName = 'Person'
SET @Value = 'Xu%'
SET @ColumnNames = 'FirstName,LastName'-- can be empty or * for ALL columns
SET @ColumnNames = REPLACE(@ColumnNames, ' ', '')-- removes all space
/**************************** Declare Variables ***********************/
DECLARE @columns TABLE (ColumnName NVARCHAR(128))
DECLARE @columnsFiltered TABLE (ColumnName NVARCHAR(128))
DECLARE @sql NVARCHAR(MAX)
/************************** Populate Table Variable *****************/
/*Takes the names of string type columns for the selected table */
INSERT INTO @columns
(ColumnName)
SELECT
Column_name
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
Table_schema = @schema
AND Table_name = @TableName
AND data_type IN ('char', 'nchar', 'varchar', 'nvarchar')
/* If it is numeric, also check the numeric fields */
IF ISNUMERIC(@Value) = 1
INSERT INTO @columns
(ColumnName)
SELECT
Column_name
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
Table_schema = @schema
AND Table_name = @TableName
AND data_type IN ('int', 'numeric', 'bigint', 'money',
'smallint', 'smallmoney',
'tinyint', 'float', 'decimal', 'real')
IF ISDATE(@Value) = 1
INSERT INTO @columns
(ColumnName)
SELECT
Column_name
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
Table_schema = @schema
AND Table_name = @TableName
AND data_type IN ('datetime', 'smalldatetime')
INSERT INTO @columnsFiltered
SELECT ColumnName
FROM @columns
WHERE
(@ColumnNames IN ('*','')
OR
CHARINDEX(',' + ColumnName + ',', ',' + @ColumnNames + ',') > 0
)
/********************* Prepare dynamic SQL Statement to Execute **********/
/* Column names are quoted with QUOTENAME and single quotes in @Value are
doubled so that neither can break out of the generated statement */
SELECT
@sql =
CASE
WHEN @sql IS NULL
THEN 'Select ' + QUOTENAME(ColumnName, '''')
+ ' as ContainingColumn, * From '
+ QUOTENAME(@schema) + '.' + QUOTENAME(@TableName)
+ ' where ' + QUOTENAME(ColumnName) + ' like '''
+ REPLACE(@Value, '''', '''''') + ''' '
WHEN @sql IS NOT NULL
THEN @sql + 'UNION ALL Select ' + QUOTENAME(ColumnName, '''')
+ ' as ContainingColumn, * From ' + QUOTENAME(@schema)
+ '.' + QUOTENAME(@TableName)
+ ' where ' + QUOTENAME(ColumnName) + ' like '''
+ REPLACE(@Value, '''', '''''') + ''' '
END
FROM
@columnsFiltered
/******************* Execute Statement and display results ***********/
--print @sql /* This may be uncommented for testing purposes */
EXEC (@sql)
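One possible variation, sketched here for a single column rather than the full UNION ALL: passing the search value to sp_executesql as a parameter keeps it out of the generated text altogether. This is an alternative to the EXEC (@sql) call above, not part of the posted script, and it assumes the same AdventureWorks Person.Person table.

```sql
DECLARE @Value NVARCHAR(4000), @sql NVARCHAR(MAX);
SET @Value = 'Xu%';

-- The generated text refers to @Value by name instead of embedding its contents
SET @sql = N'Select ''LastName'' as ContainingColumn, * From '
    + QUOTENAME('Person') + N'.' + QUOTENAME('Person')
    + N' where ' + QUOTENAME('LastName') + N' like @Value';

-- Second argument declares the parameter, third supplies its value
EXEC sp_executesql @sql, N'@Value NVARCHAR(4000)', @Value = @Value;
```

Schema, table and column names still have to be concatenated (hence QUOTENAME), since identifiers cannot be passed as parameters.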
December 2, 2008 at 11:11 am
Jeff Moden (12/2/2008)
... or a nice article on SSC, Phil. 😉
I'll hop on the bandwagon and say I would love to see this article, Phil.
And you are quite correct. I originally wrote the procedure to help me deal with large numbers of "spreadsheet-like" tables with swaths of repeating columns. I am not aware of it actually "failing" on a large table, but it can definitely be painfully slow on large tables.
---
Timothy A Wiseman
SQL Blog: http://timothyawiseman.wordpress.com/
December 2, 2008 at 11:56 am
Jason Akin (12/2/2008)
This doesn't work for a LIKE condition, but for equalities in a small number of fields, you can use
WHERE '411-555-1212' IN (HomePhone, WorkPhone, CellPhone, AltPhone)
I've found it useful when looking for a foreign key of a Person Id in several fields like fkLoanOfficerId, fkApprovingOfficerId, fkVerifiedBy etc.
You can't let the perfect be the enemy of the good. In some cases it's more expedient to work with the errors of the past, than to attempt a total rewrite of the structure.
+1. The IN list of columns is far better than a bunch of OR conditions. Likewise, if LIKEs are necessary this would be a better alternative:
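WHERE (ISNULL(HomePhone,'')+'|'+ISNULL(WorkPhone,'')+'|'+ISNULL(CellPhone,'')+'|'+ISNULL(AltPhone,'')) like '%555-1212%'
-- ISNULL keeps one NULL column from turning the whole concatenation NULL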