December 23, 2006 at 11:12 am
Hi All,
I have a table that stores ASCII text in a column of type text in a SQL 2000 database. I need to compare new records with existing in order to avoid duplicate entries. The table has slightly over 20,000 records and grows at a rate of about 15-20 records per day.
I understand that SQL Server 2000 doesn't allow something like
if (NOT Exists(Select ID from the_table where the_text = the_text))
How can I compare text columns to avoid duplicates?
Thanks,
Steve
December 23, 2006 at 4:38 pm
Have you tried this script in the library:
http://www.sqlservercentral.com/scripts/viewscript.asp?scriptid=1527
December 26, 2006 at 3:35 pm
Thanks for your response Diane but I don't think this is giving me what I need. Of the 20,000+ records in this table, only about 14,000 are unique. However, using the function generated by the script you suggested creates an interger value which is unique for each record. In other words, it doesn't identify any duplicates in the table.
select distinct dbo.fn_CheckSum_Text(my_text) from my_table
returns every record in the table.
Am I doing something wrong or is there another way check text fields to ensure that duplicate records are prevented?
Steve
December 27, 2006 at 8:20 am
Not especially efficient, but you can do
WHERE CHARINDEX(, ) = 1
December 27, 2006 at 8:24 am
Hmmn, I put the greater-than / less-than brackets in last post, they were stripped out....
WHERE CHARINDEX(The_Text, TextCol) = 1
December 27, 2006 at 9:36 am
I assume that the text you are comparing could be greater then 8000 char. If not, a redefinition to varchar(8000) could solve your problem.
or you could compare the first 8000 char using substring
December 28, 2006 at 7:00 am
Some of the records are considerably more than 8,000 characters .
December 28, 2006 at 12:51 pm
Then you could always break the statement into multiple filters
where substring(a.text, 1, 8000) = substring( b.text, 1, 8000)
and substring(a.text, 8001, 8000) = substring( b.text, 8001, 8000)
and substring(a.text, 16001, 8000) = substring( b.text, 16001, 8000)
(dont forget the isnull around each or a check that datalength of each is right for substring size)
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply