June 7, 2005 at 3:49 pm
Does anybody have a stored procedure, an algorithm or know of any articles of how one would remove rows based on a table of stored words? So if there is a bad word or phrase in the table then the row is eliminated or marked as vulgar. The project is to remove vulgarities from a list of Email addresses.
Thanks for the info,
Steve
June 7, 2005 at 4:29 pm
Is there one specific column in the table that would potentially have the vulgar word, or is it spread out among multiple columns? It seems like a pretty straight-forward task either way - but the solution is going to be different.
You could use a while loop to scan through the rows and check for any that match the condition is LIKE '%' + TableWithVulgarWords.VulgarWordColumn + '%'. When the condition is met, you could move the record(s) over to the quarantine area for further review.
Cris
June 8, 2005 at 6:28 am
Just one column and your suggestion was what I was thinking of doing. I just didn't want reinvent the wheel if there is some useful advise.
Thanks
Steve
June 8, 2005 at 6:58 am
CREATE TABLE BadWords(Word varchar(4) PRIMARY KEY)
INSERT BadWords
SELECT 'bad'
UNION ALL SELECT 'word'
UNION ALL SELECT '#$!*'
CREATE TABLE Contacts(Id smallint IDENTITY PRIMARY KEY, Email varchar(80))
INSERT Contacts(Email)
SELECT 'sbad@opus.com'
UNION ALL SELECT 'good@egad.com'
UNION ALL SELECT '#$!*@cupro.net'
UNION ALL SELECT 'hiram@jpvh.net'
DELETE c
FROM Contacts c CROSS JOIN BadWords w
WHERE c.Email LIKE '%' + w.Word + '%'
--Jonathan
June 8, 2005 at 7:05 am
Wow, so simple.......!!!!!!!!
June 8, 2005 at 7:27 am
My solution so far is to create a table like this
CREATE TABLE [T_BAD_WORD] (
[BAD_ID] [int] NOT NULL ,
[BAD_WORD] [varchar] (255) NOT NULL
) ON [PRIMARY]
GO
Then Create a series of select and UNION them like this
Select 'SELECT [pers_id], , [email_invalid] FROM <my target table> Where like ''%' + [BAD_WORD] + '%'' UNION' FROM [dbo].[T_BAD_WORD]
Run that big UNION and get the list. Then they can be made invalid.
Now, the next question I'm being asked, "Is there a standard vulgarities list?". I am using the list we use to screen web sites and spam.
What a way to start the morning!
June 8, 2005 at 7:30 am
I don't have nor know where to get that list, but I think that if you convert the cross join from a delete to a select, that you will get your results much faster.
June 13, 2005 at 8:40 am
We do something very similar for the comments for our surveys. We do a cross join with our table of bad words. Also we have a field called ReplaceWith where we put in the characters or notation that we want to replace the bad word with. (i.e. Smack get replaced with [inappropriate] and other words will get replaced with [profanity], [insult], [sexual/hot comment] or whatever else the client wants us to replace it with.
To the best of my knowledge, there isn't a standard profanity list. We have built ours over time. I can send you our list if you are interested. But you probably already have a good start with the Anti-Spam list.
We are in the process of modifying our so that we can check for profane words in other languages too.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply