January 5, 2011 at 9:50 am
The tables and query built by the code provided below are intended to allow me to track the occurence of specific words listed in the table "badwords" when they occur in a syslog dump to the table (URLTBl).
My questions then are...
How should I index this table to enhance speed of the "charindex" procedure shown below...... on "URL" using a "full text index", or some other way?
And...
Can a trigger be created which will without significantly impacting overall database performance look at each row as it's entered into the URLTbl table by the syslog dump scanning for anything in the badwords table and when found copy the row to another table, or would something like this bog down the system too much?
--CHARINDEX PROCESS FOR FINDING BADWORDS
SELECT
CHARINDEX(RTRIM(B.Words), U.url),
*
FROM
#urltbl U
CROSS JOIN #Badwords AS B
WHERE
MSGDATE >= '12/10/2010' AND
/* using the next day because if you want all of 1/1 and there are any times
included you need to do less than the next day */
MSGDATE < '1/2/2011' AND
CHARINDEX(RTRIM(B.Words), U.url) <> 0
ORDER BY
username,
url
--To create #URLTbl and #BadWords temp tables run this entire section
IF OBJECT_ID('TempDB..#URLTbl','U') IS NOT NULL DROP TABLE #URLTbl
CREATE TABLE #URLTbl
(MsgDate Varchar(1024),
Username Varchar (1024),
URL Varchar (1024),
Rating Varchar (1024))
--===== Setup any special required conditions especially where dates are concerned
SET DATEFORMAT YMD
INSERT INTO #URLTbl(MsgDate,Username,URL,Rating)
SELECT '2010-11-10','webuser\patr','http://view.atdmt.com/action/porn/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-10','webuser\mhoag','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-10','webuser\hlavaty3398n','http://content.murder.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-10','webuser\matth','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news' union all
SELECT '2010-11-10','webuser\patr','http://view.atdmt.com/action/kill/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-10','webuser\mmhoag','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-10','webuser\hmlavaty3398n','http://content.porn.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-10','webuser\mamtth','http://assets3b.msnbc.msn.com/fight/rendering/img/bing.png','news' union all
SELECT '2010-11-10','webuser\patmr','http://view.atdmt.com/action/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-10','webuser\mhoamg','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-10','webuser\hlavamty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-10','webuser\mattmh','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news'union all
SELECT '2010-11-10','webuser\patrm','http://view.atdmt.com/action/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-12','webuser\mhoamg','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-12','webuser\hlavmaty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-12','webuser\mattmh','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news' union all
SELECT '2010-11-15','webuser\patrm','http://view.atdmt.com/action/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-15','webuser\mhoamg','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-15','webuser\hlavmaty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-15','webuser\mattmh','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news'union all
SELECT '2010-11-17','webuser\patr','http://view.atdmt.com/action/porn/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-10','webuser\mhoag','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-17','webuser\hlavaty3398n','http://content.murder.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-10','webuser\matth','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news' union all
SELECT '2010-11-17','webuser\patr','http://view.atdmt.com/action/kill/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-10','webuser\mmhoag','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-21','webuser\hmlavaty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-21','webuser\mamtth','http://assets3b.msnbc.msn.com/fight/rendering/img/bing.png','news' union all
SELECT '2010-11-21','webuser\patmr','http://view.atdmt.com/action/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-10','webuser\mhoamg','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-10','webuser\hlavamty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-10','webuser\mattmh','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news'union all
SELECT '2010-11-21','webuser\patrm','http://view.atdmt.com/action/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-21','webuser\mhoamg','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-10','webuser\hlavmaty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-10','webuser\mattmh','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news' union all
SELECT '2010-11-10','webuser\patrm','http://view.atdmt.com/action/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-10','webuser\mhoamg','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-10','webuser\hlavmaty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-10','webuser\mattmh','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news'union all
SELECT '2010-11-10','webuser\patr','http://view.atdmt.com/action/porn/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-10','webuser\mhoag','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-10','webuser\hlavaty3398n','http://content.murder.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-10','webuser\matth','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news' union all
SELECT '2010-11-10','webuser\patr','http://view.atdmt.com/action/kill/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-10','webuser\mmhoag','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-10','webuser\hmlavaty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-10','webuser\mamtth','http://assets3b.msnbc.msn.com/fight/rendering/img/bing.png','news' union all
SELECT '2010-11-10','webuser\patmr','http://view.atdmt.com/action/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-10','webuser\mhoamg','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-25','webuser\hlavamty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-25','webuser\mattmh','http://assets3bomb.msnbc.msn.com/rendering/img/bing.png','news'union all
SELECT '2010-11-25','webuser\patrm','http://view.atdmt.com/action/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-25','webuser\mhoamg','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-25','webuser\hlavmaty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-25','webuser\mattmh','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news' union all
SELECT '2010-11-25','webuser\patrm','http://view.atdmt.com/action/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-25','webuser\mhoamg','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-25','webuser\hlavmaty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-29','webuser\mattmh','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news'union all
SELECT '2010-11-29','webuser\patr','http://view.atdmt.com/action/porn/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-29','webuser\mhoag','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-29','webuser\hlavaty3398n','http://content.murder.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-29','webuser\matth','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news' union all
SELECT '2010-11-29','webuser\patr','http://view.atdmt.com/action/kill/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-29','webuser\mmhoag','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-29','webuser\hmlavaty3398n','http://content.yieldmanagergun.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-29','webuser\mamtth','http://assets3b.msnbc.msn.com/fight/gun/rendering/img/bing.png','news' union all
SELECT '2010-11-07','webuser\patmr','http://view.atdmt.com/action/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-07','webuser\mhoamg','http://cdn.unicast.msn.com/porn/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-07','webuser\hlavamty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-07','webuser\mattmh','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news'union all
SELECT '2010-11-07','webuser\patrm','http://view.atdmt.com/action/MSN_Homebombpage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-07','webuser\mhoamg','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C181757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-07','webuser\hlavmaty3398n','http://content.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-07','webuser\mattmh','http://assets3b.msnbc.msn.com/rendering/img/bing.png','news' union all
SELECT '2010-11-07','webuser\patrm','http://view.atdmt.com/action/nude/MSN_Homepage_Remessaging_111808/nc?a=1','web banners' UNION ALL
SELECT '2010-11-07','webuser\mhoamg','http://cdn.unicast.msn.com/assets/A1/N6/M8379/C18naked1757/large.jpg?0.5824313560398167','web search' UNION ALL
SELECT '2010-11-07','webuser\hlavmaty3398n','http://content.bomb.yieldmanager.com/ak/q.gif','internet services' UNION ALL
SELECT '2010-11-10','webuser\mattmh','http://assets3b.msnbc.msn.com/fight/rendering/img/bing.png','news'
--Create Badwords table
IF OBJECT_ID('TempDB..#badwords','U') IS NOT NULL DROP TABLE #badwords
CREATE TABLE #Badwords
(Words nchar(25))
--===== Setup any special required conditions especially where dates are concerned
SET DATEFORMAT YMD
--===== Insert the test data into the test table
INSERT INTO #badwords
(words)
SELECT 'porn ' UNION ALL
SELECT 'nude ' UNION ALL
SELECT 'naked ' UNION ALL
SELECT 'gun ' UNION ALL
SELECT 'bomb ' UNION ALL
SELECT 'kill ' UNION ALL
SELECT 'murder '
--===== Set the identity insert back to normal
-- SET IDENTITY_INSERT #badwords ON
January 5, 2011 at 10:04 am
I wouldn't do that in a trigger, usually. I'd be more inclined to use a job with a frequent schedule. Otherwise, you hold up the insert transaction till the trigger is complete, and that can have locking issues that are hard to plan/account for.
Instead of charindex, I've found joining on a Like construct works pretty well.
select *
from Table1
inner join WordsTable
on Table1.Column like '%' + Word + '%';
Try both, see what performs better.
You won't really be able to use a full text index on this, because URLs don't work well with that. They're strings, not sentences or phrases.
Also, you may need to take into account modified word constructs. For example, you include "bomb", but don't seem to have "b0mb". Not sure if that applies to your situation, but when I was generating semi-random marketing codes with alphanumerics, it was one of the bigger issues I had to deal with.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply