October 28, 2005 at 1:26 pm
Probably we can't trick the nature, but anyhow. Is there any way (tips) to optimize query with LIKE, i.e. select id from table1 where url like '%strtosearch%' ? Sometimes this query is pretty fast but mainly... Unfortunately need to add % to both sides of searching string. Or the only decision is to use Full Text Search? Thanks
October 28, 2005 at 1:29 pm
Full-text search uses like 'string%'. What do you need to search exactly?
October 28, 2005 at 1:48 pm
The reason its slow is because by using a wildcard in the beginning of the string eliminates all chance of using an index to eliminate any candidate rows.
Full text indexing is faster in some respects, but it takes additional work to implement, but i'm not sure havn't used it yet
October 28, 2005 at 1:54 pm
Table I mentioned (pretty big ~15 million rows) has url column varchar(800). Query is looking for records (id) where this url contains some string (sometimes 15-20 characters). I know that wildcard from both side is a real pain but, unfortunately, can not get rid of them. There is nonclustered index on url field but I am not sure it helps in this query. Probably, we can not do a lot in this case. Thanks.
October 28, 2005 at 1:57 pm
It probably doesn't apply in your case but if a clustered index is placed on the field that has the LIKE condition it works quite well because clustered indexes are good with range finds.
If your table is fairly static and you need an additional clustered index (only one allowed per table) then consider creating a view and putting a clustered index on that.
I would read up about indexed views first because there are some additional caveats on them such as the various options such as SET ANSI_NULLS etc.
October 28, 2005 at 2:01 pm
Can you split the url into its parts.
protocol = https://
DomainName = http://www.SubDomain.DomainName
Folders = new tables with all the folders
FileName = Whatever.aspx
parameters = new table or string with the parameters.
That way you could index all the parts of the urls and do search on indexed columns (assuming the search can be modified to fit this design).
October 28, 2005 at 2:02 pm
I can't call this table static as new records with new url are inserted pretty often. Thanks
October 28, 2005 at 2:08 pm
Probably later we can think about it (problem is that string can be from domain, folder, file in your design approach). But currently I have what I have- whole url. Thanks
October 28, 2005 at 2:12 pm
No offense but that data is pretty easy to split (assuming the url is in a valid url format). And then the search engine wouldn't be too hard to create, but I don't know the rest of the requirements so I have no clue how well this would work.
October 28, 2005 at 2:51 pm
And you cannot have any other filter criteria in this query besides the url filter criteria ? Can't you restrict based on another filter criteria that is on an indexed column, something like:
select id
from table1
where url like '%strtosearch%'
and mod_date_time > @start_date
and mod_date_time < @end_date
or maybe a status or some other column.
That would prevent it from going through all the records...are you really interested in looking through all the 15 million+ rows ? The only other solution is what Remi suggested i.e. break the URL into parts and store as computed columns that are then indexed to make the access faster.
October 28, 2005 at 3:02 pm
Just a warning: url parsing can be a tricky business specially in shops that use url remapping
* Noel
October 28, 2005 at 3:04 pm
Care to elaborate?
October 28, 2005 at 3:13 pm
Sure!
There are many websites that for security reasons perform remaping of the urls depending to what pages you want to access and those modules are usually plugged in as http modules. It makes it harder for hackers to figure out the asp or php pages that are being called. Then if you come up with a logic to perform the "unmapping" you will also need to update your code every time the map function changes
* Noel
October 28, 2005 at 3:15 pm
I see .
October 28, 2005 at 3:18 pm
Sorry guys, didn't see new posts. Anyhow, the actual query is a join:
select t1.id
from table1 t1
inner join table2 t2 on t2.id = t1.contentid
where t1.custid = @custid and t1.deleted = 'N' and t1.statusid != @newstatus
and t2.url like @urlstr
Table t2 has ~15 million row, clustered index on id; Table t1 has ~80,000 rows, nonclustered index on url. Just playing with indexes
Thanks
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply