August 31, 2004 at 3:40 pm
Hey all,
I am going to try out Full-Text Search that has over over 13 millions row. Is there anything that I need to know ahead? What are the things that I need to be expected, how bad the performance would be.
Is there any good architectural way to support 24x7 shop for text search?
Thank you in advance,
iLoveSQL
September 1, 2004 at 6:20 am
If at all possible use change tracking instead of full or incremental populations.
In your search query limit the amount of rows returned. This is the single greatest thing you can do to improve performance.
Consider partitioning your table to improver performance if you are joining on results coming back from the search with other restrictions.
Hilary Cotter
Looking for a book on SQL Server replication?
http://www.nwsu.com/0974973602.html
September 1, 2004 at 10:41 am
To answer your question directly, yes there is... Specifically, to achieve "High Availability" (HA) or 24x7 support for Full-Text Search (FTS), I'd recommend that you consider implementing SQL Server 2000 in a "Clustered environment" using Microsoft Custer Service (MSCS). SQL Server 2000 BOL title "Before Installing Failover clustering", "Creating a Failover Cluster" and specifically for FTS and MSCS, see "Using SQL Server Tools with Failover Clustering" and "Failover Cluster Troubleshooting"
You should also review the SQL 2000 Full-Text Search whitepaper (I was a *contributor*) KB Article 323739 INF: SQL Server 2000 Full-Text Search Deployment White Paper at http://support.microsoft.com/default.aspx?scid=kb;en-us;323739 for more detailed information regarding FT Indexing a table of 13 million rows. Additionally, you should review the following KB articles related to FTS:
SQL Server 2000 related KB Articles:
------------------------------------
240867 INF: How to Move, Copy, and Backup Full-Text Catalog Folders and Files
http://support.microsoft.com/default.aspx?scid=kb;EN-US;240867
246800 INF: Correctly Parsing Quotation Marks in FTS Queries
http://support.microsoft.com//default.aspx?scid=kb;EN-US;246800
263712 (Q263712) INF: How to Impede Windows NT Administrators from Administering a Clustered SQL Server [Information on BUILTIN\Administrators]
http://support.microsoft.com/default.aspx?scid=KB;EN-US;263712
277549 PRB: Unable to Build Full-Text Catalog After You Modify MSSQLServer Logon Account Through [NT4.0) Control Panel [or Win2K Component Services]
http://support.microsoft.com/default.aspx?scid=KB;EN-US;277549
240833 FIX: Full-Text Search Performance Improved via Support for TOP [Information on Top_N_Rank]
http://support.microsoft.com//default.aspx?scid=kb;EN-US;240833
286787 FIX: Incorrect Results From Full-Text Search on Several Columns [FIX specific to SQL Server 7.0, but is the default for SQL Server 2000]
http://support.microsoft.com/default.aspx?scid=kb;en-us;286787
Regards,
John T. Kane
September 1, 2004 at 10:57 am
This is great, thank you so much!!
BTW, We are already in MSCS so that is down
My one of the biggest concerns are performance degration whie doing indexing and also recovery disaster when index files are corrupted, I will go over the link to see if I can find that answer..
I also found this today
Building Search Applications for the Web Using Microsoft SQL Server 2000 Full-Text Search
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/sql_fulltextsearch.asp
September 1, 2004 at 11:26 am
You're welcome
As you referenced 24x7 support, I had suspected that you already had a clustered environment.
Yes, Full-Text Indexing (FTI) will take some time to complete for a 13 million row table even in a clustered environment as the primary bottleneck with FTI is the number of rows. I'd recommend that you place the FT Catalog on its own controller (and RAID0 or RAID10 disk array) and separate from your database files, if possible. For disaster recovery purposes, you may need to copy the FT Catalog folder and files to a save location via the procedures in KB article 240867. Note, SQL Server 2005 (codename Yukon) does a better job of this as it integrates FT Catalogs with the database backups and restores.
As for "Building Search Applications..." and specifically it's recommendation on implementing "Best Bets", you should be aware that this article recommends creating a duplicate table for the "best bet" documents (or rows) and then executing two simultaneous FTS query stored procs. Considering that you are starting with a 13 million row table and even if 2% of your documents (or rows) are considered "best bet" documents, then a duplicate table containing these rows would have nearly 4 million rows. Both the FT Indexing and FT Search simultaneous against two large FT-enable tables would be both difficult to maintain and may not perform up to your expectations.
However, there is an alternative that I've developed recently that does not require a duplicate table of "best bets", nor an additional table to join with the CONTAINSTABLE or FREETEXTTABLE clause that does not suffer from the above scalability concerns. Only an additional keyword column in the original FT-enabled table is required along with a UDF and a user specific keyword query table. Let me know of you're interested in such a "best bets" solution (mailto:jt-kane@comcast.net).
Regards,
John T. Kane
October 6, 2004 at 4:33 am
Dear expert,
Please kindly help clearing this confuse to me.
I read the FTS white paper and found this lines :
"A full-text search is more efficient at searching large tables (greater than 1 million rows) because it uses a pre-composed index. In comparison, a LIKE predicate must search for the string through the whole table at query time."
In my case, I have a web database that maxium size of tables is only around 1 thoundsands-10 thoundsands rows. I would like to add search function to the website. In your opion, is it more efficient to use FTS in this case instead of normal search?
I am eager to hearing from you.
Thank you very much for your response.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply