December 19, 2006 at 3:57 pm
I would like to search PDF documents stored in the database. I'm using SS2k on Windows Server 2003. I downloaded the Adobe PDF IFilter v6.0 and installed it on the database server; however, the installation instructions are written for the Indexing Service, not the MS Search service. I haven't tested the search yet because we currently don't have any PDF documents uploaded to the database table. I would like to know if anyone has installed this filter and whether there is anything additional I need to do for the MS Search service to recognize the PDF filter. Thanks.
December 20, 2006 at 9:40 am
Hi Rob,
Yes, many have successfully used the 32-bit Adobe PDF IFilter with SQL 7.0, SQL 2000 and SQL 2005 on 32-bit WinXP and Win2003. Below is some SQL code that demonstrates how to do this on SQL 2000 with an uploaded PDF file:
use pubs
go
if exists (select * from sysobjects where id = object_id('FTSTable'))
drop table FTSTable
go
CREATE TABLE FTSTable (
KeyCol int IDENTITY (1,1) NOT NULL
CONSTRAINT FTSTable_IDX PRIMARY KEY CLUSTERED,
TextCol text NULL,
ImageCol image NULL,
ExtCol char(3) NULL, -- can be either sysname or char(3)
TimeStampCol timestamp NULL
) ON [PRIMARY]
go
-- Insert data... (Note: Initializing IMAGE column with 0xFFFFFFFF for use with TextCopy.exe)
-- The table was just re-created, so this first row gets KeyCol = 1.
INSERT FTSTable values('Test TEXT Data for row 1', 0xFFFFFFFF, 'pdf', NULL)
go
declare @query varchar(200)
-- Load the PDF file into ImageCol of row 1 !!
-- NOTE: Ensure the correct path for textcopy.exe!!
set @query = 'D:\MSSQL80\MSSQL$SQL80\Binn\textcopy /s '+@@servername+' /u sa /p /d pubs /t FTSTable /c ImageCol /f D:\SQLFiles\Shiloh\<PDF_file>.pdf /i /k 5000 /w "where KeyCol=1"'
print @query
exec master..xp_cmdshell @query
go
-- Select data
SELECT * from FTSTable
go
-- FTI
exec sp_fulltext_database 'enable'
go
exec sp_fulltext_service 'clean_up'
exec sp_fulltext_catalog 'FTSCatalog','create'
exec sp_fulltext_table 'FTSTable','create','FTSCatalog','FTSTable_IDX'
exec sp_fulltext_column 'FTSTable','ImageCol','add', 0x0409, 'ExtCol'
exec sp_fulltext_column 'FTSTable','TextCol','add'
exec sp_fulltext_table 'FTSTable', 'activate'
go
-- Start FT Indexing...
exec sp_fulltext_catalog 'FTSCatalog','start_full'
go
-- Wait for FT Indexing to complete and check NT/Win2K Application log for success/errors..
select * from FTSTable
go
-- Search for search_word_here in the PDF file..
select KeyCol, ImageCol from FTSTable where contains(*,'<search_word_here>') order by KeyCol
go
-- Confirm FT Properties...
use pubs
go
sp_help_fulltext_catalogs 'FTSCatalog'
go
sp_help_fulltext_tables 'FTSCatalog'
go
sp_help_fulltext_columns 'FTSTable'
go
SELECT fulltextcatalogproperty('FTSCatalog', 'PopulateStatus')
go
-- Remove FT Indexes & Catalog & table..
exec sp_fulltext_table 'FTSTable','drop'
exec sp_fulltext_Catalog 'FTSCatalog','drop'
drop table FTSTable
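The "wait for FT Indexing to complete" step in the script above can be scripted rather than watched by hand. This is only a minimal sketch, assuming the FTSCatalog catalog from the script and that it runs before the cleanup statements drop the catalog; the 5-second delay is an arbitrary choice. The status may still read idle for a moment right after start_full, so the Application log remains the place to confirm success or errors.

```sql
use pubs
go
-- PopulateStatus returns 0 when the catalog is idle; any other value means a
-- population is running, paused, or otherwise not finished.
while fulltextcatalogproperty('FTSCatalog', 'PopulateStatus') <> 0
begin
    waitfor delay '00:00:05'   -- arbitrary polling interval (assumption)
end
go
-- Number of full-text indexed items currently in the catalog
select fulltextcatalogproperty('FTSCatalog', 'ItemCount') as ItemCount
go
```

An ItemCount of 0 after the loop exits suggests the row was not indexed, which is worth checking against the Application log before running the CONTAINS query.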
Regards,
John
SQL Full Text Search Blog
http://jtkane.spaces.live.com/
John T. Kane
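For anyone following this thread on SQL Server 2005 rather than SQL 2000: there the full-text engine does not load filters registered with the operating system, such as the Adobe PDF IFilter, until it is told to. The following is a sketch under that assumption. It does not apply to the SQL 2000 setup in the question, and the full-text service may need a restart before the settings take effect.

```sql
-- SQL Server 2005 only: allow filters and word breakers registered with the OS
exec sp_fulltext_service 'load_os_resources', 1
-- Allow unsigned components to load (assumption: the installed PDF filter needs this)
exec sp_fulltext_service 'verify_signature', 0
go
-- List the filter registered for .pdf, if any
select document_type, path, manufacturer
from sys.fulltext_document_types
where document_type = '.pdf'
go
```

If the last query returns no row, the engine does not see a PDF filter, and image columns tagged 'pdf' would not be indexed.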
December 20, 2006 at 2:03 pm
John, thank you very much for your response; it was immensely helpful. I used your example to test the PDF filter and it worked very well. In addition, we have MS SharePoint catalogs that haven't been able to search PDF files for quite a while, and now they can. Thanks again.