Installing the PDF IFilter for full-text searching

  • I would like to search PDF documents stored in the database. I'm using SQL Server 2000 (SS2k) on Windows Server 2003. I downloaded the Adobe PDF IFilter v6.0 and installed it on the database server; however, the installation instructions are written for the Indexing Service, not the MS Search service. I haven't tested the search yet because we don't currently have any PDF documents uploaded to the database table. Has anyone installed this filter, and is there anything additional I need to do for the MS Search service to recognize it? Thanks.

  • Hi Rob,

    Yes, many have successfully used the 32-bit Adobe PDF IFilter with SQL 7.0, SQL 2000 and SQL 2005 on 32-bit WinXP and Win2003. Below is some SQL code that demonstrates how to do this on SQL 2000 with an uploaded PDF file:

    use pubs

    go

    if exists (select * from sysobjects where id = object_id('FTSTable'))

      drop table FTSTable

    go

    CREATE TABLE FTSTable (

      KeyCol int IDENTITY (1,1) NOT NULL

        CONSTRAINT FTSTable_IDX PRIMARY KEY CLUSTERED,

      TextCol text NULL,

      ImageCol image NULL,

      ExtCol char(3) NULL, -- can be either sysname or char(3)

      TimeStampCol timestamp NULL

    ) ON [PRIMARY]

    go

    -- Insert data... (Note: Initializing IMAGE column with 0xFFFFFFFF for use with TextCopy.exe)

    INSERT FTSTable values('Test TEXT Data for row 1', 0xFFFFFFFF, 'pdf', NULL)

    go

    declare @query varchar(200)

    -- Insert the PDF file into ImageCol of row 1 (the table was just created, so the IDENTITY value of its only row is 1)

    -- NOTE: Ensure the correct path for textcopy.exe!!

    set @query = 'D:\MSSQL80\MSSQL$SQL80\Binn\textcopy /s '+@@servername+' /u sa /p /d pubs /t FTSTable /c ImageCol /f D:\SQLFiles\Shiloh\<PDF_file>.pdf /i /k 5000 /w "where KeyCol=1"'

    print @query

    exec master..xp_cmdshell @query

    go

    -- Select data

    SELECT * from FTSTable

    go
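
    -- Optional check (a sketch, not part of the original script): confirm that textcopy

    -- replaced the 4-byte 0xFFFFFFFF placeholder. A PDF file normally starts with the

    -- bytes 0x25504446 ('%PDF'), so FirstBytes should show that value after a good upload.

    SELECT KeyCol, ExtCol, DATALENGTH(ImageCol) AS ImageBytes, SUBSTRING(ImageCol, 1, 4) AS FirstBytes from FTSTable

    go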

    -- FTI

    exec sp_fulltext_database 'enable'

    go

    exec sp_fulltext_service 'clean_up'

    exec sp_fulltext_catalog 'FTSCatalog','create'

    exec sp_fulltext_table 'FTSTable','create','FTSCatalog','FTSTable_IDX'

    exec sp_fulltext_column 'FTSTable','ImageCol','add', 0x0409, 'ExtCol'

    exec sp_fulltext_column 'FTSTable','TextCol','add'

    exec sp_fulltext_table 'FTSTable', 'activate' 

    go

    -- Start FT Indexing...

    exec sp_fulltext_catalog 'FTSCatalog','start_full'

    go

    -- Wait for FT Indexing to complete and check NT/Win2K Application log for success/errors..
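
    -- Optional (a sketch, not part of the original script): poll until the catalog is idle

    -- instead of waiting by hand. PopulateStatus returns 0 when the catalog is idle.

    -- The 5-second delays are arbitrary; the first one gives the population time to start.

    waitfor delay '00:00:05'

    while fulltextcatalogproperty('FTSCatalog', 'PopulateStatus') <> 0

      waitfor delay '00:00:05'

    go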

    select * from FTSTable

    go

    -- Search for search_word_here in the PDF file..

    select KeyCol, ImageCol  from FTSTable where contains(*,'<search_word_here>') order by KeyCol

    go

    -- The same search applies to other document types (e.g. .DOC) if such rows are loaded...

    select KeyCol, ImageCol from FTSTable where contains(*,'<search_word_here>') order by KeyCol

    go
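
    -- Optional (a sketch, not part of the original script): contains(*, ...) searches both

    -- TextCol and ImageCol, so naming ImageCol confirms that the hit comes from the

    -- filtered PDF content and not from the plain-text column.

    select KeyCol, ExtCol from FTSTable where contains(ImageCol,'<search_word_here>') order by KeyCol

    go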

    -- Confirm FT Properties...

    use pubs

    go

    sp_help_fulltext_catalogs 'FTSCatalog'

    go

    sp_help_fulltext_tables 'FTSCatalog' 

    go

    sp_help_fulltext_columns 'FTSTable'

    go

    SELECT fulltextcatalogproperty('FTSCatalog', 'PopulateStatus')

    go

    -- Remove FT Indexes & Catalog & table..

    exec sp_fulltext_table 'FTSTable','drop'

    exec sp_fulltext_Catalog 'FTSCatalog','drop'

    drop table FTSTable

    Regards,

    John

    SQL Full Text Search Blog

    http://jtkane.spaces.live.com/


    John T. Kane

  • John, thank you very much for your response; it was immensely helpful. I used your example to test the PDF filter and it worked very well. In addition, we have MS SharePoint catalogs that hadn't been able to search PDF files for quite a while, and now they can. Thanks again.
