Extract Data from PDF

  • I need to extract data from a PDF file into a SQL Server table.

    Can I add a reference to Acrobat Reader and parse through the PDF?

    Has anyone done this kind of thing? Any code samples?

    Thanks

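  • PDF files are built on a 7-bit ASCII syntax, but they are not plain text files: the stream data inside them is often binary or compressed. You can open one in any editor or word processor such as WordPad or Microsoft Word, but the content is only readable where the streams have not been compressed.

    Outside of stream data, lines are conventionally kept to 255 characters or fewer (older versions of the PDF specification recommended this limit).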

    Every line ends with a carriage return, a line feed or a carriage return followed by a line feed (depending upon the application or platform used to create the PDF file).

    PDF syntax is case sensitive.

    People have extracted the data using various third-party tools, then loaded it into SQL Server.

    This may give you some ideas on the way you need to think about the data.

    Mark "Studdy"
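
    As a rough illustration of the points above, here is a minimal Python sketch that reads the raw bytes of a PDF and reports what is visible without any PDF library. The function name describe_pdf and the keys it returns are made up for this example, and the checks are simple byte searches rather than a real parser, so treat the result as a hint only.

```python
import re


def describe_pdf(data: bytes) -> dict:
    """Report a few facts that are visible in the raw bytes of a PDF.

    This is a byte-level heuristic, not a parser: it only looks for
    well-known keywords in the file.
    """
    # A PDF normally starts with a header such as "%PDF-1.4".
    header = re.match(rb"%PDF-(\d+\.\d+)", data)
    return {
        "version": header.group(1).decode("ascii") if header else None,
        # Each stream object is closed by the keyword "endstream".
        "stream_count": len(re.findall(rb"\bendstream\b", data)),
        # /FlateDecode marks streams compressed with zlib/deflate.
        "uses_flate": b"/FlateDecode" in data,
    }
```

    To try it, read the file in binary mode (for example open(path, "rb").read()) and pass the bytes in. If uses_flate comes back True, the text will not be readable in an editor until the streams are decompressed.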

  • Hi Mark,

    Thanks for the reply. The PDF file I'm dealing with is compressed, so I can't do a straight file read.

    I did find some code at http://www.codeguru.com/forum/archive/index.php/t-26418.html that works well at getting the data.

    Scott
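
    For compressed files like the one described above, one possible approach (not the code from the linked thread) is to inflate the streams yourself. The Python sketch below assumes the streams use the common /FlateDecode filter, which is plain zlib. The helper names inflate_streams and text_fragments are hypothetical, and the text extraction only handles simple "(text) Tj" operators; it ignores font encodings, TJ arrays, and other filters, so a real PDF library will be more reliable for anything complex.

```python
import re
import zlib

# Everything between the "stream" keyword and the next "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)

# A literal string followed by the Tj (show text) operator, e.g. "(Hello) Tj".
TJ_RE = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")


def inflate_streams(data: bytes) -> list[bytes]:
    """Return the decompressed body of every stream that zlib can inflate."""
    results = []
    for match in STREAM_RE.finditer(data):
        raw = match.group(1)
        try:
            # decompressobj ignores bytes after the end of the zlib data,
            # such as the line break that precedes "endstream".
            results.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not a Flate stream (an image, another filter, etc.): skip it.
            continue
    return results


def text_fragments(content: bytes) -> list[bytes]:
    """Pull the literal strings shown by simple Tj operators."""
    return TJ_RE.findall(content)
```

    Each fragment returned by text_fragments could then be cleaned up and inserted into the SQL Server table with whatever database driver you already use.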
