October 11, 2005 at 1:19 pm
I need to extract data from a PDF file into a SQL Server table.
Can I reference Acrobat Reader and use it to parse through the PDF?
Has anyone done this kind of thing? Any code samples?
Thanks
October 12, 2005 at 8:06 am
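PDF files are built on a 7-bit ASCII syntax, so their structure can be read in any editor or word processor such as Microsoft Word or WordPad. The page text itself is only readable that way if the content streams have not been compressed; compressed streams show up as binary data.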
Outside of stream data, lines in a PDF have traditionally been limited to 255 characters.
Every line ends with a carriage return, a line feed or a carriage return followed by a line feed (depending upon the application or platform used to create the PDF file).
PDF syntax is case sensitive.
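A quick way to see which situation a given file is in is to look at its raw bytes. The following is a minimal Python sketch, not a PDF parser: `describe_pdf` is a hypothetical helper name, and it only checks for the `%PDF-` header, the `/FlateDecode` filter name that marks zlib-compressed streams, and the number of bytes outside the 7-bit range.

```python
def describe_pdf(data: bytes) -> dict:
    """Report a few surface facts about raw PDF bytes (no real parsing)."""
    return {
        # Every PDF starts with a header such as %PDF-1.4
        "has_pdf_header": data.startswith(b"%PDF-"),
        # /FlateDecode in the file suggests zlib-compressed streams
        "uses_flate": b"/FlateDecode" in data,
        # Bytes above 127 mean the file is not pure 7-bit ASCII
        "non_ascii_bytes": sum(1 for b in data if b > 127),
    }
```

If `uses_flate` is true, opening the file in a text editor will show the structure but not the page text.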
People have extracted data using various third-party tools, then pulled it into SQL Server.
This may give you some ideas on the way you need to think about the data.
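Once the text is out of the PDF, loading it is an ordinary parameterized insert. The sketch below is only an illustration of that step: it uses Python's built-in `sqlite3` as a stand-in for SQL Server, and the table name `pdf_lines`, its columns, and the helper `load_lines` are all made up for the example. A SQL Server driver would need its own connection setup.

```python
import sqlite3

def load_lines(conn, lines):
    """Insert non-empty extracted lines into a (hypothetical) pdf_lines table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pdf_lines (line_no INTEGER, content TEXT)"
    )
    # Keep the original line position so rows can be traced back to the source
    rows = [
        (i, text.strip())
        for i, text in enumerate(lines, start=1)
        if text.strip()
    ]
    # Parameterized insert; never build the SQL by string concatenation
    conn.executemany(
        "INSERT INTO pdf_lines (line_no, content) VALUES (?, ?)", rows
    )
    conn.commit()
    return len(rows)
```

For example, `load_lines(sqlite3.connect(":memory:"), ["Invoice 42", "", "Total 10.00"])` stores two rows and skips the blank line.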
Mark "Studdy"
October 12, 2005 at 8:18 am
Hi Mark,
Thanks for the reply. The PDF file I'm dealing with is compressed, so I can't do a straight file read.
I did find some code at http://www.codeguru.com/forum/archive/index.php/t-26418.html that works well at getting the data.
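Independently of the linked code, here is a minimal Python sketch of one common way to get past the compression. It assumes the streams use the `/FlateDecode` filter (zlib), which is typical but not guaranteed, and `inflate_streams` is a hypothetical helper name. It does not interpret the PDF text operators; it only returns the decompressed stream bytes.

```python
import re
import zlib

# Capture everything between the "stream" and "endstream" keywords
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)

def inflate_streams(data: bytes) -> list:
    """Return the decompressed bytes of every zlib-compressed stream found."""
    results = []
    for match in STREAM_RE.finditer(data):
        raw = match.group(1)
        try:
            # decompressobj ignores the end-of-line bytes that trail the data
            results.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not a Flate stream (image data, another filter, etc.); skip it
            continue
    return results
```

The decompressed content streams still contain PDF drawing operators such as `Tj` around the text, so a further parsing step is needed before the values are ready for a table.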
Scott