March 23, 2014 at 1:27 am
Hi everyone,
We are facing a problem with loading data from .pdf files from a vendor.
The .pdf files have data in tabular format and we would like to insert those fields into a SQL table.
We do not want to insert the physical location of the file, but we need to insert the data within the file.
How can we read a pdf file?
Thanks & Regards
March 24, 2014 at 12:51 am
SSC experts will surely have an answer to this. My wild guess, though, is to convert the PDF to Excel first and then read the Excel file, or to write some code in a language such as Java, or to use third-party software to read the text from the PDF.
Please see if the following helps:
http://stackoverflow.com/questions/4784825/how-to-read-pdf-files-using-java
March 25, 2014 at 7:02 am
March 25, 2014 at 9:04 pm
Confusing Queries (3/23/2014)
Hi everyone,
We are facing a problem with loading data from .pdf files from a vendor.
The .pdf files have data in tabular format and we would like to insert those fields into a SQL table.
We do not want to insert the physical location of the file, but we need to insert the data within the file.
How can we read a pdf file?
Thanks & Regards
If you want to read a PDF file, I think you could use a PDF reading utility. For that part of the question, you may find an answer in this post:
http://www.sqlservercentral.com/Forums/Topic1339455-148-1.aspx
As for inserting data fields that are in tabular format into a SQL table, maybe you can check this post
Hope it offers some useful help. :-D
March 27, 2014 at 6:47 am
In Adobe:
File>Save As>Text
PDF table will convert like this:
Arizona
5
Alabama
4
Kansas
9
Missouri
3
Montana
2
Read the text file and parse it out.
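For what it's worth, here is a rough Python sketch of the "parse it out" step, assuming the saved text really does alternate between a label line and a numeric line as in the sample above. The table name `state_counts`, the column names `state` and `qty`, and the helper names are all made up for illustration, and `sqlite3` is only a stand-in so the sketch runs on its own; against SQL Server you would connect through a driver such as pyodbc and keep the same parameterized insert idea.

```python
import sqlite3

# Sample text in the shape shown above: label line, then value line.
SAMPLE = """Arizona
5
Alabama
4
Kansas
9
Missouri
3
Montana
2
"""


def parse_pairs(text):
    """Pair up alternating label/value lines into (label, int) rows."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 2 != 0:
        # An odd count means the label/value pattern broke somewhere.
        raise ValueError("expected an even number of non-blank lines")
    # Even positions are labels, odd positions are their values.
    return [(label, int(value)) for label, value in zip(lines[0::2], lines[1::2])]


def load_rows(conn, rows):
    """Insert parsed rows into a hypothetical state_counts table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS state_counts (state TEXT, qty INTEGER)"
    )
    # Parameterized insert; values are never concatenated into the SQL string.
    conn.executemany(
        "INSERT INTO state_counts (state, qty) VALUES (?, ?)", rows
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the real database
    load_rows(conn, parse_pairs(SAMPLE))
    for row in conn.execute("SELECT state, qty FROM state_counts"):
        print(row)
```

Real vendor files will probably be messier (multi-word cells, page headers, wrapped rows), so the pairing rule would need adjusting to whatever the exported text actually looks like.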
Or, you might look at this application: WinAutomation. It is macro software that can read and write to a SQL database. It's an excellent application; I've used it to do some web scraping and store the results in SQL.
March 27, 2014 at 11:20 am
This is just a shot in the dark, but try Googling or Binging the following:
+"sql server" +iFilter +PDF +text filestream semantic
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
March 28, 2014 at 11:45 am
On a non-technical level, has anyone asked the vendor what other formats they can send the data in? PDF is a print format for humans; having a computer pull data out of it is less than ideal compared to getting a fixed width text file, a delimited text file, or a variety of other formats.