How to do image field content search in SQL

  • Hi everyone,
     
     
    I have a silly question about SQL: I wonder whether this can be done with existing techniques.
     
    I have a binary field (e.g. Image) in SQL where I need to store image files (scanned from original documents). I don't think it is possible, but my boss wants me to give him at least an alternative solution: can I search the text in this field?
     
    To make it clear:
     
    I scan a paper document and get a JPEG file. The original document contains both printed text and handwriting, and the word "Bush" appears in it. I store this JPEG file in a field (of type Image) in a SQL database. Now I want to search for "Bush" in that field in the database.
     
    Can I do that?
     
    I told him it is impossible, but he is looking for some kind of alternative solution, and money is not a concern.
     
    I also told him I can search text: if you put a description alongside the image field, I can search those descriptions. But he wants to search the whole document.
     
    Can I do some sort of recognition to extract the content from the scanned JPEG, and then store the extracted content in SQL so that I can search it?
     
     
    Thanks.
     
     
  • Your timing couldn't be much better:

    OCR (optical character recognition)

    This is not a final answer but you'll know what you're looking for at least.

  • Thanks for your quick response.

    Do you know what file type I get from scanning and OCR? A Word .doc? If I then save that file as an image object in SQL, how do I search its content?

    Or do I just need the OCRed plain text, saved in a varchar field and searched there?

  • You can always open a doc file, select all the text, and insert it into a text field, which you can then index in a full-text catalog. Once you have the text of the document, you're fine. Now the question is, do you still have those docs stored somewhere in a backup?
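
The approach discussed so far (keep the scan for download, store the OCRed text next to it, and search the text) could be sketched roughly as follows. This is only an illustration under several assumptions: it uses Python's built-in `sqlite3` module as a stand-in for the real database, a plain `LIKE` search instead of a full-text catalog, and a hypothetical `fake_ocr` function in place of a real OCR engine. The table and column names (`scans`, `ocr_text`, and so on) are invented for the example. Note that OCR engines are generally far less reliable on handwriting than on printed text, so the handwritten parts of a scan may not become searchable this way.

```python
import sqlite3
from typing import Callable, List


def create_schema(conn: sqlite3.Connection) -> None:
    # Keep the original scan (for download) next to the text extracted from it.
    conn.execute(
        """
        CREATE TABLE scans (
            id        INTEGER PRIMARY KEY,
            file_name TEXT NOT NULL,
            image     BLOB NOT NULL,
            ocr_text  TEXT NOT NULL
        )
        """
    )


def store_scan(
    conn: sqlite3.Connection,
    file_name: str,
    image_bytes: bytes,
    ocr: Callable[[bytes], str],
) -> int:
    # Run OCR once at insert time so that searches never touch the image bytes.
    text = ocr(image_bytes)
    cur = conn.execute(
        "INSERT INTO scans (file_name, image, ocr_text) VALUES (?, ?, ?)",
        (file_name, image_bytes, text),
    )
    return cur.lastrowid


def search_scans(conn: sqlite3.Connection, word: str) -> List[str]:
    # Escape LIKE wildcards so the search term is matched literally.
    escaped = word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT file_name FROM scans WHERE ocr_text LIKE ? ESCAPE '\\' ORDER BY id",
        (f"%{escaped}%",),
    )
    return [row[0] for row in rows]


def fake_ocr(image_bytes: bytes) -> str:
    # Hypothetical stand-in for a real OCR engine: returns canned text per "image".
    canned = {
        b"scan-1": "Memo regarding President Bush",
        b"scan-2": "Invoice for office chairs",
    }
    return canned.get(image_bytes, "")


conn = sqlite3.connect(":memory:")
create_schema(conn)
store_scan(conn, "memo.jpg", b"scan-1", fake_ocr)
store_scan(conn, "invoice.jpg", b"scan-2", fake_ocr)

matches = search_scans(conn, "Bush")
print(matches)  # ['memo.jpg']
```

The point of the design is that the search runs only against the text column, while the image column is never inspected. In SQL Server, the `LIKE` query would presumably be replaced by a query against a full-text index on the text column, as suggested above.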

  • Yes, those docs will still be necessary for people to download, so they will be stored in the database too.

  • So what do you have access to exactly?

  • Luckily, SQL Server 2005 reportedly will have a new operator LooksLike that will find images that look like other images or items described in strings. For example:

    Select * from myScans

    where myImage LOOKSLIKE 'banana'

    would find all pictures where there is a banana filling most of the frame, while

    Select * from myScans

    where myImage LOOKSLIKE '%banana%'

    would find a smaller banana embedded within the image.

    Complex search criteria are also supported:

    Select * from myScans

    where myImage LOOKSLIKE '%a red or green bicycle with a horn%'

  • hahahaha

    Careful, someone might lose their job if they took that back to their boss with a straight face, only to find their colleagues laughing at them!

    Full marks for the serious-looking examples, though.

    My HRM database's most popular query was....

    select * from myScans

    where myImage LOOKSLIKE '%former%dba%'

    would find all former DBAs, as well as those who were former programmers, ice cream salesmen, weight loss exercise equipment advertisement folk, etc., and who are now DBAs.
