You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by sudarshan angirash <an...@gmail.com> on 2006/07/12 13:37:07 UTC

query for search through lucene for BLOB

hi all

i have some PDF files stored in Oracle 9i as BLOB.
now i want to search for a string in those pdf files using Lucene. then i
want to show the selected PDF files which contains The String.

if you can give me any pointers about how to do it, then it will be a gr8
help for me.

regards
sudarshan

Re: query for search through lucene for BLOB

Posted by Fernando Engelmann Junior <fe...@softexpert.com>.
I think the best solution for your problem is using the oracle FTS 
engine. It gives you this function. I donĀ“t know if you can do what you 
want directly with Lucene like oracle solution does.


sudarshan angirash wrote:
> hi all
>
> i have some PDF files stored in Oracle 9i as BLOB.
> now i want to search for a string in those pdf files using Lucene. then i
> want to show the selected PDF files which contains The String.
>
> if you can give me any pointers about how to do it, then it will be a gr8
> help for me.
>
> regards
> sudarshan
>


-- 



Atenciosamente
------------------------------------------------------------------------
*Fernando Engelmann Junior*
/Analista de Sistemas
SoftExpert Quality Software
+55 (47) 21019955
http://www.softexpert.com
/
------------------------------------------------------------------------

Re: query for search through lucene for BLOB

Posted by Steven Rowe <sa...@syr.edu>.
Hi Sudarshan,

When your question is Java usage related, you will almost certainly get 
better responses by asking just on the Java User list.  Oddly enough, 
hitting all of the mailing lists for the project at once with the same 
question is likely to *reduce* your chances of getting polite/on-topic 
responses.

To index PDF files, you need to first extract their content, and Lucene 
does not do this for you.  Here is the Lucene FAQ entry on the topic:

<http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-c45f8b25d786f4e384936fa93ce1137a23b7e422>

Chapter 7 of the excellent book Lucene in Action 
<http://lucenebook.com/> also covers this topic.

Once you have extracted the text for a document, you'll want to store a 
key for the document in a separate field in your Lucene index.  Then 
when you have hits from a search, you'll be able to use the DB key to 
retrieve the PDF blob from Oracle, then maybe save the it to a temp file 
and start Adobe Reader to display the doc(s).  Or something like that.

Steve

sudarshan angirash wrote:
> hi all
> 
> i have some PDF files stored in Oracle 9i as BLOB.
> now i want to search for a string in those pdf files using Lucene. then i
> want to show the selected PDF files which contains The String.
> 
> if you can give me any pointers about how to do it, then it will be a gr8
> help for me.
> 
> regards
> sudarshan



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org