
Posted to java-user@lucene.apache.org by Cescy <29...@qq.com> on 2011/02/03 18:30:24 UTC

Some Problem with Lucene in Java

Hi,


I am developing an advanced PDF search engine in Java using PDFBox and Lucene. I need to display the context of each keyword in the user interface, but I cannot find a method to do so. Most of the methods provided deal with the whole content of a document's field, whereas I only need the context around each keyword (i.e. a specific part of the content in the specified field).


Is there any way to do this?


Thx.


Cescy

Re: Some Problem with Lucene in Java

Posted by Felipe Lobo <fe...@jusbrasil.com.br>.

If I understand your question correctly, you want to generate a snippet for
each result document.
You can do something like the code below:

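import java.io.StringReader;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleSpanFragmenter;

// Score fragments by how well they match the query.
QueryScorer scorer = new QueryScorer(query);
Highlighter highlighter = new Highlighter(scorer);
highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
// The field must be stored in the index so its text can be read back.
String text = document.getField(fieldName).stringValue();
TokenStream tokenStream = analyzer.tokenStream(fieldName, new StringReader(text));
// Joins the best fragments with TOKEN_DELIMITER; may throw IOException or InvalidTokenOffsetsException.
String snippet = highlighter.getBestFragments(tokenStream, text, NUM_FRAGMENTS, TOKEN_DELIMITER);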
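
To make the idea of a "snippet" concrete, here is a minimal plain-Java sketch that does not use Lucene at all: it finds each case-insensitive occurrence of a keyword in a string and returns the surrounding characters. This is only an illustration of the concept, not how the Highlighter works internally (the Highlighter operates on analyzed tokens and scores fragments against the query). The class name KeywordContext, the method contexts, and the radius parameter are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeywordContext {

    /**
     * Returns one snippet per case-insensitive occurrence of keyword,
     * keeping up to radius characters of context on each side.
     * An ellipsis marks a side where the text was cut off.
     */
    public static List<String> contexts(String text, String keyword, int radius) {
        List<String> snippets = new ArrayList<>();
        if (keyword.isEmpty()) {
            return snippets; // nothing to search for
        }
        // Pattern.quote treats the keyword as a literal, not as a regex.
        Matcher m = Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE)
                .matcher(text);
        while (m.find()) {
            int start = Math.max(0, m.start() - radius);
            int end = Math.min(text.length(), m.end() + radius);
            String prefix = start > 0 ? "..." : "";
            String suffix = end < text.length() ? "..." : "";
            snippets.add(prefix + text.substring(start, end) + suffix);
        }
        return snippets;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";
        for (String snippet : contexts(text, "fox", 6)) {
            System.out.println(snippet);
        }
    }
}
```

For real search results the Highlighter is the better choice, since it respects the analyzer's tokenization and handles multi-term and phrase queries, which this character-window sketch does not.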






-- 
Felipe Lobo
www.jusbrasil.com.br