Posted to java-user@lucene.apache.org by Eternal Security <ve...@wanadoo.fr> on 2005/01/17 06:15:59 UTC

How to get the "context" of the searched word ?

Hello all

When I search for a word in my documents, I also want to get the "context" of the word, for example the 10 words before and after it,
as search engines like Google or Yahoo do.
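
To make the goal concrete, here is a minimal sketch in plain Java of a "words before and after" window. It does not use Lucene or its highlighter, and the class name ContextWindow and the method contexts are hypothetical, chosen only for illustration. It assumes the text of the document is available as a String and that splitting on whitespace is good enough.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContextWindow {

    /**
     * Returns one snippet per occurrence of term, each holding up to
     * `window` words before and after the match.
     * Illustrative helper only, not part of Lucene.
     */
    public static List<String> contexts(String text, String term, int window) {
        List<String> snippets = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return snippets;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i < words.length; i++) {
            // Drop leading/trailing punctuation so that "dog." matches "dog".
            String bare = words[i].replaceAll("^\\W+|\\W+$", "");
            if (bare.equalsIgnoreCase(term)) {
                int from = Math.max(0, i - window);
                int to = Math.min(words.length, i + window + 1);
                snippets.add(String.join(" ", Arrays.copyOfRange(words, from, to)));
            }
        }
        return snippets;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog.";
        // Two words of context on each side of "fox".
        System.out.println(contexts(text, "fox", 2));
    }
}
```

With window set to 10 this gives the 10 words on each side described above. Note that it still needs the document text from somewhere, which is exactly the problem raised below.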

So I tried the Highlighter module, but the Highlighter needs a String holding the contents.

I use this function:

highlighter.highlightText(myString);

myString has to be the contents of the whole document!

But of course I don't want to store all the contents in the index files!

What can I do?

I'm sure I'm not the first Lucene user who wants to get the "context" of the searched word.
Do I need to use the Highlighter, or do you know a better way?


Thanks in advance.


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: How to get the "context" of the searched word ?

Posted by David Spencer <da...@tropo.com>.
Eternal Security wrote:

> Hello all
> 
> When I search for a word in my documents, I also want to get the "context" of the word, for example the 10 words before and after it,
> as search engines like Google or Yahoo do.
> 
> So I tried the Highlighter module, but the Highlighter needs a String holding the contents.
> 
> I use this function:
> 
> highlighter.highlightText(myString);
> 
> myString has to be the contents of the whole document!
> 
> But of course I don't want to store all the contents in the index files!
> 
> What can I do?
> 
> I'm sure I'm not the first Lucene user who wants to get the "context" of the searched word.
> Do I need to use the Highlighter, or do you know a better way?

That is just the way it is: the highlighter needs the text. You can
always compress the stored content. The only alternative is to enable
term vectors for the body field; then (I think) you don't need to keep
the entire content separately, because it comes from the Lucene index
instead.
To store the term vector, pass true as the third argument here:
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/document/Field.html#Text(java.lang.String,%20java.lang.String,%20boolean)
I believe the index size will increase, but I haven't measured it.
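
As a rough illustration of the "compress the content" option, here is a minimal sketch that uses only the JDK's java.util.zip classes. The class name ContentCompression and its two methods are hypothetical. How the resulting bytes are stored (in the index, a file, or a database) is left open, since that depends on the Lucene version and the application.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ContentCompression {

    /** Compresses document text before it is stored. Illustrative helper only. */
    public static byte[] compress(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Restores the text, for example just before handing it to the highlighter. */
    public static String decompress(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // No progress is possible: the data is truncated or not ours.
                    throw new IllegalArgumentException("incomplete compressed data");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("the quick brown fox jumps over the lazy dog ");
        }
        String original = sb.toString();
        byte[] packed = compress(original);
        System.out.println(original.length() + " chars -> " + packed.length + " bytes");
        System.out.println(decompress(packed).equals(original));
    }
}
```

The savings depend heavily on the text, so it is worth measuring on real documents before deciding between this and term vectors.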

> 
> 
> Thanks in advance.
> 
> 

