Posted to java-user@lucene.apache.org by Anton Feldmann <an...@uni-bielefeld.de> on 2006/04/27 17:04:47 UTC

lucene search sentence

Hi

I wrote an indexer that indexes the full contents of a text and puts
each sentence into a separate Document.

"Document document = new Document();
document.add(new Field("contents", reader));

// Split on each period that is followed by whitespace.
// String.split takes a regex, so the period must be escaped.
String[] sentences = contents.split("\\.\\s+");
for (int i = 0; i < sentences.length; i++) {
    Document doc = new Document();
    doc.add(new Field("sentence", sentences[i], Field.Store.YES, Field.Index.TOKENIZED));
}"

1) How do I write a Lucene search and display all the hits in a
document?
2) How do I display the sentence the hit is in, and color the hit?
3) How do I display the sentence before and after the sentence the hit
is in?

Cheers

anton


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: lucene search sentence

Posted by Grant Ingersoll <gs...@syr.edu>.
Anton,

Please don't cross-post "How do I..." questions to the dev list; it
doesn't get you anywhere and just annoys those most likely to help you.

See below.

-Grant
Anton Feldmann wrote:
> 1) How do I write a Lucene search and display all the hits in a
> document?
>   
SpanQuery can give you information about where matches take place.  If 
you are looking for a more basic answer, then refer to the demo on how 
to do a search that returns Hits or the well-written "Lucene In Action".

> 2) How do I display the sentence the hit is in, and color the hit?
>   
Use the Highlighter contrib package.
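
To give a sense of what highlighted output looks like, here is a minimal
plain-Java sketch that wraps each whole-word, case-insensitive occurrence
of a term in <B> tags. It is only a stand-in for illustration and does not
use the Highlighter contrib API; the class name HitMarker and the method
mark are hypothetical. The real Highlighter works from the analyzed query
terms rather than from a raw string match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HitMarker {
    /** Wraps every whole-word, case-insensitive occurrence of term in <B>..</B>. */
    static String mark(String sentence, String term) {
        // Pattern.quote keeps any regex metacharacters in the term literal.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(sentence);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement protects '$' and '\' in the matched text.
            m.appendReplacement(out,
                    Matcher.quoteReplacement("<B>" + m.group() + "</B>"));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: Lucene is a <B>search</B> library.
        System.out.println(mark("Lucene is a search library.", "search"));
    }
}
```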

> 3) How do I display the sentence before and after the sentence the hit
> is in?
>   
Not sure.  You probably need some way of keeping track of where the
sentences occur.  See my previous answer to a similar question you asked
about how to index and search sentences.  Personally, I think you need
a Document per sentence, with some metadata fields recording where
that sentence occurs, but others may have alternative ideas.  You
_could_, instead of naming every field "sentence", have the
field name reflect which sentence it is, along with a catch-all field,
but this would make querying a lot harder.
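
The "Document per sentence plus position metadata" idea can be sketched in
plain Java, without any Lucene calls. The assumption here is that each
sentence's ordinal position would be stored as a metadata field on its
Document, so that a hit at position n lets you fetch positions n-1 and
n+1. The class SentenceContext, its methods, and the naive splitting rule
are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class SentenceContext {
    /** Splits after '.', '!' or '?' followed by whitespace (a naive rule). */
    static List<String> split(String text) {
        List<String> sentences = new ArrayList<String>();
        for (String s : text.trim().split("(?<=[.!?])\\s+")) {
            if (s.length() > 0) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    /** Returns the sentence at pos together with its neighbours, where they exist. */
    static List<String> window(List<String> sentences, int pos) {
        int from = Math.max(0, pos - 1);
        int to = Math.min(sentences.size(), pos + 2); // exclusive upper bound
        return new ArrayList<String>(sentences.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> all = split("One. Two! Three? Four.");
        // Pretend the search hit the sentence stored at position 2.
        System.out.println(window(all, 2));
    }
}
```

The list index plays the role of the stored position field; in an index,
the neighbours would be looked up by querying that field for n-1 and n+1.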


-- 

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
335 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 




Re: lucene search sentence

Posted by Steven Rowe <sa...@syr.edu>.
Anton Feldmann wrote:
> 3) How do I display the sentence before and after the sentence the hit
> is in?

You could:

1. Make your Lucene Document be a set of three sentences (before, 
searchable, after), which you store, but write a custom Analyzer which 
only returns tokens for the "searchable" central sentence.

2. Store the full document contents outside of Lucene, and make your 
Lucene Document be a single sentence, the tokens from which you will 
index, but also include offset and length Fields for the previous and 
next sentences with the Document, corresponding to the windows from the 
full document that you want to display with the hit.  This one will 
likely work better with the Highlighter package.
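
Option 2 can be sketched in plain Java as follows. This is only an
illustration under stated assumptions: it uses java.text.BreakIterator to
find sentence boundaries (the thread does not say how sentences should be
detected), keeps one {start, length} pair per sentence, and derives the
display window from the neighbouring pairs rather than storing separate
previous/next fields. The class SentenceOffsets and its methods are
hypothetical names, and hit is assumed to be a valid sentence index.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SentenceOffsets {
    /** Returns {start, length} for each sentence BreakIterator finds in text. */
    static List<int[]> spans(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        List<int[]> spans = new ArrayList<int[]>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE;
                start = end, end = it.next()) {
            spans.add(new int[] { start, end - start });
        }
        return spans;
    }

    /** Cuts the text from the sentence before hit through the sentence after it. */
    static String window(String text, List<int[]> spans, int hit) {
        int[] first = spans.get(Math.max(0, hit - 1));
        int[] last = spans.get(Math.min(spans.size() - 1, hit + 1));
        return text.substring(first[0], last[0] + last[1]).trim();
    }

    public static void main(String[] args) {
        String text = "First one. Second one. Third one. Fourth one.";
        // Pretend the search hit the sentence at index 1.
        System.out.println(window(text, spans(text), 1));
    }
}
```

The offsets and lengths are what would be stored with each sentence
Document, while the full text itself lives outside the index.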

Steve



RE: lucene search sentence

Posted by Robert Engels <re...@ix.netcom.com>.
Ask the question on the Lucene user list, not the dev list.

And read a book. Read the Javadoc. Read the samples.

-----Original Message-----
From: Anton Feldmann [mailto:anton.feldmann@uni-bielefeld.de]
Sent: Thursday, April 27, 2006 10:05 AM
To: java-dev@lucene.apache.org; java-user@lucene.apache.org
Subject: lucene search sentence



