
Posted to java-user@lucene.apache.org by ma...@yahoo.co.uk on 2004/04/09 00:09:32 UTC

Highlighter package v2 RC1

I've reworked the highlighter package to address some issues (inability to pass field names to analyzers,
limiting tokenization of large docs) and have refactored it to be more modular, so that folks
can provide alternative implementations of the main functions (tokenizing, fragmenting and scoring) if required.
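
As a rough illustration of what that kind of modularity can look like, here is a
stdlib-only sketch in which fragmenting and scoring are pluggable and the tokenizing
step is a plain whitespace split. It is not the package's API: HighlightSketch,
Fragmenter, TokenScorer, SENTENCES, termScorer and bestFragment are all hypothetical
names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch only; none of these names come from the highlighter package.
public class HighlightSketch {

    /** Pluggable step: cuts the text into candidate fragments. */
    public interface Fragmenter { List<String> fragments(String text); }

    /** Pluggable step: scores a single token. */
    public interface TokenScorer { float score(String token); }

    /** One possible fragmenter: split after sentence-ending periods. */
    public static final Fragmenter SENTENCES = text -> {
        List<String> out = new ArrayList<>();
        for (String s : text.split("(?<=\\.)\\s+")) {
            if (!s.isEmpty()) out.add(s);
        }
        return out;
    };

    /** One possible scorer: 1.0 for a query term (case-insensitive), 0.0 otherwise. */
    public static TokenScorer termScorer(Set<String> lowerCaseTerms) {
        return token -> lowerCaseTerms.contains(token.toLowerCase(Locale.ROOT)) ? 1f : 0f;
    }

    /** Returns the highest-scoring fragment with hits wrapped in <B> tags, or "" if nothing scores. */
    public static String bestFragment(String text, Fragmenter fragmenter, TokenScorer scorer) {
        String best = "";
        float bestScore = 0f;
        for (String fragment : fragmenter.fragments(text)) {
            StringBuilder marked = new StringBuilder();
            float total = 0f;
            // Tokenizing step: whitespace split, punctuation stripped before scoring.
            for (String word : fragment.split("\\s+")) {
                float s = scorer.score(word.replaceAll("\\W", ""));
                total += s;
                if (marked.length() > 0) marked.append(' ');
                marked.append(s > 0 ? "<B>" + word + "</B>" : word);
            }
            if (total > bestScore) {
                bestScore = total;
                best = marked.toString();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String text = "Kennedy spoke today. The weather was fine. Kennedy met Kenneth.";
        Set<String> terms = new java.util.HashSet<>(java.util.Arrays.asList("kennedy", "kenneth"));
        System.out.println(bestFragment(text, SENTENCES, termScorer(terms)));
    }
}
```

Swapping in a different Fragmenter (say, fixed-size windows) or TokenScorer leaves
bestFragment untouched, which is the point of keeping the steps separate.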

This is not backwards compatible with earlier releases, but this new version should hopefully
provide a much more robust framework going forward.
If people feel comfortable with this version, I am happy to put it in the sandbox.
Any feedback is appreciated.

Code here:
http://www.inperspective.com/lucene/highlighter2/highlighter2.zip

Javadocs here:
http://www.inperspective.com/lucene/highlighter2/index.html

Quick code example:

  IndexSearcher searcher = new IndexSearcher(ramDir);
  Query query = QueryParser.parse("Kenne*", FIELD_NAME, analyzer);
  // reader is an IndexReader open on the same index (setup not shown)
  query = query.rewrite(reader); // required to expand search terms
  Hits hits = searcher.search(query);

  Highlighter highlighter = new Highlighter(new QueryScorer(query));
  for (int i = 0; i < hits.length(); i++)
  {
    String text = hits.doc(i).get(FIELD_NAME);
    TokenStream tokenStream = analyzer.tokenStream(FIELD_NAME, new StringReader(text));
    // Get the 3 best fragments and separate them with "..."
    String result = highlighter.getBestFragments(tokenStream, text, 3, "...");
    System.out.println(result);
  }
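
On why the rewrite step matters: "Kenne*" is a prefix, not a term, so until it is
expanded there are no concrete terms to look for in the text. The sketch below shows
roughly what such an expansion amounts to, using a sorted set as a stand-in for the
index's term list. It is not Lucene's implementation; PrefixExpansionSketch and
expand are hypothetical names, and the term list is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch only; a TreeSet stands in for the index's sorted term list.
public class PrefixExpansionSketch {

    /** Returns every indexed term that starts with the prefix, in sorted order. */
    public static List<String> expand(String prefix, SortedSet<String> indexedTerms) {
        List<String> matches = new ArrayList<>();
        // tailSet starts at the first term >= prefix; matching terms are contiguous from there.
        for (String term : indexedTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last term sharing the prefix
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(
                Arrays.asList("keen", "kennedy", "kenneth", "kent", "lucene"));
        System.out.println(expand("kenne", terms)); // prints [kennedy, kenneth]
    }
}
```

The expanded terms are what a term-based scorer can then match against each token.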


Cheers
Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

Re: Highlighter package v2 RC1

Posted by Erik Hatcher <er...@ehatchersolutions.com>.

On Apr 8, 2004, at 8:19 PM, Stephane James Vaucher wrote:
> btw, is there a reason why it's called hilighter and not highlighter?

'cause I'm stupid?!  :/

I've just recommitted with the right name.

	Erik



Re: Highlighter package v2 RC1

Posted by Stephane James Vaucher <va...@cirano.qc.ca>.

Seems like a cool addition.

btw, is there a reason why it's called hilighter and not highlighter?

sv

On Thu, 8 Apr 2004, Erik Hatcher wrote:

> Mark,
> 
> I have committed the code in your .zip file to the Lucene sandbox under 
> contributions/hilighter.
> 
> I'm now going to post a vote over on the -dev list for you to become a 
> committer in that repository.
> 
> Many thanks for this awesome contribution!
> 
> 	Erik

Re: Highlighter package v2 RC1

Posted by Erik Hatcher <er...@ehatchersolutions.com>.

Mark,

I have committed the code in your .zip file to the Lucene sandbox under 
contributions/hilighter.

I'm now going to post a vote over on the -dev list for you to become a 
committer in that repository.

Many thanks for this awesome contribution!

	Erik


On Apr 8, 2004, at 6:09 PM, markharw00d@yahoo.co.uk wrote:

> I've reworked the highlighter package to address some issues 
> (inability to pass fieldnames to analyzers,
> limiting tokenization of large docs) and have refactored it to be more 
> modular so that folks
> can provide alternative implementations of the main functions 
> (tokenizing, fragmenting and scoring) if required.

Re: Highlighter package v2 RC1

Posted by Vladimir Yuryev <vy...@rambler.ru>.

Mark,

Many thanks for this news!

Vladimir.

On Thu, 8 Apr 2004 22:09:32 GMT
  markharw00d@yahoo.co.uk wrote:
>I've reworked the highlighter package to address some issues 
>(inability to pass fieldnames to analyzers,
>limiting tokenization of large docs) and have refactored it to be 
>more modular so that folks
>can provide alternative implementations of the main functions 
>(tokenizing, fragmenting and scoring) if required.