Posted to java-user@lucene.apache.org by lucuser4851 <lu...@log1.net> on 2005/02/17 08:37:38 UTC

Using the highlighter from the sandbox with a prefix query.

Dear All,
 We have been using the highlighter from the lucene sandbox, which works
very nicely most of the time. However when we try and use it with a
prefix query (which is what you get having parsed a wild-card query), it
doesn't return any highlighted sections. Has anyone else experienced
this problem, or found a way around it?

Thanks a lot for your suggestions!!



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Handling Synonyms

Posted by David Spencer <da...@tropo.com>.
Luke Shannon wrote:

> Hello;
> 
> Does anyone see a problem with the following approach?

No, no problem with it and it's in fact what my "Wordnet Query 
Expansion" sandbox module does.

The nice thing about Lucene is that you at least have the option of doing 
things the other way: you can write a custom Analyzer that puts all 
synonyms at the same token position (a position increment of 0), so they 
appear to be in the same place in the token stream. Thinking about it, 
the Analyzer approach also lets users search for phrases that match a 
synonym. Using your example below, the text "bright red engine" would be 
matched by either phrase "bright red" or "bright colour". Doing the query 
expansion is trickier if you allow phrases.
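
To illustrate why stacking synonyms in one place makes phrase matching work, here is a minimal, self-contained sketch. It is a toy model, not Lucene's Analyzer or TokenFilter API: the class SynonymPositions, its Token type, and both methods are hypothetical names made up for this example, and it assumes position increments are only ever 0 or 1.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of index-time synonym injection (hypothetical, not a Lucene class). */
final class SynonymPositions {

    /** A term plus its distance from the previous token (0 = same position). */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Emits each word at a new position and each of its synonyms at that same position. */
    static List<Token> inject(List<String> words, Map<String, List<String>> synonyms) {
        List<Token> out = new ArrayList<>();
        for (String word : words) {
            out.add(new Token(word, 1));
            for (String synonym : synonyms.getOrDefault(word, List.of())) {
                out.add(new Token(synonym, 0)); // stacked on top of the original word
            }
        }
        return out;
    }

    /** True if the phrase terms can be found at consecutive positions. */
    static boolean matchesPhrase(List<Token> tokens, List<String> phrase) {
        // Group the terms by position; increment 0 means "same slot as before".
        List<Set<String>> positions = new ArrayList<>();
        for (Token token : tokens) {
            if (token.positionIncrement > 0 || positions.isEmpty()) {
                positions.add(new HashSet<>());
            }
            positions.get(positions.size() - 1).add(token.term);
        }
        for (int start = 0; start + phrase.size() <= positions.size(); start++) {
            boolean allMatch = true;
            for (int i = 0; i < phrase.size() && allMatch; i++) {
                allMatch = positions.get(start + i).contains(phrase.get(i));
            }
            if (allMatch) {
                return true;
            }
        }
        return false;
    }
}
```

With "colour" registered as a synonym of "red", the tokens for "bright red engine" occupy three positions, and both "bright red" and "bright colour" match as phrases because "red" and "colour" share the second position.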

> 
> For synonyms, rather than putting them in the index, I put the original term
> and all the synonyms in the query.
> 
> Every time I create a query, I check if the term has any synonyms. If it
> does, I create a BooleanQuery OR'ing one Query object for each synonym.
> 
> So if I have a synonym list:
> 
> red = colour, primary, stop
> 
> And someone wants to search the desc field for red, I would end up with
> something like:
> 
> ( (desc:*red*) (desc:*colour*) (desc:*stop*) ).

I don't like that bit about substring terms, but if it's right for you 
ok - if you insist on loosening things I'd consider fuzzy terms 
(desc:red~ ...etc).



> 
> Now the synonyms wouldn't be in the index; the Query would account for all
> the possible synonym terms.
> 
> Luke


Handling Synonyms

Posted by Luke Shannon <ls...@futurebrand.com>.
Hello;

Does anyone see a problem with the following approach?

For synonyms, rather than putting them in the index, I put the original term
and all the synonyms in the query.

Every time I create a query, I check if the term has any synonyms. If it
does, I create a BooleanQuery OR'ing one Query object for each synonym.

So if I have a synonym list:

red = colour, primary, stop

And someone wants to search the desc field for red, I would end up with
something like:

( (desc:*red*) (desc:*colour*) (desc:*stop*) ).

Now the synonyms wouldn't be in the index; the Query would account for all
the possible synonym terms.
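
A minimal sketch of this query-time expansion, assuming a plain in-memory synonym map and building a query string in Lucene's query syntax rather than calling any Lucene class. SynonymQueryExpander and expand are hypothetical names, and the sketch uses exact terms instead of the *term* substring form shown above. Terms containing query-syntax special characters would need escaping, which is not handled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that ORs a term with its synonyms in one field. */
final class SynonymQueryExpander {

    /** Returns "field:term" alone, or a parenthesised OR group when synonyms exist. */
    static String expand(String field, String term, Map<String, List<String>> synonyms) {
        List<String> clauses = new ArrayList<>();
        clauses.add(field + ":" + term); // the original term always comes first
        for (String synonym : synonyms.getOrDefault(term, List.of())) {
            clauses.add(field + ":" + synonym);
        }
        if (clauses.size() == 1) {
            return clauses.get(0);
        }
        return "(" + String.join(" OR ", clauses) + ")";
    }
}
```

For the list red = colour, primary, stop, calling expand("desc", "red", synonyms) yields (desc:red OR desc:colour OR desc:primary OR desc:stop), while a term without synonyms is passed through as a single clause.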

Luke





RE: Using the highlighter from the sandbox with a prefix query.

Posted by Michael Celona <mc...@criticalmention.com>.
Thank you.... this helped a lot...

Michael Celona



Re: Using the highlighter from the sandbox with a prefix query.

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 21, 2005, at 10:53 AM, Michael Celona wrote:

> That's the only stack I get.  One thing to mention is that I am using a
> MultiSearcher to rewrite the queries. I tried...
>
> query = searcher_last.rewrite( query );
> query = searcher_cur.rewrite( query );
>
> using IndexSearcher and I don't get an error... However, I am not able to
> highlight wildcard queries.

I use Highlighter for lucenebook.com and have two indexes that I search 
with MultiSearcher.  Here's how I highlight:

         IndexReader reader = readers[indexIndex];
         QueryScorer scorer = new QueryScorer(query.rewrite(reader));
         SimpleHTMLFormatter formatter =
             new SimpleHTMLFormatter("<span class=\"highlight\">",
                 "</span>");
         Highlighter highlighter = new Highlighter(formatter, scorer);

I get the appropriate IndexReader for the document being highlighted.  
You can get the index _index_ this way:

         int indexIndex = searcher.subSearcher(hits.id(position));
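
For readers wondering how a global document id can be mapped back to the index it came from, here is a self-contained sketch of one plausible scheme. It assumes the combined searcher numbers documents by concatenating its sub-indexes and that every sub-index holds at least one document. SubIndexLookup and its methods are hypothetical names for illustration, not Lucene code.

```java
import java.util.Arrays;

/** Hypothetical model of mapping a global document id to a sub-index. */
final class SubIndexLookup {

    /** starts[i] is the first global document id that belongs to sub-index i. */
    static int[] starts(int[] docCounts) {
        int[] starts = new int[docCounts.length];
        int next = 0;
        for (int i = 0; i < docCounts.length; i++) {
            starts[i] = next;
            next += docCounts[i];
        }
        return starts;
    }

    /** Finds the sub-index whose id range contains globalDocId (must be >= 0). */
    static int subIndex(int globalDocId, int[] starts) {
        int pos = Arrays.binarySearch(starts, globalDocId);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the owning
        // sub-index is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }
}
```

With sub-indexes of 3, 5, and 2 documents the start offsets are 0, 3, and 8, so global ids 0 to 2 map to sub-index 0, ids 3 to 7 to sub-index 1, and ids 8 and 9 to sub-index 2. The resulting number is what selects the right reader for the rewrite.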

Hope this helps.

	Erik




RE: Using the highlighter from the sandbox with a prefix query.

Posted by mark harwood <ma...@yahoo.co.uk>.
> One thing to mention is that I am using a
> MultiSearcher to rewrite the queries. I tried...

Ah. I remember this got a little ugly. The highlighter
has a JUnit test that demonstrates highlighting fuzzy
queries when using a MultiSearcher. Take a look at
that.

I can't remember the ins and outs of the issues but I
know the code there still runs clean with the latest
versions.

Cheers
Mark.



	
	
		


RE: Using the highlighter from the sandbox with a prefix query.

Posted by Michael Celona <mc...@criticalmention.com>.
That's the only stack I get.  One thing to mention is that I am using a
MultiSearcher to rewrite the queries. I tried...

query = searcher_last.rewrite( query );
query = searcher_cur.rewrite( query );

using IndexSearcher and I don't get an error... However, I am not able to
highlight wildcard queries.

Michael 



Re: Using the highlighter from the sandbox with a prefix query.

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 21, 2005, at 10:20 AM, Michael Celona wrote:

> I am using
> 	query = searcher.rewrite( query );
>
> and it is throwing java.lang.UnsupportedOperationException .
>
> Am I able to use the searcher rewrite method like this?

What's the full stack trace?

	Erik




RE: Using the highlighter from the sandbox with a prefix query.

Posted by Michael Celona <mc...@criticalmention.com>.
I am using
	query = searcher.rewrite( query );

and it is throwing java.lang.UnsupportedOperationException .

Am I able to use the searcher rewrite method like this?

Thanks,
Michael



Re: Using the highlighter from the sandbox with a prefix query.

Posted by Daniel Naber <da...@t-online.de>.
On Thursday 17 February 2005 08:37, lucuser4851 wrote:

>  We have been using the highlighter from the lucene sandbox, which works
> very nicely most of the time. However when we try and use it with a
> prefix query (which is what you get having parsed a wild-card query), it
> doesn't return any highlighted sections. Has anyone else experienced
> this problem, or found a way around it?

You need to call rewrite() on the query before you pass it to the highlighter.
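
To see why the rewrite matters, consider a simplified model (not the Lucene API): a highlighter that marks only tokens equal to a term in the query. An unrewritten prefix query carries just the prefix, which equals no token; rewriting expands it against the index into the concrete terms. PrefixRewriteSketch, its methods, and the <b> markup are hypothetical and chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

/** Toy model of prefix expansion followed by exact-term highlighting. */
final class PrefixRewriteSketch {

    /** Expands a prefix into the index terms that start with it. */
    static List<String> rewritePrefix(String prefix, SortedSet<String> indexTerms) {
        List<String> expanded = new ArrayList<>();
        // In natural string order, all terms sharing the prefix form one
        // contiguous run that begins at the first term >= prefix.
        for (String term : indexTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            expanded.add(term);
        }
        return expanded;
    }

    /** Wraps each space-separated token that exactly equals a query term. */
    static String highlight(String text, List<String> queryTerms) {
        StringBuilder sb = new StringBuilder();
        for (String token : text.split(" ")) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(queryTerms.contains(token) ? "<b>" + token + "</b>" : token);
        }
        return sb.toString();
    }
}
```

Highlighting "lucene is lucid" with the raw prefix "luc" changes nothing, whereas highlighting with the expanded terms marks both "lucene" and "lucid". In this model, that difference is the reason the query has to be rewritten first.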

Regards
 Daniel



Re: Using the highlighter from the sandbox with a prefix query.

Posted by lucuser4851 <lu...@log1.net>.
Thanks very much Mark and Daniel. That solved the problem!!


On Thu, 2005-02-17 at 08:55 +0000, mark harwood wrote:
> See the highlighter's package.html for a description
> of how query.rewrite should be used to solve this.
> 
> Cheers,
> Mark
> 





Re: Using the highlighter from the sandbox with a prefix query.

Posted by mark harwood <ma...@yahoo.co.uk>.
See the highlighter's package.html for a description
of how query.rewrite should be used to solve this.

Cheers,
Mark


 --- lucuser4851 <lu...@log1.net> wrote: 
> Dear All,
>  We have been using the highlighter from the lucene
> sandbox, which works
> very nicely most of the time. However when we try
> and use it with a
> prefix query (which is what you get having parsed a
> wild-card query), it
> doesn't return any highlighted sections. Has anyone
> else experienced
> this problem, or found a way around it?
> 
> Thanks a lot for your suggestions!!