Posted to java-user@lucene.apache.org by Donna L Gresh <gr...@us.ibm.com> on 2007/08/15 19:21:42 UTC

Question about highlighting returning nothing

I'm working on refining my stopwords by looking at the highest scoring 
document returned for each search, and using the highlighter to show which 
terms were significant in choosing that document. This has been extremely 
helpful in improving my searches. I've noticed though that sometimes the 
highlighter returns nothing, even though there is a non-zero score for the 
match. Could someone explain why this is so? A recent discussion I found 
spoke about the usefulness of returning, say, the first bit of text when 
this happens, but there wasn't any discussion of *why* this occurs--
Thanks

Donna
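
The fallback mentioned in the question (showing the first bit of text
when the highlighter returns nothing) could be sketched roughly as
below. This is only an illustration: the class and method names are
hypothetical and not part of Lucene, and the highlighter result is
passed in as a plain string that is assumed to be null or empty when
nothing was highlighted.

```java
// Hypothetical helper, not part of Lucene: decides what to display for a hit.
final class SnippetFallback {

    /**
     * Returns the highlighted fragment when there is one. Otherwise returns
     * the first maxChars characters of the raw text, or "" if the text
     * itself is missing.
     */
    static String snippetOrLead(String highlighted, String text, int maxChars) {
        if (highlighted != null && !highlighted.isEmpty()) {
            return highlighted;
        }
        if (text == null) {
            return "";
        }
        return text.length() <= maxChars ? text : text.substring(0, maxChars);
    }

    public static void main(String[] args) {
        String text = "Lucene is a search library written in Java.";
        // Simulates the highlighter finding nothing (null result).
        System.out.println(snippetOrLead(null, text, 9));
        // Simulates a normal highlighted fragment.
        System.out.println(snippetOrLead("<B>Lucene</B> is a search", text, 9));
    }
}
```

Note that the fallback only hides the symptom; it does not say why the
highlighter came back empty.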


Re: Question about highlighting returning nothing

Posted by Lukas Vlcek <lu...@gmail.com>.
Donna,

Now I understand what you are saying (seems that I had PBCAK as well ;-)

As for your last question: ...under what conditions would the highlighter
return nothing? Only if no terms matched?

I remember finding that the highlighter can return null or an empty string
in different situations. I think it depends on the Analyzer used or
something like that...

BR
Lukas

On 8/16/07, Donna L Gresh <gr...@us.ibm.com> wrote:
>
> Actually I don't think I'm having trouble-- as I mentioned,
> my text is *not* stored, so to do highlighting I retrieve the
> text from the database, apply the appropriate analyzer,
> and do the highlighting. It seems to be working exactly as
> it should. My problem was that in a few cases, the document
> has been removed from the database (but not from the index)
> so when I queried the database using the identifier for the "best
> hit" from the index, nothing
> was being returned. Passing "nothing" to the highlighter
> resulted in, of course, nothing, so I was getting no highlighted
> text. Once I updated my index to be in synch with the database,
> I no longer had any empty returns from the highlighter.

Re: Question about highlighting returning nothing

Posted by Donna L Gresh <gr...@us.ibm.com>.
Actually I don't think I'm having trouble-- as I mentioned,
my text is *not* stored, so to do highlighting I retrieve the
text from the database, apply the appropriate analyzer,
and do the highlighting. It seems to be working exactly as
it should. My problem was that in a few cases, the document
had been removed from the database (but not from the index),
so when I queried the database using the identifier for the
"best hit" from the index, nothing was returned. Passing
"nothing" to the highlighter resulted in, of course, nothing,
so I was getting no highlighted text. Once I updated my index
to be in sync with the database, I no longer had any empty
returns from the highlighter.
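
For anyone hitting the same thing, here is a rough sketch of how the
two cases could be told apart: text missing from the database versus
text present but nothing highlighted. Everything in it is an
assumption made for illustration: the map stands in for the database
lookup, and the function stands in for the real highlighter call
(e.g. a wrapper around Lucene's Highlighter.getBestFragment, which as
far as I know returns null when no query terms are found in the text).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; none of these names are Lucene APIs.
final class HitHighlighter {

    enum Outcome { HIGHLIGHTED, NO_TERMS_MATCHED, TEXT_MISSING }

    static final class Result {
        final Outcome outcome;
        final String fragment; // null unless outcome is HIGHLIGHTED

        Result(Outcome outcome, String fragment) {
            this.outcome = outcome;
            this.fragment = fragment;
        }
    }

    /**
     * Looks up the text for a hit and highlights it, reporting separately
     * whether the text was missing or the highlighter found nothing.
     */
    static Result highlightHit(String docId,
                               Map<String, String> textById,
                               Function<String, String> highlighter) {
        String text = textById.get(docId);
        if (text == null || text.isEmpty()) {
            // Index and database are out of sync: a score, but no text.
            return new Result(Outcome.TEXT_MISSING, null);
        }
        String fragment = highlighter.apply(text);
        if (fragment == null || fragment.isEmpty()) {
            return new Result(Outcome.NO_TERMS_MATCHED, null);
        }
        return new Result(Outcome.HIGHLIGHTED, fragment);
    }

    public static void main(String[] args) {
        Map<String, String> database = new HashMap<>();
        database.put("doc1", "the highlighter returned a fragment");

        // Toy stand-in for the real highlighter: marks one fixed word.
        Function<String, String> toyHighlighter = text ->
                text.contains("highlighter")
                        ? text.replace("highlighter", "<B>highlighter</B>")
                        : null;

        System.out.println(highlightHit("doc1", database, toyHighlighter).outcome);
        System.out.println(highlightHit("doc2", database, toyHighlighter).outcome);
    }
}
```

Logging the TEXT_MISSING case separately would have pointed straight
at the out-of-sync index instead of at the highlighter.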

Donna L. Gresh
Services Research, Mathematical Sciences Department
IBM T.J. Watson Research Center
(914) 945-2472
http://www.research.ibm.com/people/g/donnagresh
gresh@us.ibm.com




"Lukas Vlcek" <lu...@gmail.com> 
wrote on 08/15/2007 03:49 PM:
> Donna,
>
> I have been investigating highlighters in Lucene a bit recently. The humble
> experience I've gained so far is that highlighting is a completely different
> task from the indexing/searching tandem. This simple fact is not obvious to a
> lot of people. In your particular case it would be helpful if you could post
> more technical details about your system settings. It matters not only
> whether the field to be highlighted is stored, but also whether you allow
> for query rewrite and what kind of queries you are using (Prefix,
> Wildcard ... etc).
>
> Just my 2 cents.
>
> Lukas


Re: Question about highlighting returning nothing

Posted by Lukas Vlcek <lu...@gmail.com>.
Donna,

I have been investigating highlighters in Lucene a bit recently. The humble
experience I've gained so far is that highlighting is a completely different
task from the indexing/searching tandem. This simple fact is not obvious to a
lot of people. In your particular case it would be helpful if you could post
more technical details about your system settings. It matters not only
whether the field to be highlighted is stored, but also whether you allow
for query rewrite and what kind of queries you are using (Prefix,
Wildcard ... etc).

Just my 2 cents.

Lukas

On 8/15/07, Donna L Gresh <gr...@us.ibm.com> wrote:
>
> Well, in my case the highlighting was returning nothing because of (my
> favorite acronym) PBCAK--
>
> I don't store the text in the index, so I have to retrieve it separately
> (from a database) for the highlighting, and my database was not in sync
> with the index, so in a few cases the document in the index had been
> deleted from the database--thus a score, but no document text.
>
> But I guess my original question remains; under what conditions would the
> highlighter return nothing? Only if no terms matched?
>
> Donna
>

Re: Question about highlighting returning nothing

Posted by Donna L Gresh <gr...@us.ibm.com>.
Well, in my case the highlighting was returning nothing because of (my 
favorite acronym) PBCAK--

I don't store the text in the index, so I have to retrieve it separately 
(from a database) for the highlighting, and my database was not in sync 
with the index, so in a few cases the document in the index had been 
deleted from the database--thus a score, but no document text.

But I guess my original question remains; under what conditions would the 
highlighter return nothing? Only if no terms matched?

Donna