You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Donna L Gresh <gr...@us.ibm.com> on 2007/08/15 19:21:42 UTC
Question about highlighting returning nothing
I'm working on refining my stopwords by looking at the highest scoring
document returned for each search, and using the highlighter to show which
terms were significant in choosing that document. This has been extremely
helpful in improving my searches. I've noticed though that sometimes the
highlighter returns nothing, even though there is a non-zero score for the
match. Could someone explain why this is so? A recent discussion I found
spoke about the usefulness of returning, say, the first bit of text when
this happens, but there wasn't any discussion of *why* this occurs--
Thanks
Donna
Re: Question about highlighting returning nothing
Posted by Lukas Vlcek <lu...@gmail.com>.
Donna,
Now I understand what you are saying (seems that I had PBCAK as well ;-)
As for your last question: ...under what conditions would the highlighter
return nothing? Only if no terms matched?
I remember that I found that highlighter can return null or empty string in
different situations. I think it depends on the Analyzer used or something
like that...
BR
Lukas
On 8/16/07, Donna L Gresh <gr...@us.ibm.com> wrote:
>
> Actually I don't think I'm having trouble-- as I mentioned,
> my text is *not* stored, so to do highlighting I retrieve the
> text from the database, apply the appropriate analyzer,
> and do the highlighting. It seems to be working exactly as
> it should. My problem was that in a few cases, the document
> has been removed from the database (but not from the index)
> so when I queried the database using the identifier for the "best
> hit" from the index, nothing
> was being returned. Passing "nothing" to the highlighter
> resulted in, of course, nothing, so I was getting no highlighted
> text. Once I updated my index to be in synch with the database,
> I no longer had any empty returns from the highlighter.
>
> Donna L. Gresh
> Services Research, Mathematical Sciences Department
> IBM T.J. Watson Research Center
> (914) 945-2472
> http://www.research.ibm.com/people/g/donnagresh
> gresh@us.ibm.com
>
>
>
>
> "Lukas Vlcek" <lu...@gmail.com>
> 08/15/2007 03:49 PM
> Please respond to
> java-user@lucene.apache.org
>
>
> To
> java-user@lucene.apache.org
> cc
>
> Subject
> Re: Question about highlighting returning nothing
>
>
>
>
>
>
> Donna,
>
> I have been investigation highlighters in Lucene recently a bit. The
> humble
> experience I've learned so far is that highlighting is completely
> different
> task from indexing/searching tandem. This simple fact is not obvious to a
> lot of people. In your particular casue it would be helpful if you can
> post
> more technical details about your system settings. Not only it is
> important
> if the field to be highlighted is stored but also it is important if you
> allow for query rewrite and what king of queries you are using (Prefix,
> Wildcard ... etc).
>
> Just my 2 cents.
>
> Lukas
>
> On 8/15/07, Donna L Gresh <gr...@us.ibm.com> wrote:
> >
> > Well, in my case the highlighting was returning nothing because of (my
> > favorite acronym) PBCAK--
> >
> > I don't store the text in the index, so I have to retrieve it separately
> > (from a database) for the highlighting, and my database was not in sync
> > with the index, so in a few cases the document in the index had been
> > deleted from the database--thus a score, but no document text.
> >
> > But I guess my original question remains; under what conditions would
> the
> > highlighter return nothing? Only if no terms matched?
> >
> > Donna
> >
>
>
Re: Question about highlighting returning nothing
Posted by Donna L Gresh <gr...@us.ibm.com>.
Actually I don't think I'm having trouble-- as I mentioned,
my text is *not* stored, so to do highlighting I retrieve the
text from the database, apply the appropriate analyzer,
and do the highlighting. It seems to be working exactly as
it should. My problem was that in a few cases, the document
has been removed from the database (but not from the index)
so when I queried the database using the identifier for the "best
hit" from the index, nothing
was being returned. Passing "nothing" to the highlighter
resulted in, of course, nothing, so I was getting no highlighted
text. Once I updated my index to be in synch with the database,
I no longer had any empty returns from the highlighter.
Donna L. Gresh
Services Research, Mathematical Sciences Department
IBM T.J. Watson Research Center
(914) 945-2472
http://www.research.ibm.com/people/g/donnagresh
gresh@us.ibm.com
"Lukas Vlcek" <lu...@gmail.com>
08/15/2007 03:49 PM
Please respond to
java-user@lucene.apache.org
To
java-user@lucene.apache.org
cc
Subject
Re: Question about highlighting returning nothing
Donna,
I have been investigation highlighters in Lucene recently a bit. The
humble
experience I've learned so far is that highlighting is completely
different
task from indexing/searching tandem. This simple fact is not obvious to a
lot of people. In your particular casue it would be helpful if you can
post
more technical details about your system settings. Not only it is
important
if the field to be highlighted is stored but also it is important if you
allow for query rewrite and what king of queries you are using (Prefix,
Wildcard ... etc).
Just my 2 cents.
Lukas
On 8/15/07, Donna L Gresh <gr...@us.ibm.com> wrote:
>
> Well, in my case the highlighting was returning nothing because of (my
> favorite acronym) PBCAK--
>
> I don't store the text in the index, so I have to retrieve it separately
> (from a database) for the highlighting, and my database was not in sync
> with the index, so in a few cases the document in the index had been
> deleted from the database--thus a score, but no document text.
>
> But I guess my original question remains; under what conditions would
the
> highlighter return nothing? Only if no terms matched?
>
> Donna
>
Re: Question about highlighting returning nothing
Posted by Lukas Vlcek <lu...@gmail.com>.
Donna,
I have been investigation highlighters in Lucene recently a bit. The humble
experience I've learned so far is that highlighting is completely different
task from indexing/searching tandem. This simple fact is not obvious to a
lot of people. In your particular casue it would be helpful if you can post
more technical details about your system settings. Not only it is important
if the field to be highlighted is stored but also it is important if you
allow for query rewrite and what king of queries you are using (Prefix,
Wildcard ... etc).
Just my 2 cents.
Lukas
On 8/15/07, Donna L Gresh <gr...@us.ibm.com> wrote:
>
> Well, in my case the highlighting was returning nothing because of (my
> favorite acronym) PBCAK--
>
> I don't store the text in the index, so I have to retrieve it separately
> (from a database) for the highlighting, and my database was not in sync
> with the index, so in a few cases the document in the index had been
> deleted from the database--thus a score, but no document text.
>
> But I guess my original question remains; under what conditions would the
> highlighter return nothing? Only if no terms matched?
>
> Donna
>
Re: Question about highlighting returning nothing
Posted by Donna L Gresh <gr...@us.ibm.com>.
Well, in my case the highlighting was returning nothing because of (my
favorite acronym) PBCAK--
I don't store the text in the index, so I have to retrieve it separately
(from a database) for the highlighting, and my database was not in sync
with the index, so in a few cases the document in the index had been
deleted from the database--thus a score, but no document text.
But I guess my original question remains; under what conditions would the
highlighter return nothing? Only if no terms matched?
Donna