Posted to solr-user@lucene.apache.org by Brian Lucas <bl...@gmail.com> on 2006/09/19 19:29:12 UTC
strange highlighting behavior
I’m experiencing some unusual behavior when I perform a search with highlighting enabled.
I’ve set up “id” as an “sint” field and indexed it properly, but a search gives the following result:
<doc>
<float name="score">3.0647626</float>
<int name="group_id">2</int>
<int name="id">369845</int>
<int name="language_id">1</int>
<arr name="search_keywords">
<str>Microsoft Reorganizes</str>
</arr>
<str name="title">Microsoft Reorganizes</str>
</doc>
<doc>
<float name="score">3.0647626</float>
<int name="group_id">2</int>
<int name="id">369850</int>
<int name="language_id">1</int>
<arr name="search_keywords">
<str>Microsoft Moment</str>
</arr>
<str name="title">Microsoft Moment</str>
</doc>
…
<lst name="highlighting">
<lst name="€Zҵ">
<arr name="title">
<str><em>Microsoft</em> Reorganizes</str>
</arr>
</lst>
<lst name="€ZҺ">
<arr name="title">
<str><em>Microsoft</em> Moment</str>
</arr>
</lst>
<lst name="€#31;৳">
<arr name="title">
<str>NASCAR with <em>Microsoft</em></str>
</arr>
</lst>
</lst>
The unusual characters in lst name=“…” are what I can’t figure out, as they are DEFINITELY not the id. I’ve tried indexing id as “integer”, “sint”, and “string”, all with the same result.
Using Solr-9-18 and Tomcat 5.5.17.
Is there any way to see where it’s getting these strange names from? My understanding is that those should be the numeric IDs given above.
Brian
Re: strange highlighting behavior
Posted by Yonik Seeley <yo...@apache.org>.
On 9/19/06, Brian Lucas <bl...@gmail.com> wrote:
> Converting to 'integer' and deleting/reindexing fixed it. Can 'sint' be
> used for the id with highlighting, or does one need to use integer or string
> for that?
It should be usable (but I personally haven't tested that).
If it's not, it's a bug and will be fixed :-)
> Just trying to figure out if it's a bug with sint, or possibly
> due to the fact I could have changed sint to integer without deleting the
> data.
The latter would be my guess.
-Yonik
RE: strange highlighting behavior
Posted by Brian Lucas <bl...@gmail.com>.
Yonik, thanks for the tip.
Converting to 'integer' and deleting/reindexing fixed it. Can 'sint' be
used for the id with highlighting, or does one need to use integer or string
for that? Just trying to figure out if it's a bug with sint, or possibly
due to the fact I could have changed sint to integer without deleting the
data.
-B
Re: strange highlighting behavior
Posted by Yonik Seeley <yo...@apache.org>.
On 9/19/06, Yonik Seeley <yo...@apache.org> wrote:
> The fix would be to use
> FieldType.indexedToReadable() to convert the indexed form back to a
> readable form.
Oops, that should be storedToReadable since the id is obtained from
the stored fields, not from the index.
Hmmm, a quick look at the code suggests this is already being done:
String printId = searcher.getSchema().printableUniqueKey(doc);
fragments.add(printId == null ? null : printId, docSummaries);
What you are seeing may be due to indexing documents with one version
of the schema and viewing them with another. Try deleting the
solr/data/index directory and then reindexing everything.
-Yonik
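To illustrate the schema-mismatch explanation above, here is a minimal sketch, not Solr's actual code. The interface FieldTypeSketch, the constants PLAIN and SORTABLE_INT, and the 3-char (8 + 12 + 12 bit) stored layout are all assumptions made for illustration. It shows how a value stored under a sortable-int type reads back correctly through that type, but comes out as raw packed characters when read through a type that treats the stored form as already readable.

```java
public class SchemaMismatchSketch {
    // Hypothetical stand-in for a schema field type; not Solr's real class.
    public interface FieldTypeSketch {
        String storedToReadable(String stored);
    }

    // Plain types ("integer", "string"): the stored form is already readable.
    public static final FieldTypeSketch PLAIN = stored -> stored;

    // Sortable int: the stored form is assumed to be 3 packed chars
    // (8 + 12 + 12 bits) with the sign bit flipped.
    public static final FieldTypeSketch SORTABLE_INT = stored -> {
        int v = stored.charAt(0) << 24;
        v |= stored.charAt(1) << 12;
        v |= stored.charAt(2);
        return Integer.toString(v - Integer.MIN_VALUE); // undo the sign-bit flip
    };

    public static void main(String[] args) {
        // A value written while the schema declared the field as sortable int.
        String stored = "\u0080Z\u04B5";

        // Read back with the matching type: the numeric id is recovered.
        System.out.println(SORTABLE_INT.storedToReadable(stored)); // 369845

        // Read back after the schema was switched to a plain type without
        // reindexing: the packed characters pass through untouched.
        System.out.println(PLAIN.storedToReadable(stored).equals(stored)); // true
    }
}
```

Under these assumptions, changing the schema alone does not rewrite what is already stored, which is consistent with the advice to delete the index directory and reindex.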
Re: strange highlighting behavior
Posted by Yonik Seeley <yo...@apache.org>.
On 9/19/06, Brian Lucas <bl...@gmail.com> wrote:
> The unusual characters on lst name="…" are what I can't figure out, as it
> DEFINITELY is not the id. I've tried indexed id with "integer", "sint", and
> "string" all with the same result.
Yes, looks like you hit a bug where you are seeing the "indexed" form
of sint (which is more of a binary format that allows terms to be
ordered in numeric order). The fix would be to use
FieldType.indexedToReadable() to convert the indexed form back to a
readable form.
It should have worked with "integer" or "string" since the indexed and
readable forms are identical... I suspect the old documents with an
sint ID still exist in your index and that is what you are seeing.
-Yonik
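As a rough illustration of what an "indexed" form that sorts in numeric order can look like, here is a minimal sketch. The class name, the encode helper, and the 3-char (8 + 12 + 12 bit) layout are assumptions for illustration and are not presented as Solr's implementation. The idea is to flip the sign bit and pack the int into characters so that plain string comparison matches integer comparison.

```java
public class SortableIntSketch {
    // Hypothetical helper: pack an int into 3 chars (8 + 12 + 12 bits) so that
    // comparing the strings gives the same order as comparing the ints.
    public static String encode(int value) {
        // Adding MIN_VALUE flips the sign bit, so negatives sort before positives.
        int shifted = value + Integer.MIN_VALUE;
        char[] out = {
            (char) (shifted >>> 24),
            (char) ((shifted >>> 12) & 0x0fff),
            (char) (shifted & 0x0fff)
        };
        return new String(out);
    }

    public static void main(String[] args) {
        String a = encode(369845);
        String b = encode(369850);

        // String order follows numeric order, negatives included.
        System.out.println(a.compareTo(b) < 0);                    // true
        System.out.println(encode(-5).compareTo(encode(3)) < 0);   // true

        // The encoded id is not human-readable; print its code points instead.
        for (char c : a.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);                 // U+0080 U+005A U+04B5
        }
        System.out.println();
    }
}
```

With this assumed layout, id 369845 becomes the three characters U+0080, "Z", and U+04B5, which resembles the first lst name in the original post. Printing such a string directly, without converting it back to a readable form, would give output of that kind.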