Posted to solr-user@lucene.apache.org by Dan Loewenherz <so...@dlo.me> on 2011/01/11 00:44:58 UTC

Solr highlighting is botching output

Hi all,

I'm implementing Solr for a course and book search service for college
students, and I'm running into some issues with the highlighting plugin.
After some tinkering and searching both Google and the list archives without
finding anything, I thought I would ask whether anyone else has run into this
problem and, if not, what I might be doing to cause it.

Basically, the issue is that whenever I turn on highlighting for a certain
field, I get either (1) inconsistent highlights or (2) bizarre highlight
output for some of the results. A few of the results look correct.

Here's my solrconfig.xml: http://pastie.org/private/iz3fd77innxb5r2v63zpa

Broken output: http://pastie.org/private/pyptpektckitp2piqvcgw

As you can see, I searched for "history". In some of the results where the
query term is highlighted, the name field contains strings such as
"<span>History</span><span>History</span><span>History</span>" instead of a
single highlighted occurrence.
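
To make the symptom concrete, here is a minimal Python sketch that flags (and
optionally collapses) back-to-back identical highlight tags in a returned
snippet. It assumes matches are wrapped in plain <span> tags as in the example
above; the function names are hypothetical. It only post-processes the output
and does not address whatever in the Solr configuration causes the repetition.

```python
import re

# Two or more adjacent copies of the same <span>...</span> highlight,
# optionally separated by whitespace. Assumes plain <span> tags with no
# attributes, as in the broken output described above.
REPEATED_SPAN = re.compile(r"(<span>[^<]*</span>)(?:\s*\1)+")


def has_repeated_highlights(snippet: str) -> bool:
    """Return True if the snippet contains a highlight repeated back to back."""
    return REPEATED_SPAN.search(snippet) is not None


def collapse_repeated_highlights(snippet: str) -> str:
    """Replace each run of identical adjacent highlights with a single one."""
    return REPEATED_SPAN.sub(r"\1", snippet)


if __name__ == "__main__":
    broken = "<span>History</span><span>History</span><span>History</span>"
    print(has_repeated_highlights(broken))
    print(collapse_repeated_highlights(broken))
```

Running has_repeated_highlights over each highlighted name field would at
least show which documents are affected and which are not.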

I don't know enough to understand why Solr would treat "African American
History: From Emancipation to the Present" differently from "African
American Women's History", other than that one is longer than the other, or
why it would double or quadruple the highlighted term. I tried to work out
which configuration option could change this, to no avail.

If anyone has any input, I would be very grateful. Thank you!

Dan