Posted to solr-user@lucene.apache.org by getagrip <ge...@web.de> on 2010/12/07 17:12:58 UTC

highlighting encoding issue

Hi,
when I query Solr (trunk), I get numeric character references instead
of regular UTF-8 characters for special characters in the highlighting
section. In the result section the same characters are presented
fine.

E.g., instead of the German umlaut ä I get &#228;

Example:

<arr name="attr_content">
<str>
Vielfachmessgerät
</str>
</arr>

<lst name="highlighting">
<arr name="attr_content">
<str>
<em>Vielfachmessger&#228;t</em>
</str>

Any hints are welcome.

Re: highlighting encoding issue

Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
(10/12/08 1:12), getagrip wrote:
> Hi,
> when I query solr (trunk) I get "numeric character references" instead of regular UTF-8 strings in
> case of special characters in the highlighting section, in the result section the characters are
> presented fine.
>
> e.g. instead of the German umlaut ä I get &#228;
>
> Example:
>
> <arr name="attr_content">
> <str>
> Vielfachmessgerät
> </str>
> </arr>
>
> <lst name="highlighting">
> <arr name="attr_content">
> <str>
> <em>Vielfachmessger&#228;t</em>
> </str>
>
> Any hints are welcome.

It may be due to the HtmlEncoder setting in solrconfig.xml:

<!-- Configure the standard encoder -->
<encoder name="html" class="org.apache.solr.highlight.HtmlEncoder" default="true"/>

Try removing that setting, or use DefaultEncoder (a "null" encoder that
leaves the text unescaped) instead of HtmlEncoder.
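
For example, the replacement entry might look roughly like this. It is
only a sketch: the name "default" is an arbitrary illustrative choice,
and it assumes org.apache.solr.highlight.DefaultEncoder is available in
your build and that the entry sits in the same place as the HtmlEncoder
line above.

<!-- Sketch only: the name "default" is illustrative, not required -->
<encoder name="default" class="org.apache.solr.highlight.DefaultEncoder" default="true"/>

Note that with a non-escaping encoder, any markup characters in the
stored text are returned as-is in the highlighting section, so escape
them on the client side if you render the snippets as HTML.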

Koji
-- 
http://www.rondhuit.com/en/