Posted to solr-user@lucene.apache.org by Sascha Szott <sz...@zib.de> on 2009/11/28 00:30:05 UTC

[Solved] Re: VelocityResponseWriter/Solritas character encoding issue

Hi Erik,

I've finally solved the problem. Unfortunately, the parameter 
v.contentType was not described in the Solr wiki (I've fixed that now). 
The point is, you must specify (in your solrconfig.xml)

        <str name="v.contentType">text/xml;charset=UTF-8</str>

in order to receive HTML that is correctly encoded as UTF-8. That's it!
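
For anyone searching the archive later, here is a rough sketch of where that line might sit in solrconfig.xml. The handler name "/itas" and the other defaults shown are assumptions for illustration only (your handler may be named and configured differently); the v.contentType line is the only part taken from the fix above.

```xml
<!-- Sketch only: the handler name "/itas" and the wt/v.template defaults
     are assumptions; adapt them to your own setup. -->
<requestHandler name="/itas" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- route responses through the Velocity response writer -->
    <str name="wt">velocity</str>
    <!-- browse.vm is the main template mentioned in my original question -->
    <str name="v.template">browse</str>
    <!-- the parameter that fixed the encoding problem -->
    <str name="v.contentType">text/xml;charset=UTF-8</str>
  </lst>
</requestHandler>
```

The parameter can presumably also be passed per request instead of as a handler default, but I have only tried the solrconfig.xml variant.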

Best,
Sascha

Erik Hatcher wrote:
> Sascha,
> 
> Can you give me a test document that causes an issue?  (maybe send me a 
> Solr XML document in private e-mail).   I'll see what I can do once I 
> can see the issue first hand.
> 
>     Erik
> 
> 
> On Nov 18, 2009, at 2:48 PM, Sascha Szott wrote:
> 
>> Hi,
>>
>> I've played around with Solr's VelocityResponseWriter (which is indeed 
>> a very useful feature for rapid prototyping). I noticed that 
>> Velocity uses ISO-8859-1 as its default character encoding. I've 
>> changed this setting to UTF-8 in my velocity.properties file (inside 
>> the conf directory), i.e.,
>>
>>   input.encoding=UTF-8
>>   output.encoding=UTF-8
>>
>> and checked that the settings were successfully loaded.
>>
>> Within the main Velocity template, browse.vm, the character encoding 
>> is set to UTF-8 as well, i.e.,
>>
>>   <meta http-equiv="content-type" content="text/html; charset=UTF-8"/>
>>
>> After starting Solr (which is deployed in a Tomcat 6 server on an 
>> Ubuntu machine), I ran into some character encoding problems.
>>
>> Due to the change of input.encoding to UTF-8, no problems occur when 
>> non-ASCII characters are present in the query string, e.g. German 
>> umlauts. But unfortunately, something is wrong with the encoding of 
>> characters in the HTML page that is generated by 
>> VelocityResponseWriter. The non-ASCII characters aren't displayed 
>> properly (for example, Firefox shows a black diamond with a white 
>> question mark). If I manually set the encoding to ISO-8859-1, the 
>> non-ASCII characters are displayed correctly. Does anybody have a clue?
>>
>> Thanks in advance,
>> Sascha
>>
>>