Posted to solr-user@lucene.apache.org by Darren Govoni <da...@ontrenet.com> on 2009/08/10 16:19:30 UTC

UTF-8 query support?

Hi,
  I tried to query my text field with a UTF-8 string that was in the
indexed document, but it returned nothing.

e.g.
http://192.168.2.10:8081/solr4/select/?q=%E5%BE%93%E6%9D%A5%E9%80%9A%E3%82%8A&version=2.2&start=0&rows=10&indent=on

The result page showed a garbled query string (wrong encoding).
<str name="q">従来通り</str>

How do I set UTF-8 encoding so that Lucene, which supports UTF-8
queries, can find the documents?

thanks!
Darren



Re: UTF-8 query support?

Posted by Darren Govoni <da...@ontrenet.com>.
Thank you! I am using Tomcat and will give it a try.

On Mon, 2009-08-10 at 16:31 +0200, Mats Lindh wrote:
> On Mon, Aug 10, 2009 at 4:19 PM, Darren Govoni<da...@ontrenet.com> wrote:
> > How do I set UTF-8 encoding so lucene can find the documents since it
> > supports UTF-8 queries?
> 
> This depends on the app server you're using. I'm guessing Tomcat (as
> that's where I had the same issue), and you can fix this by enabling
> UTF-8 encoded query strings in Tomcat itself:
> 
> http://wiki.apache.org/solr/SolrTomcat#head-20147ee4d9dd5ca83ed264898280ab60457847c4
> 
> --mats


Re: UTF-8 query support?

Posted by Mats Lindh <ma...@gmail.com>.
On Mon, Aug 10, 2009 at 4:19 PM, Darren Govoni<da...@ontrenet.com> wrote:
> How do I set UTF-8 encoding so lucene can find the documents since it
> supports UTF-8 queries?

This depends on the app server you're using. I'm guessing Tomcat (as
that's where I had the same issue), and you can fix this by enabling
UTF-8 encoded query strings in Tomcat itself:

http://wiki.apache.org/solr/SolrTomcat#head-20147ee4d9dd5ca83ed264898280ab60457847c4

--mats
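
For readers wondering why the query comes back garbled: the percent-encoded bytes in the URL are UTF-8, and a container that decodes them as ISO-8859-1 (the default for query strings in Tomcat releases of that era) turns every byte into its own character. The setting the wiki page describes is, as far as I know, the URIEncoding="UTF-8" attribute on the Connector element in Tomcat's server.xml. Below is a minimal Python sketch of the mismatch, not of anything Solr or Tomcat runs internally; the variable names are made up for illustration.

```python
from urllib.parse import quote, unquote

# Percent-encoded query taken from the URL in the original post.
encoded = "%E5%BE%93%E6%9D%A5%E9%80%9A%E3%82%8A"

# Decoding the bytes as UTF-8 recovers the four-character query.
correct = unquote(encoded, encoding="utf-8")

# Decoding the same bytes as ISO-8859-1 yields one character per byte,
# which is the kind of garbling the result page showed.
garbled = unquote(encoded, encoding="iso-8859-1")

print(len(correct), "characters when decoded as UTF-8")
print(len(garbled), "characters when decoded as ISO-8859-1")

# The damage is reversible here because ISO-8859-1 maps every byte to a
# character: re-encoding and decoding as UTF-8 restores the query.
repaired = garbled.encode("iso-8859-1").decode("utf-8")
print(repaired == correct)

# Encoding the recovered query as UTF-8 reproduces the original URL form.
print(quote(correct, encoding="utf-8") == encoded)
```

The index holds the four-character term, so a query made of twelve unrelated characters matches nothing, which would explain the empty result.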

Re: UTF-8 query support?

Posted by Yonik Seeley <yo...@lucidimagination.com>.
Your URL suggests you set up your own servlet container; that's
probably the issue.
If you're using Tomcat, see http://wiki.apache.org/solr/SolrTomcat
Test out your config with example/exampledocs/test_utf8.sh

-Yonik
http://www.lucidimagination.com
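
A rough way to check the same thing by hand is to send a UTF-8 query and compare it with the q parameter echoed back in the response. The Python sketch below is not a port of test_utf8.sh; the function names are hypothetical, the base URL is the one from the original post, and the sample response is a hand-written stand-in that assumes the echoed query appears in a <str name="q"> element as shown in the post. Fetching the real response over HTTP is left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote, urlencode


def build_select_url(base_url, query):
    """Build a select URL with the query percent-encoded as UTF-8."""
    params = {"q": query, "indent": "on"}
    return base_url.rstrip("/") + "/select/?" + urlencode(params, encoding="utf-8")


def echoed_query(response_xml):
    """Return the text of the first <str name="q"> element, or None."""
    root = ET.fromstring(response_xml)
    for node in root.iter("str"):
        if node.get("name") == "q":
            return node.text
    return None


def query_round_trips(query, response_xml):
    """True if the server echoed the query back unchanged."""
    return echoed_query(response_xml) == query


# The query from the original post, decoded from its percent-encoded form.
query = unquote("%E5%BE%93%E6%9D%A5%E9%80%9A%E3%82%8A", encoding="utf-8")
url = build_select_url("http://192.168.2.10:8081/solr4", query)
print(url)

# Hand-written stand-in for a response from a correctly configured server.
sample_response = (
    '<response><lst name="responseHeader"><lst name="params">'
    '<str name="q">' + query + "</str>"
    "</lst></lst></response>"
)
print(query_round_trips(query, sample_response))
```

If the echoed value differs from what was sent, the container is most likely decoding the query string with the wrong charset.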



On Mon, Aug 10, 2009 at 10:19 AM, Darren Govoni<da...@ontrenet.com> wrote:
> Hi,
>  I tried to query my text field with a UTF-8 string that was in the
> indexed document, but it returned nothing.
>
> e.g.
> http://192.168.2.10:8081/solr4/select/?q=%E5%BE%93%E6%9D%A5%E9%80%9A%E3%82%8A&version=2.2&start=0&rows=10&indent=on
>
> The result page showed a garbled query string (wrong encoding).
> <str name="q">å¾“æ ¥é€šã‚Š</str>
>
> How do I set UTF-8 encoding so lucene can find the documents since it
> supports UTF-8 queries?
>
> thanks!
> Darren