Posted to solr-user@lucene.apache.org by Gora Mohanty <go...@mimirtech.com> on 2012/01/31 10:06:48 UTC

Re: solr utf8 for words like compagnieën?

On Tue, Jan 31, 2012 at 1:50 PM, RT <rw...@gmail.com> wrote:
> Hi,
>
> I am having a bit of trouble getting words with characters such as:
>
> ė, į, ų etc into SOLR.
>
> Programming in C++ (Using Qt's QString) I am wondering what conversion to
> apply before compiling words with such letters into the solrquery.
>
> Is UTF8 the correct encoding?

UTF8 should be fine, though Latin1 will also work here.
How are you getting the UTF8 for these strings? Have
you looked at
http://developer.qt.nokia.com/doc/qt-4.8/QString.html#converting-between-8-bit-strings-and-unicode-strings

Regards,
Gora
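
For illustration, here is a minimal sketch of the UTF-8 route in plain standard C++, so that it stays self-contained. In Qt the bytes would normally come from QString::toUtf8(); the helper names encodeUtf8, percentEncode and solrQueryParam, and the "q=" parameter layout, are assumptions made for this example and are not taken from the thread.

```cpp
#include <cassert>
#include <string>

// Encode a single Unicode code point as UTF-8 bytes.
std::string encodeUtf8(char32_t cp) {
    assert(cp <= 0x10FFFF);  // outside this range there is no valid encoding
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Percent-encode every byte that is not an RFC 3986 "unreserved" character.
std::string percentEncode(const std::string& bytes) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : bytes) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: build the "q=..." part of a query string.
std::string solrQueryParam(const std::u32string& word) {
    std::string utf8;
    for (char32_t cp : word) {
        utf8 += encodeUtf8(cp);
    }
    return "q=" + percentEncode(utf8);
}
```

Calling solrQueryParam(U"compagnie\u00EBn") gives q=compagnie%C3%ABn, since ë (U+00EB) is the two bytes C3 AB in UTF-8. Whether the server then decodes those bytes as UTF-8 depends on how the servlet container is configured, which this sketch does not cover.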

Re: solr utf8 for words like compagnieën?

Posted by RT <rw...@gmail.com>.
Hi,

Both the Latin1 and the UTF8 conversions yield the same negative result.

I get compagnieën back from SOLR as:

compagnieën

I post with toLatin1() and read the result from SOLR back into a QString with
QString::fromLatin1().

Rather disappointing. Any ideas as to what I may be doing wrong are very
welcome at this stage.

Thanks,

Roland.
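
Two effects could plausibly produce this kind of symptom; the thread does not confirm either, so treat the following as a sketch under assumptions. First, Latin-1 only covers U+0000 to U+00FF, so ë (U+00EB) fits but ė (U+0117), į (U+012F) and ų (U+0173) do not. Second, if the server answers in UTF-8 and the bytes are decoded as Latin-1, each multi-byte character turns into two wrong characters. The helper names toLatin1Lossy and decodeLatin1 are hypothetical, and the '?' substitution is only one possible behaviour for unrepresentable characters.

```cpp
#include <string>

// Convert code points to Latin-1 bytes. Anything above U+00FF cannot be
// represented; this sketch substitutes '?' (an assumption for illustration).
std::string toLatin1Lossy(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        out += (cp <= 0xFF) ? static_cast<char>(cp) : '?';
    }
    return out;
}

// Decode bytes as Latin-1: every byte maps directly to the code point
// with the same numeric value.
std::u32string decodeLatin1(const std::string& bytes) {
    std::u32string out;
    for (unsigned char c : bytes) {
        out += static_cast<char32_t>(c);
    }
    return out;
}
```

With these helpers, toLatin1Lossy(U"\u0117\u012F\u0173") yields "???", so the original letters are lost before anything is sent. Decoding the UTF-8 bytes of ë, decodeLatin1("\xC3\xAB"), yields the two code points U+00C3 and U+00AB, which display as "Ã«". Encoding and decoding with UTF-8 on both sides avoids both problems.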

Re: solr utf8 for words like compagnieën?

Posted by RT <rw...@gmail.com>.
Hi Gora,

Thanks a lot for the feedback below. I use toLatin1() frequently and will
opt for that to see what it does for me.

Thanks again.

Kind regards,

Roland

Gora Mohanty wrote:
> On Tue, Jan 31, 2012 at 1:50 PM, RT <rw...@gmail.com> wrote:
>> Hi,
>>
>> I am having a bit of trouble getting words with characters such as:
>>
>> ė, į, ų etc into SOLR.
>>
>> Programming in C++ (Using Qt's QString) I am wondering what conversion to
>> apply before compiling words with such letters into the solrquery.
>>
>> Is UTF8 the correct encoding?
> 
> UTF8 should be fine, though Latin1 will also work here.
> How are you getting the UTF8 for these strings? Have
> you looked at
> http://developer.qt.nokia.com/doc/qt-4.8/QString.html#converting-between-8-bit-strings-and-unicode-strings
> 
> Regards,
> Gora
>