Posted to solr-user@lucene.apache.org by Mohan Cheema <mo...@arrkgroup.com> on 2018/05/02 09:13:18 UTC

Solr working £ Symbol

Hi There,

We are using Solr to index our data. The data contains the £ symbol, both within the text and for currency. When the data is exported from the source system it contains the £ symbol; however, when it is imported into Solr, the £ symbol is converted to �.

How can we keep the £ symbol as is when importing data?

Note: When the file is viewed using less, the pound symbol is displayed as <A3>; when viewed in the vi editor, it shows up properly.

Regards,

Mohan
Disclaimer: www.arrkgroup.com/EmailDisclaimer

RE: Solr working £ Symbol

Posted by Mohan Cheema <mo...@arrkgroup.com>.
>> We are using Solr to index our data. The data contains the £ symbol, both within the text and for currency. When the data is exported from the source system it contains the £ symbol; however, when it is imported into Solr, the £ symbol is converted to �.
>>
>> How can we keep the £ symbol as is when importing data?
>
>What tools are you using to look at Solr results?  What tools are you using to send update data to Solr?
Our application is written in Python and uses the UTF-8 charset. We use the Solr post tool to send data to Solr.

>
>Solr expects and delivers UTF-8 characters.  If the data you're sending to Solr is using another character set, Java may not interpret it correctly.
The generated JSON file does show the £ symbol. The post tool, IMHO, will use the system LANG setting, which is set to 'LANG=en_GB.UTF-8'.
>
>Conversely, if whatever you're using to look at Solr's results is also not expecting/displaying UTF-8, you might not be shown correct characters.
When we check the data in the Solr web UI, we cannot see the £ symbol there either.
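
Whether the generated JSON file really contains UTF-8 bytes can be checked independently of how an editor or terminal renders it. The following is a minimal Python sketch, not something taken from the application itself; the function name and the sample payloads are hypothetical and only illustrate the check:

```python
from typing import Optional


def first_invalid_utf8_offset(raw: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# Hypothetical sample payloads; "\u00a3" is the pound sign.
sample = '{"price": "\u00a310"}'
good = sample.encode("utf-8")    # pound sign stored as bytes C2 A3
bad = sample.encode("latin-1")   # pound sign stored as the single byte A3

print(first_invalid_utf8_offset(good))
print(first_invalid_utf8_offset(bad))
```

For a real file, the bytes could be read with open(path, "rb").read() and passed to the same function; a non-None result would point to the first byte that is not valid UTF-8.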

Regards,

Mohan

Re: Solr working £ Symbol

Posted by Shawn Heisey <ap...@elyograg.org>.
On 5/2/2018 3:13 AM, Mohan Cheema wrote:
> We are using Solr to index our data. The data contains the £ symbol, both within the text and for currency. When the data is exported from the source system it contains the £ symbol; however, when it is imported into Solr, the £ symbol is converted to �.
>
> How can we keep the £ symbol as is when importing data?

What tools are you using to look at Solr results?  What tools are you
using to send update data to Solr?

Solr expects and delivers UTF-8 characters.  If the data you're sending
to Solr is using another character set, Java may not interpret it correctly.
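
As an illustration of that kind of mismatch: the <A3> that less displays in the original note would be consistent with the £ being stored as the single byte 0xA3 (as in ISO-8859-1) rather than as the two-byte UTF-8 sequence 0xC2 0xA3, although nothing in this thread confirms it. A minimal Python sketch under that assumption, with a hypothetical helper name, showing how such a byte ends up as the replacement character:

```python
POUND = "\u00a3"  # the pound sign

utf8_bytes = POUND.encode("utf-8")      # two bytes: C2 A3
latin1_bytes = POUND.encode("latin-1")  # one byte: A3


def decode_as_utf8_lenient(raw: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for any invalid byte."""
    return raw.decode("utf-8", errors="replace")


print(utf8_bytes.hex(), latin1_bytes.hex())
print(repr(decode_as_utf8_lenient(utf8_bytes)))
print(repr(decode_as_utf8_lenient(latin1_bytes)))
```

If the file does turn out to be ISO-8859-1, re-encoding it to UTF-8 before posting would be one way to avoid the substitution.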

Conversely, if whatever you're using to look at Solr's results is also
not expecting/displaying UTF-8, you might not be shown correct characters.
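
The viewing side can be sketched the same way. This is a minimal Python example, assuming a response body that contains a correctly UTF-8 encoded £; the byte string and the function name are hypothetical stand-ins, not real Solr output:

```python
# Hypothetical response fragment: a pound sign correctly encoded as UTF-8.
POUND_UTF8 = b"\xc2\xa3"


def view(raw: bytes, assumed_charset: str) -> str:
    """Decode raw bytes the way a client assuming `assumed_charset` would."""
    return raw.decode(assumed_charset)


print(repr(view(POUND_UTF8, "utf-8")))    # client expecting UTF-8
print(repr(view(POUND_UTF8, "latin-1")))  # client expecting ISO-8859-1
```

A client that assumes ISO-8859-1 shows two characters where one was stored, so the data in Solr can be correct even when the display is not.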

Thanks,
Shawn