Posted to solr-user@lucene.apache.org by Ugo Matrangolo <ug...@gmail.com> on 2014/02/05 19:37:51 UTC

UTF-8 encoding problems while replicating an index using SolrCloud

Hi,

we are having problems with a SolrCloud installation where a leader
node kicks off an indexing run and tries to replicate all the updates to
the replica using the UpdateHandler.

What we get instead is an error about invalid UTF-8 encoding when the
leader calls the /update endpoint on the replica:

request:
http://10.40.0.25:9765/skus/update?update.chain=custom&_version_=-1459207589104451584&update.distrib=FROMLEADER&update.from=http%3A%2F%2F10.40.0.24%3A9765%2Fskus%2F&wt=javabin&version=2
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:240)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

On the replica, meanwhile, we get this:

2014-02-05 14:00:00,226 [qtp-108] INFO  org.apache.solr.update.processor.LogUpdateProcessor  - [skus] webapp= path=/update params={update.distrib=FROMLEADER&_version_=-1459207589104451584&update.from=http://10.40.0.24:9765/skus/&wt=javabin&version=2&update.chain=custom} {} 0 71
2014-02-05 14:00:00,227 [qtp-108] ERROR org.apache.solr.core.SolrCore  - org.apache.solr.common.SolrException: Invalid UTF-8 middle byte 0xe0 (at char #1, byte #-1)
        at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
        at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)

I have tried to sanitize all my docs, making sure all the strings are in
UTF-8, but it does not work.
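
One way to double-check that a payload really is valid UTF-8 is to decode the raw bytes strictly and report the first offset where decoding fails. The sketch below is only an illustration: the helper name `first_invalid_utf8` and the sample byte strings are assumptions, not taken from the attached HTTP capture.

```python
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset, byte value) where strict UTF-8 decoding first fails.

    Returns None when the whole input decodes cleanly.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte of the bad sequence.
        return exc.start, data[exc.start]
    return None


if __name__ == "__main__":
    # Hypothetical samples: a valid UTF-8 string and a sequence in which
    # 0xe0 appears where a continuation byte is required.
    valid = "caf\u00e9".encode("utf-8")
    invalid = b"\xe0\xe0\x80"
    print(first_invalid_utf8(valid))
    print(first_invalid_utf8(invalid))
```

If the captured request body turned out to be a binary format rather than text, a check like this would fail no matter how the document strings were sanitized, which would point at how the request is parsed rather than at the documents themselves.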

Also attached is the HTTP conversation that produces the error.

Would love to understand what is going on here :)

Thank you,
Ugo

Re: UTF-8 encoding problems while replicating an index using SolrCloud

Posted by David Santamauro <da...@gmail.com>.
I had the same error. I cleared it up by commenting out all the
/update/xxx handlers and changing the class of the /update handler to
solr.UpdateRequestHandler.
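
A sketch of what such a change might look like in solrconfig.xml. The commented-out handler names and classes are assumptions about what an older configuration may contain, not taken from this thread; only the class of the /update handler comes from the description above.

```xml
<!-- Per-format handlers, commented out.
     Names and classes here are illustrative assumptions. -->
<!--
<requestHandler name="/update/json" class="solr.JsonUpdateRequestHandler" />
<requestHandler name="/update/csv"  class="solr.CSVRequestHandler" />
-->

<!-- Single update handler using the class named above -->
<requestHandler name="/update" class="solr.UpdateRequestHandler" />
```

The affected core would presumably need to be reloaded or the node restarted for the change to take effect.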

Hope that helps

David

