Posted to solr-user@lucene.apache.org by brian4 <bq...@gmail.com> on 2014/11/24 19:52:57 UTC

Indexing with SolrJ fails on windows

I am using SolrJ to index to Solr from a Java application. I've tried this
with both SolrJ 4.8.1 and SolrJ 4.10.2, indexing to Solr 4.10.0.

From Windows, indexing to Solr fails if:
1) I try to add more than 1 document at a time, or
2) I try to add a document with a long field value (400 words).

The exact same indexing code, unchanged, works in both cases when run from
Linux. It also works from Windows if I add only one document with no very
long field values.

The exception I get is the following:
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Invalid chunk header
	at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552) ~[solr-solrj-4.10.2.jar:4.10.2 1634293 - mike - 2014-10-26 05:56:22]
	at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210) ~[solr-solrj-4.10.2.jar:4.10.2 1634293 - mike - 2014-10-26 05:56:22]
	at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206) ~[solr-solrj-4.10.2.jar:4.10.2 1634293 - mike - 2014-10-26 05:56:22]
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124) ~[solr-solrj-4.10.2.jar:4.10.2 1634293 - mike - 2014-10-26 05:56:22]
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68) ~[solr-solrj-4.10.2.jar:4.10.2 1634293 - mike - 2014-10-26 05:56:22]
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54) ~[solr-solrj-4.10.2.jar:4.10.2 1634293 - mike - 2014-10-26 05:56:22]


My test code to reproduce this is the following:
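    // Imports for the enclosing test class (JUnit 4 assumed):
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.commons.lang3.StringUtils;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;
    import org.junit.Test;

    @Test
    public void testWriteBigOther() throws Exception {
        SolrServer solrServer = new HttpSolrServer("http://my-vm:8080/solr/test_copy");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("asset_id", "test_write_big");
        doc.addField("title", "test write big title");
        // Long field value: "396" repeated 400 times, separated by spaces.
        doc.addField("secondary_header", StringUtils.repeat("396", " ", 400));

        List<SolrInputDocument> inputDocs = new ArrayList<SolrInputDocument>();
        inputDocs.add(doc);

        // This call throws RemoteSolrException ("Invalid chunk header") from Windows.
        solrServer.add(inputDocs);

        solrServer.commit();
        solrServer.shutdown();
    }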

(it uses org.apache.commons.lang3.StringUtils repeat() method to generate
the large field value).


It seems like there must be a bug in SolrJ - i.e., I guess when it is
building the request it does something differently in windows vs. linux -
like maybe adds a carriage return on windows?  Does anyone know how to fix
this, or what else I could do to diagnose it?
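
One way to diagnose this might be to capture the raw bytes the client sends on each platform and compare them. The following is only a rough sketch using the JDK alone: the class name RawRequestDump and port 9999 are made up for illustration, and it is not part of SolrJ. It accepts a single connection, reads until the client goes quiet for two seconds, and prints the request with CR and LF made visible.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

public class RawRequestDump {
    /** Makes CR and LF bytes visible so dumps from two platforms can be diffed. */
    static String visualize(byte[] raw) {
        // ISO-8859-1 maps every byte to exactly one char, so nothing is lost.
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        return text.replace("\r", "<CR>").replace("\n", "<LF>\n");
    }

    public static void main(String[] args) throws IOException {
        int port = 9999; // hypothetical port, pick any free one
        try (ServerSocket server = new ServerSocket(port);
             Socket client = server.accept()) {
            // Stop reading once the client has been silent for 2 seconds.
            client.setSoTimeout(2000);
            ByteArrayOutputStream captured = new ByteArrayOutputStream();
            InputStream in = client.getInputStream();
            byte[] buf = new byte[4096];
            try {
                int n;
                while ((n = in.read(buf)) != -1) {
                    captured.write(buf, 0, n);
                }
            } catch (SocketTimeoutException idle) {
                // No more data arrived; fall through and print what we have.
            }
            System.out.println(visualize(captured.toByteArray()));
        }
    }
}
```

Pointing the test's HttpSolrServer URL at this listener (for example http://localhost:9999/solr/test_copy) from Windows and from Linux should give two dumps to compare. The client call itself will fail, since the sketch never sends an HTTP response.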




--
View this message in context: http://lucene.472066.n3.nabble.com/Indexing-with-SolrJ-fails-on-windows-tp4170687.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Indexing with SolrJ fails on windows

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Try running the client with -Dline.separator='\n' to force the line
separator. https://docs.oracle.com/javase/tutorial/essential/environment/sysprop.html

However, if that does work, it's probably a bug.
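
Depending on the shell, the '\n' in that flag may reach the JVM as a literal backslash and the letter n rather than as a newline, so it may be worth checking what the JVM actually sees. A minimal sketch for that check (the class name LineSeparatorCheck is made up for illustration):

```java
public class LineSeparatorCheck {
    /** Renders each character as a code point label, e.g. CR becomes U+000D. */
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The system property reflects whatever -Dline.separator passed in.
        System.out.println("line.separator property: "
                + describe(System.getProperty("line.separator")));
        // System.lineSeparator() (Java 7+) is fixed when the JVM starts.
        System.out.println("System.lineSeparator():  "
                + describe(System.lineSeparator()));
    }
}
```

A real newline shows up as U+000A and the Windows default as U+000D U+000A. If the output contains U+005C U+006E instead, the flag was passed through literally and did not have the intended effect.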

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 24 November 2014 at 14:00, brian4 <bq...@gmail.com> wrote:
> The problem seems to occur at the apache redirect - I found if I bypass
> apache by using my VM IP address directly as the Solr URL, then the error
> does not occur even from windows.
>
> From some searching it seems like Apache does not allow carriage returns in
> its request headers - so my guess is SolrJ is adding a carriage return in
> the request when run on Windows, but not on linux, so when receiving the
> request from Windows apache is spitting back an error.
>
> Is there any way to disable this behavior with SolrJ / have it generate
> consistent requests regardless of platform?

Re: Indexing with SolrJ fails on windows

Posted by brian4 <bq...@gmail.com>.
The problem seems to occur at the apache redirect - I found if I bypass
apache by using my VM IP address directly as the Solr URL, then the error
does not occur even from windows.

From some searching it seems like Apache does not allow carriage returns in
its request headers - so my guess is SolrJ is adding a carriage return in
the request when run on Windows, but not on linux, so when receiving the
request from Windows apache is spitting back an error.

Is there any way to disable this behavior with SolrJ / have it generate
consistent requests regardless of platform?
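
For context, the "Invalid chunk header" message most likely refers to HTTP/1.1 chunked transfer encoding, where each piece of the body is preceded by a small header line. The sketch below shows that framing in general terms. It is an illustration only, not SolrJ's or Apache's actual code, and the class name ChunkedFramingSketch is made up.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedFramingSketch {
    /** Frames a body as HTTP/1.1 chunks of at most chunkSize bytes each. */
    static byte[] encodeChunked(byte[] body, int chunkSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int off = 0; off < body.length; off += chunkSize) {
            int len = Math.min(chunkSize, body.length - off);
            // Chunk header: payload length in hexadecimal, then CRLF.
            byte[] header = (Integer.toHexString(len) + "\r\n")
                    .getBytes(StandardCharsets.US_ASCII);
            out.write(header, 0, header.length);
            // Chunk payload, again terminated by CRLF.
            out.write(body, off, len);
            out.write('\r');
            out.write('\n');
        }
        // A zero-length chunk followed by an empty line ends the body.
        byte[] last = "0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        out.write(last, 0, last.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] framed = encodeChunked(
                "hello world".getBytes(StandardCharsets.US_ASCII), 5);
        // Show CR and LF as visible escapes rather than raw control characters.
        System.out.println(new String(framed, StandardCharsets.US_ASCII)
                .replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Note that the CRLF pairs here are part of the wire format on every platform, so a carriage return on its own would not explain the error. A receiver would typically report an invalid chunk header when the line where it expects a hexadecimal length contains something else.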


