Posted to solr-user@lucene.apache.org by Jaspal Sawhney <js...@sapient.com> on 2016/08/17 02:53:06 UTC

Error During Indexing - org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: early EOF

Hello
We are running Solr 4.6 in a master-slave configuration in which the master is used entirely for indexing. No search traffic ever reaches the master.
Of late we have started to get the early EOF error on the Solr master, which results in a Broken Pipe error in the commerce application from which indexing was kicked off.

Things to mention

  1.  We have a couple of sites, each of which has the same document size but a different document count.
  2.  This error is being observed on the site with the highest document count, i.e. 2204743.
  3.  My understanding of Solr is that, irrespective of the number of documents, the throughput is controlled by the ‘Number of Threads’ and ‘Batch size’ settings. Am I correct?
     *   In our case we had not touched the batch size or the number of threads when this error started appearing.
     *   However, when I do change these parameters (specifically, reduce them) the error goes away, but indexing time increases a lot.
  4.  We have to index overnight every day because we put product prices in the index, and these get updated nightly.
  5.  The Solr master is running with a 20 GB heap.

What we have tried

  1.  I disabled autoCommit (i.e. hard commit) and set autoSoftCommit to 5 minutes
     *   I realized afterwards that this was the wrong test because my understanding of soft commit was incorrect. My understanding now is that a hard commit just truncates the tlog, so hard commits should give better indexing performance.
     *   This test failed for lack of space; however, because disabling autoCommit did not make sense, I have not retried it yet.
  2.  Increased ramBufferSizeMB from 100 MB to 1000 MB
     *   This test did not yield anything favorable – the master gave the early EOF exception
  3.  Increased the merge factor from 20 to 100
     *   This test did not yield anything favorable – the master gave the early EOF exception
  4.  Flipped the settings: set autoCommit to 15 secs and disabled autoSoftCommit
     *   This test did not yield anything favorable – the master gave the early EOF exception
     *   I got the input for this from https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ - Heavy (Bulk) Indexing section
  5.  Tried to bypass the transaction log altogether – this test is currently underway

Questions

  1.  Since we are not using SolrCloud, I want to understand the impact of bypassing the transaction log.
  2.  How does Solr take the documents that are sent to it through to storage? That is, what is the journey of a document from segment to tlog to storage?

It would be great if you could share any pointers.

Thanks
J./

The actual Error Log
ERROR - 2016-08-16 22:59:55.988; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: early EOF
        at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
        at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:368)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
        at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
        at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:953)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
        at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
        at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.ctc.wstx.exc.WstxIOException: early EOF
        at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:708)
        at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
        at org.apache.solr.handler.loader.XMLLoader.readDoc(XMLLoader.java:389)
        at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
        at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
        ... 32 more
Caused by: org.eclipse.jetty.io.EofException: early EOF
        at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:65)
        at java.io.InputStream.read(InputStream.java:101)
        at com.ctc.wstx.io.UTF8Reader.loadMore(UTF8Reader.java:365)
        at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:110)
        at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
        at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
        at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
        at com.ctc.wstx.sr.StreamScanner.loadMoreFromCurrent(StreamScanner.java:1046)
        at com.ctc.wstx.sr.StreamScanner.parseLocalName2(StreamScanner.java:1796)
        at com.ctc.wstx.sr.StreamScanner.parseLocalName(StreamScanner.java:1756)
        at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:2981)
        at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2936)
        at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2848)
        at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
        ... 35 more


Re: Error During Indexing - org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: early EOF

Posted by Shawn Heisey <ap...@elyograg.org>.
On 8/16/2016 8:53 PM, Jaspal Sawhney wrote:
> We are running solr 4.6 in master-slave configuration where in our master is used entirely for indexing. No search traffic comes to master ever.
> Off late we have started to get the early EOF error on the solr Master which results in a Broken Pipe error on the commerce application from where Indexing was kicked off from.
<snip>
>   1.  Since we are not using solrCloud – I want to understand the impact of bypassing transaction log

The transaction log is required for SolrCloud, but it is highly
recommended for ANY Solr install.

Here's how to remove issues with the tlog directory growing out of
control:  Configure autoCommit with openSearcher set to false, no
maxDocs setting, and a maxTime of 60000 (one minute).  You'll commonly
see 15 seconds as a recommended maxTime -- my opinion is that this is
too frequent, but if you choose to use that, I doubt you'll have any
issues.  I think the newest versions of Solr do set up autoCommit like
this with a 15 second maxTime.
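
As a rough illustration of the settings described above, the fragment below
shows what such an autoCommit section might look like. The element names
(updateHandler, updateLog, autoCommit, maxTime, openSearcher) are the usual
solrconfig.xml ones; the Python wrapper and the helper name
autocommit_settings are only assumptions made here so the fragment can be
checked for well-formedness, and are not part of Solr.

```python
import xml.etree.ElementTree as ET

# Sketch of a solrconfig.xml fragment. Merge it into the existing
# <updateHandler> section rather than adding a second one.
AUTOCOMMIT_FRAGMENT = """\
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>
"""


def autocommit_settings(fragment):
    """Return the children of <autoCommit> as a {tag: text} dict."""
    root = ET.fromstring(fragment)
    auto_commit = root.find("autoCommit")
    return {child.tag: child.text for child in auto_commit}


if __name__ == "__main__":
    # Shows maxTime and openSearcher only; there is deliberately no maxDocs.
    print(autocommit_settings(AUTOCOMMIT_FRAGMENT))
```

Changing maxTime to 15000 would give the 15 second variant mentioned above.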

If you disable the transaction log and have some kind of crash during
indexing, you may lose documents.  When it is present, the transaction
log will be replayed when the core starts.

>   2.  How does solr take documents which are sent to it to storage as in what is the journey of a document from segment to tlog to storage.

Assuming no cloud mode, when a document arrives for indexing, it is
written to the tlog and sent to Lucene for processing.  When the Lucene
indexing buffer fills up, or a commit is issued, then the segment is
flushed.  Most of the time it will be flushed to disk, but if the
segment is very small and a soft commit is used, it may be flushed to
RAM instead -- this is a function of NRTCachingDirectoryFactory, which
is the default.
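
The flow just described can be pictured with a toy model. This is a heavily
simplified sketch, not Solr's or Lucene's actual code: the class ToyIndexer,
its method names, and the idea of counting documents instead of bytes are all
assumptions made for illustration. It also ignores soft commits, merges, and
the fact that real Solr keeps some older tlogs around.

```python
class ToyIndexer:
    """Toy model of the tlog / indexing buffer / segment flow."""

    def __init__(self, buffer_limit, use_tlog=True):
        self.buffer_limit = buffer_limit  # stand-in for ramBufferSizeMB
        self.use_tlog = use_tlog
        self.tlog = []       # documents logged since the last hard commit
        self.buffer = []     # documents in the in-memory indexing buffer
        self.segments = []   # flushed segments, each a list of documents
        self.committed = 0   # number of segments covered by a hard commit

    def add(self, doc):
        # A document is written to the tlog and handed to the buffer.
        if self.use_tlog:
            self.tlog.append(doc)
        self.buffer.append(doc)
        if len(self.buffer) >= self.buffer_limit:
            self._flush()

    def _flush(self):
        # A full buffer, or a commit, turns the buffer into a new segment.
        if self.buffer:
            self.segments.append(self.buffer)
            self.buffer = []

    def hard_commit(self):
        # Flush, record the commit point, and start a fresh tlog.
        self._flush()
        self.committed = len(self.segments)
        self.tlog = []

    def crash_and_restart(self):
        # Only committed segments survive; the tlog (if any) is replayed.
        self.segments = self.segments[:self.committed]
        self.buffer = []
        pending, self.tlog = self.tlog, []
        for doc in pending:
            self.add(doc)
        return len(pending)

    def all_docs(self):
        return [d for seg in self.segments for d in seg] + list(self.buffer)
```

With the tlog enabled, a document added after the last hard commit comes back
after crash_and_restart(); with use_tlog=False the same document is gone,
which is the risk described earlier.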

Cloud mode is slightly more complicated, but the behavior would be the
same once the document arrives at the correct core(s) that will index it.

The "early EOF" exception came from Jetty, not Solr.  Based on how EOF
is used by Jetty errors in other contexts, I think it means that the
indexing client closed the connection before all the data was sent,
which probably means that you have a low socket timeout on the client. 
The server likely paused while receiving the data, probably to handle
the data it had already received ... and the pause was longer than the
socket timeout, causing the client to close the connection.  Another
possibility is that the network is not working well, or that one of the
operating systems or software libraries involved has TCP or HTTP bugs.

I could be wrong about what the exception means, but the information I
was able to quickly locate supports the idea.

If I am right, then you will need to either reduce the amount of data
that you send in a single update request, or increase the socket timeout
that the indexing client is using on its connections.
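
Both options can be sketched on the client side as follows. This is only an
illustration using the Python standard library, not the commerce
application's actual client: the URL, the 120 second timeout, and the
function names are assumptions, and it posts JSON to the update handler
whereas the stack trace shows the real client sending XML.

```python
import json
import urllib.request

# Hypothetical endpoint; adjust host, port and core name for your install.
SOLR_UPDATE_URL = "http://localhost:8983/solr/collection1/update"


def chunked(docs, batch_size):
    """Yield successive lists of at most batch_size documents."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Build (but do not send) a JSON update request for one batch."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_batch(docs, timeout_seconds=120.0, url=SOLR_UPDATE_URL):
    """Send one batch with an explicit, generous socket timeout."""
    request = build_update_request(docs, url)
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.status
```

A caller would loop over chunked(all_docs, batch_size) and call send_batch on
each batch, lowering batch_size or raising timeout_seconds until the early
EOF errors stop.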

Erick's idea of your update request exceeding the maximum POST body size
is something I hadn't thought of.  The default for this limit is 2MB,
and can be increased in solrconfig.xml.  I suspect that this isn't the
problem, but it's something to investigate.
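
One way to investigate is to measure how large each request body actually is
before sending it. The sketch below is an assumption-laden illustration: it
measures a JSON serialization (the real client sends XML, so the numbers will
differ), uses the 2MB figure mentioned above as the budget, and the function
names are made up for this example.

```python
import json

# Budget taken from the 2MB default mentioned above.
POST_LIMIT_BYTES = 2 * 1024 * 1024


def body_size_bytes(docs):
    """Size in bytes of the batch when serialized as UTF-8 JSON."""
    return len(json.dumps(docs).encode("utf-8"))


def split_by_size(docs, limit_bytes=POST_LIMIT_BYTES):
    """Yield batches whose serialized body stays within limit_bytes.

    A single document larger than the limit is still yielded on its own.
    Re-serializing on every step is wasteful but keeps the sketch short.
    """
    batch = []
    for doc in docs:
        if batch and body_size_bytes(batch + [doc]) > limit_bytes:
            yield batch
            batch = []
        batch.append(doc)
    if batch:
        yield batch
```

If batches of 30 documents come out well under the limit, the POST body size
is unlikely to be the cause.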

Thanks,
Shawn


Re: Error During Indexing - org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: early EOF

Posted by Erick Erickson <er...@gmail.com>.
From my testing program, there's nothing standard here.

As the blog points out, since I was indexing fairly
simple documents you should _not_ be expecting to
see those indexing rates. The point of the article was
just to show the _relative_ changes when I sent
batches.

Best,
Erick

On Wed, Aug 17, 2016 at 1:59 PM, Jaspal Sawhney <js...@sapient.com> wrote:
> Erick
> Going through the article which you shared. Where are you getting the
> Docs/second value?
> Thanks
>
> On 8/17/16, 4:37 PM, "Jaspal Sawhney" <js...@sapient.com> wrote:
>
>>Erick
>>Thanks - My batch size was 30 and thread size also 30.
>>Thanks
>>
>>On 8/17/16, 3:48 PM, "Erick Erickson" <er...@gmail.com> wrote:
>>
>>>What this probably indicates is that the size of the packets you send
>>>to Solr is large enough that it exceeds the transport protocol's
>>>limit. This is reinforced by your statement that reducing the batch
>>>size fixes the problem even though it increases indexing time.
>>>
>>>So the place I'd be looking is the jetty configurations for any limits
>>>there.
>>>
>>>That said, what is your batch size? In my testing I pretty quickly get
>>>into diminishing returns, here's a writeup from some time ago:
>>>https://lucidworks.com/blog/2015/10/05/really-batch-updates-solr-2/
>>>
>>>Best,
>>>Erick
>>>
>>>On Wed, Aug 17, 2016 at 12:03 PM, Jaspal Sawhney <js...@sapient.com>
>>>wrote:
>>>> Bump !

Re: Error During Indexing - org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: early EOF

Posted by Jaspal Sawhney <js...@sapient.com>.
Erick
I am going through the article you shared. Where are you getting the
Docs/second value from?
Thanks



Re: Error During Indexing - org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: early EOF

Posted by Jaspal Sawhney <js...@sapient.com>.
Erick
Thanks - My batch size was 30 and the thread count was also 30.
Thanks

On 8/17/16, 3:48 PM, "Erick Erickson" <er...@gmail.com> wrote:

>What this probably indicates is that the size of the packets you send
>to Solr is large enough that it exceeds the transport protocol's
>limit. This is reinforced by your statement that reducing the batch
>size fixes the problem even though it increases indexing time.
>
>So the place I'd be looking is the jetty configurations for any limits
>there.
>
>That said, what is your batch size? In my testing I pretty quickly get
>into diminishing returns, here's a writeup from some time ago:
>https://lucidworks.com/blog/2015/10/05/really-batch-updates-solr-2/
>
>Best,
>Erick


Re: Error During Indexing - org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: early EOF

Posted by Erick Erickson <er...@gmail.com>.
What this probably indicates is that the packets you send to Solr are
large enough to exceed the transport protocol's limit. This is
reinforced by your statement that reducing the batch size fixes the
problem even though it increases indexing time.

So the place I'd be looking is the Jetty configuration, for any limits there.

That said, what is your batch size? In my testing I pretty quickly get
into diminishing returns; here's a writeup from some time ago:
https://lucidworks.com/blog/2015/10/05/really-batch-updates-solr-2/

Best,
Erick
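
As an illustration of the batch-size point above: one way to keep each update request under a size limit without giving up batching entirely is to cap batches by bytes as well as by document count. This is a minimal sketch, assuming documents are already serialized to strings. `BatchPlanner`, `plan`, and both limits are hypothetical names and values, and the right byte cap depends on whatever limit, if any, turns out to apply on the server side.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups serialized documents into update batches. */
final class BatchPlanner {
    private BatchPlanner() {}

    /**
     * Groups documents so that no batch holds more than maxDocs documents
     * or more than maxBytes UTF-8 bytes. A single document larger than
     * maxBytes still gets a batch of its own rather than being dropped.
     */
    static List<List<String>> plan(List<String> docs, int maxDocs, long maxBytes) {
        if (maxDocs < 1 || maxBytes < 1) {
            throw new IllegalArgumentException("limits must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentBytes = 0;
        for (String doc : docs) {
            long size = doc.getBytes(StandardCharsets.UTF_8).length;
            // Close the current batch if adding this doc would break a limit.
            boolean full = current.size() >= maxDocs
                    || (!current.isEmpty() && currentBytes + size > maxBytes);
            if (full) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(doc);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each returned batch would then be sent as one update request, so the largest request body is bounded by the byte cap (plus any wrapper markup) instead of varying with document size.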


Re: Error During Indexing - org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: early EOF

Posted by Jaspal Sawhney <js...@sapient.com>.
Bump !
