Posted to solr-user@lucene.apache.org by Rallavagu <ra...@gmail.com> on 2015/10/28 21:06:48 UTC

Commit Error

Solr 4.6.1, cloud

Seeing the following commit errors.

[commitScheduler-19-thread-1] ERROR org.apache.solr.update.CommitTracker 
– auto commit error...:java.lang.IllegalStateException: this writer hit 
an OutOfMemoryError; cannot commit
        at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2807)
        at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:559)
        at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:440)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:896)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
        at java.lang.Thread.run(Thread.java:682)

Looking at the code,

public final void prepareCommit() throws IOException {
  ensureOpen();
  prepareCommitInternal();
}

private void prepareCommitInternal() throws IOException {
  synchronized(commitLock) {
    ensureOpen(false);
    if (infoStream.isEnabled("IW")) {
      infoStream.message("IW", "prepareCommit: flush");
      infoStream.message("IW", "  index before flush " + segString());
    }

    if (hitOOM) {
      throw new IllegalStateException("this writer hit an OutOfMemoryError; cannot commit");
    }

So it is simply checking a flag that records whether the writer hit an OOM? 
What sets that flag, and under what conditions? Thanks.

Re: Commit Error

Posted by Rallavagu <ra...@gmail.com>.
Also, is it this thread that hit the OOM, and what could have caused it? The 
heap was doing fine and the server was live and running.

On 10/28/15 3:57 PM, Shawn Heisey wrote:
> On 10/28/2015 2:06 PM, Rallavagu wrote:
>> Solr 4.6.1, cloud
>>
>> Seeing following commit errors.
>>
>> [commitScheduler-19-thread-1] ERROR
>> org.apache.solr.update.CommitTracker – auto commit
>> error...:java.lang.IllegalStateException: this writer hit an
>> OutOfMemoryError; cannot commit at
>> org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2807)
>> at
>> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
>> at
>> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:559)
>> at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216) at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:440) at
>> java.util.concurrent.FutureTask.run(FutureTask.java:138) at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:896)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
>> at java.lang.Thread.run(Thread.java:682)
>>
>> Looking at the code,
>>
>> public final void prepareCommit() throws IOException {
>>      ensureOpen();
>>      prepareCommitInternal();
>>    }
>>
>>    private void prepareCommitInternal() throws IOException {
>>      synchronized(commitLock) {
>>        ensureOpen(false);
>>        if (infoStream.isEnabled("IW")) {
>>          infoStream.message("IW", "prepareCommit: flush");
>>          infoStream.message("IW", "  index before flush " + segString());
>>        }
>>
>>        if (hitOOM) {
>>          throw new IllegalStateException("this writer hit an
>> OutOfMemoryError; cannot commit");
>>        }
>>
>> So it is simply checking a flag that records whether the writer hit an
>> OOM? What sets that flag, and under what conditions? Thanks.
>
> This exception handling was revamped in Lucene 4.10.1 (and therefore in
> Solr 4.10.1) by this issue:
>
> https://issues.apache.org/jira/browse/LUCENE-5958
>
> The "hitOOM" variable was removed by the following specific commit --
> this is the commit on the 4.10 branch, but it was also committed to
> branch_4x and trunk as well.  Later commits on this same issue were made
> to branch_5x -- the cutover to begin the 5.0 release process was made
> while this issue was still being fixed.
>
> https://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_10/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java?r1=1626189&r2=1626188&pathrev=1626189
>
> In the code before this fix, the hitOOM flag is set by other methods in
> IndexWriter.  It is volatile to prevent problems with multiple threads
> updating and accessing it.
>
> Your message doesn't indicate what problems you're having besides an
> error message in your log.  LUCENE-5958 indicates that the problems
> could be as bad as a corrupt index.
>
> The reason that IndexWriter swallows OOM exceptions is that this is the
> only way Lucene can even *attempt* to avoid index corruption in every
> error situation.  Lucene has had a very good track record at avoiding
> index corruption, but every now and then a bug is found and a user
> manages to get a corrupted index.
>
> Thanks,
> Shawn
>

Re: Commit Error

Posted by Rallavagu <ra...@gmail.com>.

On 10/28/15 5:41 PM, Shawn Heisey wrote:
> On 10/28/2015 5:11 PM, Rallavagu wrote:
>> Seeing very high CPU during this time and very high warmup times. During
>> this time, there were plenty of these errors logged. So, trying to find
>> out possible causes for this to occur. Could it be disk I/O issues or
>> something else as it is related to commit (writing to disk).
>
> Lucene is claiming that you're hitting the Out Of Memory exception.  I
> pulled down the 4.6.1 source code to verify IndexWriter's behavior.  The
> only time hitOOM can be set to true is when OutOfMemoryError is being
> thrown, so unless you're running Solr built from modified source code,
> Lucene's claim *is* what's happening.

This is very likely true, as the source is not modified.

>
> In OOM situations, there's a good chance that Java is going to be
> spending a lot of time doing garbage collection, which can cause CPU
> usage to go high and make warm times long.

Again, I think this is the likely case. Even though there was no apparent 
OOM, the JVM can throw an OutOfMemoryError ("GC overhead limit exceeded") 
when it spends excessive time in full GC while reclaiming very little memory.

>
> The behavior of most Java programs is completely unpredictable when Java
> actually runs out of memory.  As already mentioned, the parts of Lucene
> that update the index are specifically programmed to deal with OOM
> without causing index corruption.  Writing code that is predictable in
> OOM situations is challenging, so only a subset of the code in
> Lucene/Solr has been hardened in this way.  Most of it is as
> unpredictable in OOM as any other Java program.

Thanks Shawn.

>
> Thanks,
> Shawn
>

Re: Commit Error

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/28/2015 5:11 PM, Rallavagu wrote:
> Seeing very high CPU during this time and very high warmup times. During
> this time, there were plenty of these errors logged. So, trying to find
> out possible causes for this to occur. Could it be disk I/O issues or
> something else as it is related to commit (writing to disk).

Lucene is claiming that you're hitting the Out Of Memory exception.  I
pulled down the 4.6.1 source code to verify IndexWriter's behavior.  The
only time hitOOM can be set to true is when OutOfMemoryError is being
thrown, so unless you're running Solr built from modified source code,
Lucene's claim *is* what's happening.

In OOM situations, there's a good chance that Java is going to be
spending a lot of time doing garbage collection, which can cause CPU
usage to go high and make warm times long.
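
If you want to confirm that GC time is what is eating the CPU, one rough 
option (besides turning on GC logging) is to watch the collector MBeans. 
Below is a minimal sketch using the standard GarbageCollectorMXBean API -- 
note that it has to run inside the Solr JVM (or be adapted to a remote JMX 
connection), and the 10-second window is just an illustrative choice:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTimeSampler {
  public static void main(String[] args) throws InterruptedException {
    long prevTotalMs = 0;
    while (true) {
      long totalMs = 0;
      for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
        totalMs += gc.getCollectionTime();   // cumulative GC time, in milliseconds
      }
      // GC time spent during the last 10-second window; if this approaches
      // 10000 ms, the JVM is doing little besides collecting garbage
      System.out.println("GC time in last 10s: " + (totalMs - prevTotalMs) + " ms");
      prevTotalMs = totalMs;
      Thread.sleep(10000);
    }
  }
}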

The behavior of most Java programs is completely unpredictable when Java
actually runs out of memory.  As already mentioned, the parts of Lucene
that update the index are specifically programmed to deal with OOM
without causing index corruption.  Writing code that is predictable in
OOM situations is challenging, so only a subset of the code in
Lucene/Solr has been hardened in this way.  Most of it is as
unpredictable in OOM as any other Java program.

Thanks,
Shawn


Re: Commit Error

Posted by Rallavagu <ra...@gmail.com>.
Thanks Shawn for the response.

Seeing very high CPU during this time, along with very high warmup times, 
and plenty of these errors were logged during that window. So, trying to 
find possible causes. Could it be disk I/O issues or something else, since 
this is related to commit (writing to disk)?

On 10/28/15 3:57 PM, Shawn Heisey wrote:
> On 10/28/2015 2:06 PM, Rallavagu wrote:
>> Solr 4.6.1, cloud
>>
>> Seeing following commit errors.
>>
>> [commitScheduler-19-thread-1] ERROR
>> org.apache.solr.update.CommitTracker – auto commit
>> error...:java.lang.IllegalStateException: this writer hit an
>> OutOfMemoryError; cannot commit at
>> org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2807)
>> at
>> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
>> at
>> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:559)
>> at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216) at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:440) at
>> java.util.concurrent.FutureTask.run(FutureTask.java:138) at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:896)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
>> at java.lang.Thread.run(Thread.java:682)
>>
>> Looking at the code,
>>
>> public final void prepareCommit() throws IOException {
>>      ensureOpen();
>>      prepareCommitInternal();
>>    }
>>
>>    private void prepareCommitInternal() throws IOException {
>>      synchronized(commitLock) {
>>        ensureOpen(false);
>>        if (infoStream.isEnabled("IW")) {
>>          infoStream.message("IW", "prepareCommit: flush");
>>          infoStream.message("IW", "  index before flush " + segString());
>>        }
>>
>>        if (hitOOM) {
>>          throw new IllegalStateException("this writer hit an
>> OutOfMemoryError; cannot commit");
>>        }
>>
>> So it is simply checking a flag that records whether the writer hit an
>> OOM? What sets that flag, and under what conditions? Thanks.
>
> This exception handling was revamped in Lucene 4.10.1 (and therefore in
> Solr 4.10.1) by this issue:
>
> https://issues.apache.org/jira/browse/LUCENE-5958
>
> The "hitOOM" variable was removed by the following specific commit --
> this is the commit on the 4.10 branch, but it was also committed to
> branch_4x and trunk as well.  Later commits on this same issue were made
> to branch_5x -- the cutover to begin the 5.0 release process was made
> while this issue was still being fixed.
>
> https://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_10/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java?r1=1626189&r2=1626188&pathrev=1626189
>
> In the code before this fix, the hitOOM flag is set by other methods in
> IndexWriter.  It is volatile to prevent problems with multiple threads
> updating and accessing it.
>
> Your message doesn't indicate what problems you're having besides an
> error message in your log.  LUCENE-5958 indicates that the problems
> could be as bad as a corrupt index.
>
> The reason that IndexWriter swallows OOM exceptions is that this is the
> only way Lucene can even *attempt* to avoid index corruption in every
> error situation.  Lucene has had a very good track record at avoiding
> index corruption, but every now and then a bug is found and a user
> manages to get a corrupted index.
>
> Thanks,
> Shawn
>

Re: Commit Error

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/28/2015 2:06 PM, Rallavagu wrote:
> Solr 4.6.1, cloud
>
> Seeing following commit errors.
>
> [commitScheduler-19-thread-1] ERROR
> org.apache.solr.update.CommitTracker – auto commit
> error...:java.lang.IllegalStateException: this writer hit an
> OutOfMemoryError; cannot commit at
> org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2807)
> at
> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2984)
> at
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:559)
> at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216) at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:440) at
> java.util.concurrent.FutureTask.run(FutureTask.java:138) at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:896)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
> at java.lang.Thread.run(Thread.java:682)
>
> Looking at the code,
>
> public final void prepareCommit() throws IOException {
>     ensureOpen();
>     prepareCommitInternal();
>   }
>
>   private void prepareCommitInternal() throws IOException {
>     synchronized(commitLock) {
>       ensureOpen(false);
>       if (infoStream.isEnabled("IW")) {
>         infoStream.message("IW", "prepareCommit: flush");
>         infoStream.message("IW", "  index before flush " + segString());
>       }
>
>       if (hitOOM) {
>         throw new IllegalStateException("this writer hit an
> OutOfMemoryError; cannot commit");
>       }
>
> So it is simply checking a flag that records whether the writer hit an
> OOM? What sets that flag, and under what conditions? Thanks.

This exception handling was revamped in Lucene 4.10.1 (and therefore in
Solr 4.10.1) by this issue:

https://issues.apache.org/jira/browse/LUCENE-5958

The "hitOOM" variable was removed by the following specific commit --
this is the commit on the 4.10 branch, but it was also committed to
branch_4x and trunk as well.  Later commits on this same issue were made
to branch_5x -- the cutover to begin the 5.0 release process was made
while this issue was still being fixed.

https://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_10/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java?r1=1626189&r2=1626188&pathrev=1626189

In the code before this fix, the hitOOM flag is set by other methods in
IndexWriter.  It is volatile to prevent problems with multiple threads
updating and accessing it.
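
To illustrate the pattern (this is a simplified sketch, not the verbatim 
4.6.1 source -- the class and method names below are only stand-ins): the 
indexing paths catch OutOfMemoryError, record it in the volatile flag, and 
rethrow, and the commit path then refuses to run.

class OomFlagSketch {
  // volatile so a flag set on an indexing thread is visible to the
  // commitScheduler thread that checks it later
  private volatile boolean hitOOM = false;

  void updateDocument(Object doc) {
    try {
      // ... buffer the document; large allocations can happen here ...
    } catch (OutOfMemoryError oom) {
      hitOOM = true;   // remember that the writer is now suspect
      throw oom;       // still propagate the error to the caller
    }
  }

  void prepareCommit() {
    if (hitOOM) {
      // produces exactly the message seen in your log
      throw new IllegalStateException(
          "this writer hit an OutOfMemoryError; cannot commit");
    }
    // ... flush segments and write the new segments file ...
  }
}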

Your message doesn't indicate what problems you're having besides an
error message in your log.  LUCENE-5958 indicates that the problems
could be as bad as a corrupt index.
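
If you are worried that the index may already have been damaged, Lucene's 
CheckIndex tool can verify it -- run it against a copy of the index, or 
while nothing is writing to it. A minimal programmatic sketch (the index 
path below is just a placeholder; the same class can also be run from the 
command line through its main()):

import java.io.File;

import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class VerifyIndex {
  public static void main(String[] args) throws Exception {
    // placeholder -- point this at the core's data/index directory
    Directory dir = FSDirectory.open(new File("/path/to/core/data/index"));
    CheckIndex checker = new CheckIndex(dir);
    checker.setInfoStream(System.out);   // print per-segment details as it runs
    CheckIndex.Status status = checker.checkIndex();
    System.out.println(status.clean ? "index is clean" : "index has problems");
    dir.close();
  }
}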

The reason that IndexWriter swallows OOM exceptions is that this is the
only way Lucene can even *attempt* to avoid index corruption in every
error situation.  Lucene has had a very good track record at avoiding
index corruption, but every now and then a bug is found and a user
manages to get a corrupted index.

Thanks,
Shawn