You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by Kevin Goess <kg...@bepress.com> on 2012/08/31 06:00:13 UTC

solr hanging regularly, thread dump attached, PDFOperator

If we have more than half a dozen processes posting updates to our Solr
instance (each about 1/sec), we get regular and fairly reproducible solr
hangs, requiring a kill -9 of the solr process. We're running Solr 3.4, a
fairly out-of-the-box solr/jetty setup.

I've got a thread dump and some GC logging information from the latest one.
 I haven't programmed in java for a decade, and any help decoding these
dumps would be hugely appreciated.

When I first was alerted that the process was hanging, this is the count of
what threads were doing what:

    33   java.lang.Thread.State: BLOCKED (on object monitor)
     16   java.lang.Thread.State: RUNNABLE
      2   java.lang.Thread.State: TIMED_WAITING (on object monitor)
      1   java.lang.Thread.State: WAITING (on object monitor)
     77   java.lang.Thread.State: WAITING (parking)

A couple minutes later it looked like this:

    29      java.lang.Thread.State: BLOCKED (on object monitor)
      8      java.lang.Thread.State: RUNNABLE
      1      java.lang.Thread.State: TIMED_WAITING (on object monitor)
      1      java.lang.Thread.State: TIMED_WAITING (parking)
      1      java.lang.Thread.State: TIMED_WAITING (sleeping)
      3      java.lang.Thread.State: WAITING (on object monitor)
      1      java.lang.Thread.State: WAITING (parking)

All the blocked threads looked like this:

java.lang.Thread.State: BLOCKED (on object monitor)
   at java.util.Collections$SynchronizedMap.get(Collections.java:1975)
   - waiting to lock <0x000000072551f908> (a
java.util.Collections$SynchronizedMap)
   at org.apache.pdfbox.util.PDFOperator.getOperator(PDFOperator.java:68)
   at
org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:441)
   at
org.apache.pdfbox.pdfparser.PDFStreamParser.access$000(PDFStreamParser.java:46)
   at
org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:175)
   at
org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:187)
   at
org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:266)
   at
org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
   at
org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225)
   at
org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:442)
   at
org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:366)
   at
org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:322)
   at
org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:242)
   at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:53)
   at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:90)
   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
   at
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
   at
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:213)
   ...

and yes, that 0x000000072551f908 address is the same for each thread that's
blocked.

In the first case where there were 77 threads waiting (parking),  they were
all

2012-08-30T20:23:55-0700     java.lang.Thread.State: WAITING (parking)
2012-08-30T20:23:55-0700        at sun.misc.Unsafe.park(Native Method)
2012-08-30T20:23:55-0700        - parking to wait for  <0x0000000726089700>
(a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2012-08-30T20:23:55-0700        at
java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
2012-08-30T20:23:55-0700        at
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
2012-08-30T20:23:55-0700        at
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:941)
2012-08-30T20:23:55-0700        at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1261)
2012-08-30T20:23:55-0700        at
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:594)
2012-08-30T20:23:55-0700        at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:214)
2012-08-30T20:23:55-0700        at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
2012-08-30T20:23:55-0700        at
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
2012-08-30T20:23:55-0700        at
org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:158)
2012-08-30T20:23:55-0700        at
org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)
2012-08-30T20:23:55-0700        at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:67)
...

and yes, they're all waiting for 0x0000000726089700


A gist of the first thread dump, with most of the repetition redacted, is
here: https://gist.github.com/3548920

Any suggestions would be super helpful.  Thanks!



-- 
Kevin M. Goess
Software Engineer
Berkeley Electronic Press

Re: solr hanging regularly, thread dump attached, PDFOperator

Posted by Jack Krupansky <ja...@basetechnology.com>.
I've heard of a couple of situations where the parsing of multiple PDF files 
in parallel was problematic with lots of blocked threads. There may or may 
not be bugs in PDFBox, but one workaround is simply to only send one PDF at 
a time to Solr, or at least a smaller number in parallel.

I don't see this reported in any Solr Jira, so feel free to file as an 
issue. Hopefully you can narrow down exactly what the scenario is, and 
whether it depends on the specific document content.

OTOH, it may be simply be a memory/GC issue with having too many PDF files 
parsed simultaneously.

Are they large PDF files? Or just lots of them?

-- Jack Krupansky

-----Original Message----- 
From: Kevin Goess
Sent: Friday, August 31, 2012 12:00 AM
To: solr-user@lucene.apache.org
Subject: solr hanging regularly, thread dump attached, PDFOperator

If we have more than half a dozen processes posting updates to our Solr
instance (each about 1/sec), we get regular and fairly reproducible solr
hangs, requiring a kill -9 of the solr process. We're running Solr 3.4, a
fairly out-of-the-box solr/jetty setup.

I've got a thread dump and some GC logging information from the latest one.
I haven't programmed in java for a decade, and any help decoding these
dumps would be hugely appreciated.

When I first was alerted that the process was hanging, this is the count of
what threads were doing what:

    33   java.lang.Thread.State: BLOCKED (on object monitor)
     16   java.lang.Thread.State: RUNNABLE
      2   java.lang.Thread.State: TIMED_WAITING (on object monitor)
      1   java.lang.Thread.State: WAITING (on object monitor)
     77   java.lang.Thread.State: WAITING (parking)

A couple minutes later it looked like this:

    29      java.lang.Thread.State: BLOCKED (on object monitor)
      8      java.lang.Thread.State: RUNNABLE
      1      java.lang.Thread.State: TIMED_WAITING (on object monitor)
      1      java.lang.Thread.State: TIMED_WAITING (parking)
      1      java.lang.Thread.State: TIMED_WAITING (sleeping)
      3      java.lang.Thread.State: WAITING (on object monitor)
      1      java.lang.Thread.State: WAITING (parking)

All the blocked threads looked like this:

java.lang.Thread.State: BLOCKED (on object monitor)
   at java.util.Collections$SynchronizedMap.get(Collections.java:1975)
   - waiting to lock <0x000000072551f908> (a
java.util.Collections$SynchronizedMap)
   at org.apache.pdfbox.util.PDFOperator.getOperator(PDFOperator.java:68)
   at
org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:441)
   at
org.apache.pdfbox.pdfparser.PDFStreamParser.access$000(PDFStreamParser.java:46)
   at
org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:175)
   at
org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:187)
   at
org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:266)
   at
org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
   at
org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225)
   at
org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:442)
   at
org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:366)
   at
org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:322)
   at
org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:242)
   at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:53)
   at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:90)
   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
   at
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
   at
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:213)
   ...

and yes, that 0x000000072551f908 address is the same for each thread that's
blocked.

In the first case where there were 77 threads waiting (parking),  they were
all

2012-08-30T20:23:55-0700     java.lang.Thread.State: WAITING (parking)
2012-08-30T20:23:55-0700        at sun.misc.Unsafe.park(Native Method)
2012-08-30T20:23:55-0700        - parking to wait for  <0x0000000726089700>
(a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
2012-08-30T20:23:55-0700        at
java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
2012-08-30T20:23:55-0700        at
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
2012-08-30T20:23:55-0700        at
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:941)
2012-08-30T20:23:55-0700        at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1261)
2012-08-30T20:23:55-0700        at
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:594)
2012-08-30T20:23:55-0700        at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:214)
2012-08-30T20:23:55-0700        at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
2012-08-30T20:23:55-0700        at
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
2012-08-30T20:23:55-0700        at
org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:158)
2012-08-30T20:23:55-0700        at
org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)
2012-08-30T20:23:55-0700        at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:67)
...

and yes, they're all waiting for 0x0000000726089700


A gist of the first thread dump, with most of the repetition redacted, is
here: https://gist.github.com/3548920

Any suggestions would be super helpful.  Thanks!



-- 
Kevin M. Goess
Software Engineer
Berkeley Electronic Press