Posted to dev@pdfbox.apache.org by "Torsten Krah (Commented) (JIRA)" <ji...@apache.org> on 2011/12/19 14:45:31 UTC

[jira] [Commented] (PDFBOX-845) Lockup in PDDocument.load() --> PDFParser.parseObject() with 6 threads

    [ https://issues.apache.org/jira/browse/PDFBOX-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172272#comment-13172272 ] 

Torsten Krah commented on PDFBOX-845:
-------------------------------------

An update: I can confirm that this still happens with 1.6.0.
                
> Lockup in PDDocument.load() --> PDFParser.parseObject() with 6 threads
> ----------------------------------------------------------------------
>
>                 Key: PDFBOX-845
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-845
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.2.1
>         Environment: Linux lxdev01 2.6.32-24-generic #42-Ubuntu SMP Fri Aug 20 14:24:04 UTC 2010 i686 GNU/Linux
> java version "1.6.0_20"
> Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
> Java HotSpot(TM) Server VM (build 16.3-b01, mixed mode)
> testng 5.11
> running under maven-surefire plugin v2.5 with parallel=methods and threadCount=6
>            Reporter: Larry West
>            Priority: Critical
>         Attachments: pddocument-load-lockup.stack
>
>   Original Estimate: 32h
>  Remaining Estimate: 32h
>
> This is a TestNG unit test suite, with each test loading a different PDF via PDDocument.load(). I just switched TestNG to parallel=methods (it had been serial) and it locked up the first time. The "jstack -l" output will be attached, but I'm putting the pdfbox portions here (just below).
> Looking at the code, it's not clear what's being waited on, except that four of the threads are stuck in BaseParser.parseDirObject(), apparently waiting on the (synchronized) toString() method of a [local] StringBuffer. (StringBuffer is used in BaseParser.java where a StringBuilder is clearly preferable; this is probably true for every local instance of a StringBuffer.)
> I don't know why that would cause the thread to sit waiting, though (JVM problem? 1.6.0_20 on Linux). The other two threads appear to be waiting on COSObjectKey.getNumber() [perhaps], and I see no synchronized objects or methods there.
> Finding the cause of the lockup would be preferable, but replacing StringBuffer with StringBuilder wherever it is used locally (possibly including any private non-static members) would be an improvement in performance if nothing else.
> Threads 1, 2, 4, & 6:
> 	at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:1013)
> 	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:157)
> 	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:233)
> 	at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:929)
> 	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:519)
> 	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
> Threads 3 & 5:
> 	at org.apache.pdfbox.cos.COSDocument.getObjectFromPool(COSDocument.java:481)
> 	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:540)
> 	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
> I should add: these same tests, using the same files, have run literally hundreds of times on various machines (Windows, Mac, Linux) using PDFBox 1.2.1 without any lockup.
> Clearly I can back off parallelizing my tests for now, but there is no obvious reason why PDDocument.load() can't be called in parallel, and so it concerns me that this will be a real problem.
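
The scenario described above (several threads each loading a document, with some never returning) could be checked outside TestNG with a small harness along the following lines. This is only a sketch under stated assumptions: the class name ParallelLoadCheck and the method findStuckTasks are hypothetical, and the tasks in main are placeholders so that the code runs without PDFBox on the classpath. In a real check each task would call PDDocument.load() on one file and close the document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of PDFBox or TestNG.
final class ParallelLoadCheck {

    /**
     * Runs the tasks on a fixed pool of threadCount threads and returns the
     * indexes of the tasks that had not finished when the timeout expired.
     */
    static List<Integer> findStuckTasks(List<Callable<Void>> tasks,
                                        int threadCount,
                                        long timeoutMillis) throws InterruptedException {
        // Daemon threads, so a task that really hangs cannot keep the JVM alive.
        ExecutorService pool = Executors.newFixedThreadPool(threadCount, runnable -> {
            Thread t = new Thread(runnable);
            t.setDaemon(true);
            return t;
        });
        try {
            // invokeAll blocks until all tasks complete or the timeout expires;
            // tasks still unfinished at that point are cancelled.
            List<Future<Void>> futures =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            List<Integer> stuck = new ArrayList<>();
            for (int i = 0; i < futures.size(); i++) {
                // Futures come back in the same order as the task list.
                if (futures.get(i).isCancelled()) {
                    stuck.add(i);
                }
            }
            return stuck;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<Callable<Void>> tasks = new ArrayList<>();
        // Placeholder tasks. Assumption: a real task would load one PDF via
        // PDDocument.load() and close it.
        tasks.add(() -> null);                              // finishes at once
        tasks.add(() -> { Thread.sleep(5000); return null; }); // simulates a lockup
        tasks.add(() -> null);                              // finishes at once
        System.out.println("Stuck task indexes: " + findStuckTasks(tasks, 6, 500));
    }
}
```

A timeout only shows that a task did not return in time. It does not say why, so a "jstack -l" dump taken while the tasks are still running remains the way to see where each thread is sitting.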
