Posted to dev@pdfbox.apache.org by "Torsten Krah (Commented) (JIRA)" <ji...@apache.org> on 2011/12/19 14:45:31 UTC
[jira] [Commented] (PDFBOX-845) Lockup in PDDocument.load() -->
PDFParser.parseObject() with 6 threads
[ https://issues.apache.org/jira/browse/PDFBOX-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172272#comment-13172272 ]
Torsten Krah commented on PDFBOX-845:
-------------------------------------
An update: I can confirm that this still happens with 1.6.0.
> Lockup in PDDocument.load() --> PDFParser.parseObject() with 6 threads
> ----------------------------------------------------------------------
>
> Key: PDFBOX-845
> URL: https://issues.apache.org/jira/browse/PDFBOX-845
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.2.1
> Environment: Linux lxdev01 2.6.32-24-generic #42-Ubuntu SMP Fri Aug 20 14:24:04 UTC 2010 i686 GNU/Linux
> java version "1.6.0_20"
> Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
> Java HotSpot(TM) Server VM (build 16.3-b01, mixed mode)
> testng 5.11
> running under maven-surefire plugin v2.5 with parallel=methods and threadCount=6
> Reporter: Larry West
> Priority: Critical
> Attachments: pddocument-load-lockup.stack
>
> Original Estimate: 32h
> Remaining Estimate: 32h
>
> This is a TestNG unit test suite, with each test loading a different PDF via PDDocument.load(). I just switched TestNG to parallel=methods (it had been serial) and it locked up the first time. "jstack -l" output will be attached, but I'm putting the pdfbox portions here (just below).
> In looking at the code, it's not clear what's being waited on, except that four of the threads are stuck in BaseParser.parseDirObject(), apparently waiting on the (synchronized) toString() method of a [local] StringBuffer. (StringBuffer is used in BaseParser.java where a StringBuilder is clearly preferable -- this is probably true for every local instance of a StringBuffer.)
> I don't know why that would cause the thread to sit waiting, though (JVM problem? 1.6.0_20 on Linux), and the other two threads appear to be waiting on COSObjectKey.getNumber() [perhaps], and I see no synchronized objects or methods there.
> Finding the cause of the lockup would be preferable, but replacing StringBuffer with StringBuilder wherever they are used locally (possibly including any private non-static members) would be an improvement in performance if nothing else.
> Threads 1, 2, 4, & 6:
> at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:1013)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:157)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:233)
> at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:929)
> at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:519)
> at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
> Threads 3 & 5:
> at org.apache.pdfbox.cos.COSDocument.getObjectFromPool(COSDocument.java:481)
> at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:540)
> at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
> I should add: these same tests, using the same files, have run literally hundreds of times on various machines (Windows, Mac, Linux) using PDFBox 1.2.1 without any lockup.
> Clearly I can back off parallelizing my tests for now, but there is no obvious reason why PDDocument.load() can't be called in parallel, and so it concerns me that this will be a real problem.
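For anyone trying to reproduce this outside TestNG, the setup described above (several loads running in parallel, one of which may never return) could be approximated with a small harness like the one below. This is only a sketch under assumptions: the class name ParallelLoadCheck and the method runAll are made up for illustration, and each task is a plain Callable standing in for one PDDocument.load() call followed by closing the document. It does not diagnose the lockup; it only turns a hang into a countable result by applying a timeout.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of PDFBox or TestNG.
final class ParallelLoadCheck {

    /**
     * Runs all tasks on a fixed-size pool and returns how many finished
     * normally before the timeout. Tasks still running at the timeout are
     * cancelled, so a hung task shows up as a lower count instead of
     * blocking the caller forever.
     */
    static int runAll(List<Callable<Void>> tasks, int threads, long timeoutSeconds)
            throws InterruptedException {
        // Daemon threads: a worker that ignores interruption cannot keep the JVM alive.
        ExecutorService pool = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        try {
            // invokeAll returns once every task is done or the timeout has expired;
            // tasks that have not completed by then are cancelled.
            List<Future<Void>> results =
                    pool.invokeAll(tasks, timeoutSeconds, TimeUnit.SECONDS);
            int completed = 0;
            for (Future<Void> f : results) {
                if (f.isCancelled()) {
                    continue; // timed out: a candidate for the lockup
                }
                try {
                    f.get(); // already done, so this does not block
                    completed++;
                } catch (ExecutionException e) {
                    // The task threw (for example a parse error); not counted as completed.
                }
            }
            return completed;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With six tasks and threads=6 this mirrors the threadCount=6 setting from the report. If the returned count is lower than the number of tasks and no task threw, at least one load did not finish in time, and a "jstack -l" dump taken while the harness is waiting should show where it is stuck. Cancellation only interrupts the worker thread, so a load that is spinning rather than blocked will keep running in the background until the JVM exits.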