Posted to dev@pdfbox.apache.org by "Larry West (JIRA)" <ji...@apache.org> on 2010/09/29 21:49:40 UTC

[jira] Created: (PDFBOX-845) Lockup in PDDocument.load() --> PDFParser.parseObject() with 6 threads

Lockup in PDDocument.load() --> PDFParser.parseObject() with 6 threads
----------------------------------------------------------------------

                 Key: PDFBOX-845
                 URL: https://issues.apache.org/jira/browse/PDFBOX-845
             Project: PDFBox
          Issue Type: Bug
          Components: Parsing
    Affects Versions: 1.2.1
         Environment: Linux lxdev01 2.6.32-24-generic #42-Ubuntu SMP Fri Aug 20 14:24:04 UTC 2010 i686 GNU/Linux

java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) Server VM (build 16.3-b01, mixed mode)

testng 5.11
running under maven-surefire plugin v2.5 with parallel=methods and threadCount=6
            Reporter: Larry West
            Priority: Critical


This is a TestNG unit test suite, with each test loading a different PDF via PDDocument.load().  I just switched TestNG from serial execution to parallel=methods, and it locked up the first time.  "jstack -l" output will be attached, but I'm putting the pdfbox portions here (just below).

In looking at the code, it's not clear what's being waited on, except that four of the threads are stuck in BaseParser.parseDirObject(), apparently waiting on the (synchronized) toString() method of a [local] StringBuffer.  (StringBuffer is used in BaseParser.java where a StringBuilder is clearly preferable -- this is probably true for every local instance of a StringBuffer.)

I don't know why that would cause the thread to sit waiting, though (JVM problem? 1.6.0_20 on Linux), and the other two threads appear to be waiting on COSObjectKey.getNumber() [perhaps], and I see no synchronized objects or methods there.

Finding the cause of the lockup would be preferable, but replacing StringBuffer with StringBuilder wherever it is used locally (possibly including any private non-static members) would be an improvement in performance if nothing else.
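
The StringBuffer point can be illustrated with a minimal Java sketch. It is not PDFBox's BaseParser code: the class LocalBuilderSketch and the method readDigits are hypothetical names made up for this example. It only shows the pattern in question, a buffer that is created, filled, and converted to a String inside a single method.

```java
// Illustrative sketch only; hypothetical names, not taken from BaseParser.java.
class LocalBuilderSketch {

    // Collects consecutive ASCII digits from src, beginning at index start.
    // The StringBuilder is a method-local object that never escapes, so no
    // other thread can reach it and StringBuffer's per-call locking would
    // add cost without adding safety.
    static String readDigits(CharSequence src, int start) {
        StringBuilder buf = new StringBuilder();
        for (int i = start; i < src.length(); i++) {
            char c = src.charAt(i);
            if (c < '0' || c > '9') {
                break; // stop at the first non-digit
            }
            buf.append(c);
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        // Object header in the style of a PDF body, used here only as sample input.
        System.out.println(readDigits("845 0 obj", 0));
    }
}
```

Because buf never leaves readDigits, swapping StringBuffer for StringBuilder in code shaped like this changes performance only, not behavior. By the same reasoning, an uncontended local buffer should not by itself make a thread wait.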

Threads 1, 2, 4, & 6:
	at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:1013)
	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:157)
	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:233)
	at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:929)
	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:519)
	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)

Threads 3 & 5:
	at org.apache.pdfbox.cos.COSDocument.getObjectFromPool(COSDocument.java:481)
	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:540)
	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PDFBOX-845) Lockup in PDDocument.load() --> PDFParser.parseObject() with 6 threads

Posted by "Larry West (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916554#action_12916554 ] 

Larry West commented on PDFBOX-845:
-----------------------------------

(Responding to Adam): 

My apologies, I worded that badly.  The same tests have run *serially* hundreds of times without problems on many Linux, Mac, and Windows machines.  Now that I have parallelized the tests, it is easy to reproduce the hang on all of those systems: the (parallelized) test suite hangs roughly once every 5-10 runs.

So if it is a JVM issue (still a possibility), it is Sun's 1.6.0_20 on Linux, Windows, and OSX.   (I can't bring myself to say "Oracle's JVM").

PS: yes, each of these threads is accessing its own file (in fact, they are all stuck inside the static PDDocument.load(File) method).
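
A rough way to reproduce this outside TestNG is sketched below in plain Java. ParallelLoadHarness and allFinish are hypothetical names; the tasks are placeholders, and the assumption is that each one would wrap a single PDDocument.load(File) call on its own file. The sketch only reports whether every task finished before a timeout, which is how a hang would show up.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; hypothetical names, no PDFBox dependency.
class ParallelLoadHarness {

    // Runs all tasks on a fixed pool of the given size and returns true only
    // if every task finished (normally or with an exception) within the timeout.
    // Assumed usage: each task loads one PDF with PDDocument.load(file) and
    // closes the resulting document.
    static boolean allFinish(List<Callable<Object>> tasks, int threads, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (Callable<Object> task : tasks) {
                pool.submit(task);
            }
            pool.shutdown(); // accept no new work, let submitted tasks run
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Interrupts stragglers; a thread spinning in non-interruptible
            // code will not stop, so a real hang may still need a thread dump.
            pool.shutdownNow();
        }
    }
}
```

With threads set to 6 this mirrors the threadCount=6 setting from the report. A false return value is the cue to capture a thread dump while the process is still alive.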

I'll try to get set up with 1.5 in the next week.

Sure wouldn't hurt performance to switch to StringBuilder, so I just might do that, time permitting.


> Lockup in PDDocument.load() --> PDFParser.parseObject() with 6 threads
> ----------------------------------------------------------------------
>
>                 Key: PDFBOX-845
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-845
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.2.1
>         Environment: Linux lxdev01 2.6.32-24-generic #42-Ubuntu SMP Fri Aug 20 14:24:04 UTC 2010 i686 GNU/Linux
> java version "1.6.0_20"
> Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
> Java HotSpot(TM) Server VM (build 16.3-b01, mixed mode)
> testng 5.11
> running under maven-surefire plugin v2.5 with parallel=methods and threadCount=6
>            Reporter: Larry West
>            Priority: Critical
>         Attachments: pddocument-load-lockup.stack
>
>   Original Estimate: 32h
>  Remaining Estimate: 32h
>
> This is a TestNG unit test suite, with each test loading a different PDF via PDDocument.load().  I just switched TestNG to parallel=methods (had been serial) and it locked up first time.   "jstack -l" output will be attached, but I'm putting the pdfbox portions here (just below).
> In looking at the code, it's not clear what's being waited on, except that four of the threads are stuck in BaseParser.parseDirObject(), apparently waiting on the (synchronized) toString() method of a [local] StringBuffer.  (StringBuffer is used in BaseParser.java where a StringBuilder is clearly preferable -- this is probably true for every local instance of a StringBuffer.)
> I don't know why that would cause the thread to sit waiting, though (JVM problem? 1.6.0_20 on Linux), and the other two threads appear to be waiting on COSObjectKey.getNumber() [perhaps], and I see no synchronized objects or methods there.
> Finding the cause of the lockup would be preferable, but replacing StringBuffer with StringBuilder wherever it is used locally (possibly including any private non-static members) would be an improvement in performance if nothing else.
> Threads 1, 2, 4, & 6:
> 	at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:1013)
> 	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:157)
> 	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:233)
> 	at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:929)
> 	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:519)
> 	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
> Threads 3 & 5:
> 	at org.apache.pdfbox.cos.COSDocument.getObjectFromPool(COSDocument.java:481)
> 	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:540)
> 	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
> I should add: these same tests, using the same files, have run literally hundreds of times on various machines (Windows, Mac, Linux) using PDFBox 1.2.1 without any lockup.
> Clearly I can back off parallelizing my tests for now, but there is no obvious reason why PDDocument.load() can't be called in parallel, and so it concerns me that this will be a real problem.



[jira] Updated: (PDFBOX-845) Lockup in PDDocument.load() --> PDFParser.parseObject() with 6 threads

Posted by "Larry West (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Larry West updated PDFBOX-845:
------------------------------

    Comment: was deleted

(was: Difficult to be sure, but they may be related.)



[jira] Updated: (PDFBOX-845) Lockup in PDDocument.load() --> PDFParser.parseObject() with 6 threads

Posted by "Larry West (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Larry West updated PDFBOX-845:
------------------------------

    Description: 
This is a TestNG unit test suite, with each test loading a different PDF via PDDocument.load().  I just switched TestNG to parallel=methods (had been serial) and it locked up first time.   "jstack -l" output will be attached, but I'm putting the pdfbox portions here (just below).

In looking at the code, it's not clear what's being waited on, except that four of the threads are stuck in BaseParser.parseDirObject(), apparently waiting on the (synchronized) toString() method of a [local] StringBuffer.  (StringBuffer is used in BaseParser.java where a StringBuilder is clearly preferable -- this is probably true for every local instance of a StringBuffer.)

I don't know why that would cause the thread to sit waiting, though (JVM problem? 1.6.0_20 on Linux), and the other two threads appear to be waiting on COSObjectKey.getNumber() [perhaps], and I see no synchronized objects or methods there.

Finding the cause of the lockup would be preferable, but replacing StringBuffer with StringBuilder wherever it is used locally (possibly including any private non-static members) would be an improvement in performance if nothing else.

Threads 1, 2, 4, & 6:
	at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:1013)
	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:157)
	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:233)
	at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:929)
	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:519)
	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)

Threads 3 & 5:
	at org.apache.pdfbox.cos.COSDocument.getObjectFromPool(COSDocument.java:481)
	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:540)
	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)

I should add: these same tests, using the same files, have run literally hundreds of times on various machines (Windows, Mac, Linux) using PDFBox 1.2.1 without any lockup.

Clearly I can back off parallelizing my tests for now, but there is no obvious reason why PDDocument.load() can't be called in parallel, and so it concerns me that this will be a real problem.

  was:
This is a TestNG unit test suite, with each test loading a different PDF via PDDocument.load().  I just switched TestNG to parallel=methods (had been serial) and it locked up first time.   "jstack -l" output will be attached, but I'm putting the pdfbox portions here (just below).

In looking at the code, it's not clear what's being waited on, except that four of the threads are stuck in BaseParser.parseDirObject(), apparently waiting on the (synchronized) toString() method of a [local] StringBuffer.  (StringBuffer is used in BaseParser.java where a StringBuilder is clearly preferable -- this is probably true for every local instance of a StringBuffer.)

I don't know why that would cause the thread to sit waiting, though (JVM problem? 1.6.0_20 on Linux), and the other two threads appear to be waiting on COSObjectKey.getNumber() [perhaps], and I see no synchronized objects or methods there.

Finding the cause of the lockup would be preferable, but replacing StringBuffer with StringBuilder wherever it is used locally (possibly including any private non-static members) would be an improvement in performance if nothing else.

Threads 1, 2, 4, & 6:
	at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:1013)
	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:157)
	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:233)
	at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:929)
	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:519)
	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)

Threads 3 & 5:
	at org.apache.pdfbox.cos.COSDocument.getObjectFromPool(COSDocument.java:481)
	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:540)
	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)




May be related to PDFBOX-353 (titled "org.pdfbox.pdfparser.BaseParser.parseDirObject")



[jira] Commented: (PDFBOX-845) Lockup in PDDocument.load() --> PDFParser.parseObject() with 6 threads

Posted by "Adam Nichols (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916548#action_12916548 ] 

Adam Nichols commented on PDFBOX-845:
-------------------------------------

Based on this statement: "these same tests, using the same files, have run literally hundreds of times on various machines (Windows, Mac, Linux) using PDFBox 1.2.1 without any lockup", it sounds like a potential JVM bug.

Can you run your tests on that same machine using 1.5?

If you haven't already, try changing the StringBuffer objects to StringBuilder objects and see if that resolves the issue.  I doubt it will make any difference, but it's worth a shot.  Let us know what you find.

Also, keep in mind: http://pdfbox.apache.org/userguide/faq.html#pdfbox_threadsafe




[jira] Updated: (PDFBOX-845) Lockup in PDDocument.load() --> PDFParser.parseObject() with 6 threads

Posted by "Larry West (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Larry West updated PDFBOX-845:
------------------------------

    Attachment: pddocument-load-lockup.stack

This is the output of "jstack -l", with the names of our routines (that call PDDocument.load(File)) excised.
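
For runs where attaching jstack by hand is awkward, roughly the same information can be collected from inside the JVM with the standard java.lang.management API (available since Java 6). This is a sketch, not part of the attachment: InProcessThreadDump and its methods are hypothetical names, and the output format only approximates jstack's.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Illustrative sketch only; hypothetical names.
class InProcessThreadDump {

    // Ids of threads the JVM considers deadlocked; empty array if none.
    static long[] deadlockedThreadIds() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()          // monitors and java.util.concurrent locks
                : mx.findMonitorDeadlockedThreads();  // monitors only
        return ids == null ? new long[0] : ids;
    }

    // Name, state, lock being waited on (if any), and full stack of every live thread.
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            if (info.getLockName() != null) {
                out.append("\t- waiting on ").append(info.getLockName()).append('\n');
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("\tat ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

An empty result from deadlockedThreadIds() while the threads still make no progress would point toward a busy loop or a wait on something other than a lock cycle, rather than a classic deadlock. The per-thread state printed by dump() (RUNNABLE versus BLOCKED or WAITING) helps tell those cases apart.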


