Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2019/06/13 17:24:00 UTC

[jira] [Commented] (PDFBOX-4559) Parse error reading document from several threads

    [ https://issues.apache.org/jira/browse/PDFBOX-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863315#comment-16863315 ] 

Tilman Hausherr commented on PDFBOX-4559:
-----------------------------------------

The fix above is just for one of the symptoms and doesn't address the root cause. I don't even know if it can be fixed at all. In the worst case, we'd just update the documentation and you'd have to open the file several times.
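The "open the file several times" fallback can be sketched in plain JDK terms. This is not PDFBox code: {{PerThreadHandleDemo}}, {{readByteAt}} and {{countWrongReads}} are made-up names, and a {{RandomAccessFile}} merely stands in for whatever per-document state is being shared. Each parallel task opens its own handle, so no two tasks ever share a file position.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.IntStream;

// Hypothetical illustration only; none of these names exist in PDFBox.
final class PerThreadHandleDemo {

    /** Reads one byte at the given offset through a handle that only this call uses. */
    static int readByteAt(Path file, long offset) {
        try (RandomAccessFile handle = new RandomAccessFile(file.toFile(), "r")) {
            handle.seek(offset);   // private file pointer, nobody else can move it
            return handle.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the reads in parallel and returns how many returned a wrong byte. */
    static long countWrongReads(int reads) {
        try {
            Path file = Files.createTempFile("per-thread-handle", ".bin");
            try {
                // Fill the file with a known pattern: byte i holds the value i mod 256.
                byte[] data = new byte[1024];
                for (int i = 0; i < data.length; i++) {
                    data[i] = (byte) i;
                }
                Files.write(file, data);
                return IntStream.range(0, reads).parallel()
                        .filter(i -> readByteAt(file, i % data.length)
                                != (data[i % data.length] & 0xFF))
                        .count();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the reporter's case, the analogous step would presumably be for each thread to call {{PDDocument.load}} on the file itself instead of sharing one loaded document, at the cost of parsing the file once per thread.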

I have a fix that makes the test code work properly; sadly, it doesn't work if the file is uncompressed ({{WriteDecodedDoc}} operation in pdfbox-app). The idea was to synchronize on {{scratchFile}} in {{COSInputStream.create()}} when it isn't null. The problem is that this doesn't synchronize all accesses to the scratch file.

I also tried to put {{synchronized}} on the methods of {{RandomAccessInputStream}} but that doesn't help at all. My thought was that a concurrent read could take place between a call to {{restorePosition}} and the read a few lines later.
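A minimal sketch of that suspected interleaving, in plain Java rather than PDFBox code: {{SharedSource}}, {{PositionedReader}} and {{PositionedReadDemo}} are hypothetical stand-ins, not the real {{RandomAccessInputStream}} or scratch file classes. The point is that the seek and the reads that follow it must sit inside one critical section, and every code path touching the shared source must use the same lock. Synchronizing {{seek}} and {{read}} individually would still let another thread move the position between the two calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical seekable source with a single shared position. */
final class SharedSource {
    private final byte[] data;
    private int position; // shared cursor; unguarded use is what causes the race

    SharedSource(byte[] data) {
        this.data = data;
    }

    void seek(int newPosition) {
        position = newPosition;
    }

    int read() {
        return position < data.length ? data[position++] & 0xFF : -1;
    }
}

/** Hypothetical reader that restores its position and reads under one lock. */
final class PositionedReader {
    private final SharedSource source;

    PositionedReader(SharedSource source) {
        this.source = source;
    }

    byte[] readAt(int offset, int length) {
        byte[] out = new byte[length];
        synchronized (source) { // covers the seek AND the reads that depend on it
            source.seek(offset);
            for (int i = 0; i < length; i++) {
                out[i] = (byte) source.read();
            }
        }
        return out;
    }
}

final class PositionedReadDemo {
    /** Runs concurrent readers against one source and counts corrupted chunks. */
    static int countCorruptReads(int threads, int readsPerThread) {
        byte[] data = new byte[4096];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i; // known pattern so corruption is detectable
        }
        PositionedReader reader = new PositionedReader(new SharedSource(data));
        AtomicInteger corrupt = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int seed = t;
                pending.add(pool.submit(() -> {
                    for (int k = 0; k < readsPerThread; k++) {
                        int offset = (seed * 37 + k * 101) % (data.length - 16);
                        byte[] chunk = reader.readAt(offset, 16);
                        for (int j = 0; j < chunk.length; j++) {
                            if (chunk[j] != (byte) (offset + j)) {
                                corrupt.incrementAndGet();
                                break;
                            }
                        }
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface worker failures
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return corrupt.get();
    }
}
```

Removing the {{synchronized (source)}} block in {{readAt}} makes corrupted chunks possible, since one thread's {{seek}} can land between another thread's {{seek}} and its reads. The same reasoning suggests why a lock taken in only one place is not enough: any unguarded access path to the shared source reopens the window.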

[~tboehme] any ideas? Was this ever intended to allow concurrent access?

> Parse error reading document from several threads
> -------------------------------------------------
>
>                 Key: PDFBOX-4559
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4559
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Documentation, Rendering
>    Affects Versions: 2.0.15
>         Environment: Oracle Java 8 update125 on both Mac OS X and centos
>            Reporter: Jack
>            Priority: Major
>              Labels: concurrency, multithreading, type1, type1font
>         Attachments: test.pdf
>
>
> I got the following error while running simple parallel rendering code. The error doesn't happen when I change parallelStream() to a sequential stream(). Interestingly, both variants render exactly the same images. I saw a possibly related ticket, PDFBOX-3654, but that issue seems to have been fixed. Are there more bugs related to this?
> *Sample code*:
> {code:java}
> PDDocument document = PDDocument.load(new File(pdfFilename));
> List<PDDocument> pdfPages = new Splitter().split(document);
> pdfPages.parallelStream().forEach(page -> {
>     try {
>         PDFRenderer renderer = new PDFRenderer(page);
>         renderer.renderImageWithDPI(0, 180, ImageType.RGB); // change dpi to your number
>     } catch (IOException e) {
>         System.out.println(e);
>     }
>     try {
>         page.close(); // close the single-page document produced by the Splitter
>     } catch (IOException ignored) {
>     }
> });
> try {
>     document.close();
> } catch (IOException ignored) {
> }
> {code}
>  
> *Error log*:
> {noformat}
> ERROR [PDType1Font] Can't read the embedded Type1 font POAEND+Gotham-Book
> java.io.IOException: unexpected closing parenthesis
>  at org.apache.fontbox.type1.Type1Lexer.readToken(Type1Lexer.java:123) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Lexer.nextToken(Type1Lexer.java:75) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.readValue(Type1Parser.java:398) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.readOtherSubrs(Type1Parser.java:707) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.parseBinary(Type1Parser.java:550) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:64) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:85) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:262) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:62) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:146) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:869) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:505) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:479) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:152) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:265) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:314) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:243) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:229) ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
> WARN [PDType1Font] Using fallback font Helvetica for POAEND+Gotham-Book
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
