Posted to dev@pdfbox.apache.org by "philip huang (JIRA)" <ji...@apache.org> on 2012/06/06 09:40:22 UTC

[jira] [Created] (PDFBOX-1331) Can't load any text when font is null

philip huang created PDFBOX-1331:
------------------------------------

             Summary: Can't load any text when font is null
                 Key: PDFBOX-1331
                 URL: https://issues.apache.org/jira/browse/PDFBOX-1331
             Project: PDFBox
          Issue Type: Bug
          Components: PDModel
    Affects Versions: 1.7.0, 1.8.0
         Environment: JDK 1.6 64bit
            Reporter: philip huang


Open 19472133.PDF in the PDFBox reader (PDFReader) without the "-nonSeq" parameter.
Go to page 3: many NullPointerExceptions are displayed, and the viewer can't show any text.

java.lang.NullPointerException
	at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:366)
	at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:62)
	at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556)
	at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270)
	at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:246)
	at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:217)
	at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:119)
	at org.apache.pdfbox.pdfviewer.PDFPagePanel.paint(PDFPagePanel.java:98)
java.util.EmptyStackException
	at java.util.Stack.peek(Stack.java:85)
	at org.apache.pdfbox.util.PDFStreamEngine.getFonts(PDFStreamEngine.java:601)
	at org.apache.pdfbox.util.operator.SetTextFont.process(SetTextFont.java:54)
	at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556)
	at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270)
	at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:246)
	at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:217)
	at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:119)
	at org.apache.pdfbox.pdfviewer.PDFPagePanel.paint(PDFPagePanel.java:98)
	at javax.swing.JComponent.paintChildren(JComponent.java:862)

Open the document with the "-nonSeq" parameter:

Exception in thread "main" java.io.IOException: Error reading stream using length value. Expected='endstream' actual='' 
	at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseCOSStream(NonSequentialPDFParser.java:1327)
	at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseObjectDynamically(NonSequentialPDFParser.java:1032)
	at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseObjectDynamically(NonSequentialPDFParser.java:955)
	at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseDictObjects(NonSequentialPDFParser.java:929)
	at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:337)
	at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:574)
	at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1124)
	at org.apache.pdfbox.PDFReader.parseDocument(PDFReader.java:378)
	at org.apache.pdfbox.PDFReader.openPDFFile(PDFReader.java:319)
	at org.apache.pdfbox.PDFReader.main(PDFReader.java:305)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (PDFBOX-1331) Can't load any text when font is null

Posted by "philip huang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

philip huang updated PDFBOX-1331:
---------------------------------

    Attachment: 19472133.PDF
    

        

[jira] [Closed] (PDFBOX-1331) Can't load any text when font is null

Posted by "Timo Boehme (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timo Boehme closed PDFBOX-1331.
-------------------------------

       Resolution: Not A Problem
    Fix Version/s: 1.7.0
         Assignee: Timo Boehme

The document is broken. The length values of its streams are wrong (a few bytes too large, e.g. in object 249). Even falling back to the sequential parser does not seem possible with this document.
                

        

[jira] [Resolved] (PDFBOX-1331) Can't load any text when font is null

Posted by "Timo Boehme (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timo Boehme resolved PDFBOX-1331.
---------------------------------

       Resolution: Duplicate
    Fix Version/s:     (was: 1.7.0)
                   1.8.0

OK, this issue has more interesting aspects than I first thought (thanks for insisting on it).

First, it is true that the document is broken. Some readers hide the error messages from the user, but xpdf, for instance, shows them (you can test it yourself, e.g. with object 249, where the length is a few bytes too large). Since NonSequentialPDFParser currently does not try to repair such documents, it fails.

For the standard parser, 1.7.0 introduced an improvement that uses the length value when parsing streams. Now, if this value is wrong, the standard parser fails too. The solution would be to fall back to the old (scanning) stream parsing in such cases. I have created a dedicated issue for this improvement: PDFBOX-1333.
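
As a rough illustration of the fallback described above (not the PDFBOX-1333 patch itself), the sketch below first trusts the declared length and checks that the "endstream" keyword follows; if it does not, it scans forward for the keyword instead. The class StreamLengthFallback and its method names are made up for this example, and the whole file is assumed to be held in memory as a byte array.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox: reads a stream body and falls back
// to keyword scanning when the declared length does not lead to "endstream".
class StreamLengthFallback {
    private static final byte[] ENDSTREAM =
            "endstream".getBytes(StandardCharsets.US_ASCII);

    // True if "endstream" starts at pos, after skipping any CR/LF bytes.
    static boolean endstreamFollows(byte[] data, int pos) {
        while (pos < data.length && (data[pos] == '\r' || data[pos] == '\n')) {
            pos++;
        }
        return matchesAt(data, ENDSTREAM, pos);
    }

    // True if pattern occurs in data exactly at offset pos.
    static boolean matchesAt(byte[] data, byte[] pattern, int pos) {
        if (pos < 0 || pos + pattern.length > data.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (data[pos + i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    // Index of the first occurrence of pattern at or after from, or -1.
    static int indexOf(byte[] data, byte[] pattern, int from) {
        for (int i = Math.max(from, 0); i + pattern.length <= data.length; i++) {
            if (matchesAt(data, pattern, i)) {
                return i;
            }
        }
        return -1;
    }

    // start is the offset of the first body byte (just after "stream" + EOL).
    static byte[] readStreamBody(byte[] data, int start, int declaredLength) {
        long end = (long) start + declaredLength;
        // Fast path: the declared length is plausible and "endstream" follows it.
        if (declaredLength >= 0 && end <= data.length
                && endstreamFollows(data, (int) end)) {
            return Arrays.copyOfRange(data, start, (int) end);
        }
        // Fallback: scan for the keyword and drop one EOL marker before it.
        int idx = indexOf(data, ENDSTREAM, start);
        if (idx < 0) {
            throw new IllegalArgumentException("no endstream keyword found");
        }
        int bodyEnd = idx;
        if (bodyEnd > start && data[bodyEnd - 1] == '\n') {
            bodyEnd--;
        }
        if (bodyEnd > start && data[bodyEnd - 1] == '\r') {
            bodyEnd--;
        }
        return Arrays.copyOfRange(data, start, bodyEnd);
    }

    public static void main(String[] args) {
        byte[] data = "ABCDE\nendstream\nendobj"
                .getBytes(StandardCharsets.US_ASCII);
        // The declared length 8 is three bytes too large, so the scan is used.
        byte[] body = readStreamBody(data, 0, 8);
        System.out.println(new String(body, StandardCharsets.US_ASCII));
    }
}
```

Scanning is only a heuristic: a binary stream body that happens to contain the bytes "endstream" would be cut short, so the declared length is still the better choice whenever it checks out.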

There is another (independent) issue with PDFStreamEngine.getFonts if no fonts are defined. I will track this in a new issue.
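
The EmptyStackException in the report comes from Stack.peek() being called inside getFonts. A defensive lookup could look roughly like the sketch below. This is only an assumption about the general shape of such a guard, not the actual PDFStreamEngine code or the eventual fix; FontLookupSketch, Resources, and the String-valued font map are all stand-ins invented for the example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Stack;

// Hypothetical stand-in for an engine that keeps a stack of resources.
class FontLookupSketch {

    // Simplified resources holder; the font map may be null when the
    // page or form defines no fonts at all.
    static class Resources {
        final Map<String, String> fonts;

        Resources(Map<String, String> fonts) {
            this.fonts = fonts;
        }
    }

    private final Stack<Resources> resourcesStack = new Stack<>();

    void push(Resources resources) {
        resourcesStack.push(resources);
    }

    // Returns an empty map instead of throwing when nothing is available.
    Map<String, String> getFonts() {
        if (resourcesStack.isEmpty()) {
            return Collections.emptyMap(); // avoids EmptyStackException
        }
        Resources top = resourcesStack.peek();
        if (top == null || top.fonts == null) {
            return Collections.emptyMap(); // avoids NullPointerException
        }
        return top.fonts;
    }

    // Returns null for an unknown font name; callers must check for that
    // before trying to show text with the font.
    String lookupFont(String name) {
        return getFonts().get(name);
    }

    public static void main(String[] args) {
        FontLookupSketch engine = new FontLookupSketch();
        // No resources pushed yet: the lookup yields null rather than throwing.
        System.out.println(engine.lookupFont("F1"));
    }
}
```

With a guard like this, a missing font becomes a null result that the text-showing operator can test for, rather than an exception thrown during painting.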

Resolved as a duplicate of PDFBOX-1333; applying that patch allows the document to be parsed with the standard parser.
                

        

[jira] [Commented] (PDFBOX-1331) Can't load any text when font is null

Posted by "philip huang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290694#comment-13290694 ] 

philip huang commented on PDFBOX-1331:
--------------------------------------

Thank you very much. The document is shown correctly in standard mode.
                

        

[jira] [Reopened] (PDFBOX-1331) Can't load any text when font is null

Posted by "philip huang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

philip huang reopened PDFBOX-1331:
----------------------------------


I don't think this document is broken. Adobe Reader and Foxit Reader both display it correctly.
I think the document was generated by a non-standard creator; it may include some new PDF features.


                

        

[jira] [Closed] (PDFBOX-1331) Can't load any text when font is null

Posted by "philip huang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

philip huang closed PDFBOX-1331.
--------------------------------

    

        

[jira] [Commented] (PDFBOX-1331) Can't load any text when font is null

Posted by "Timo Boehme (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290175#comment-13290175 ] 

Timo Boehme commented on PDFBOX-1331:
-------------------------------------

PDFStreamEngine.getFonts issue resolved in PDFBOX-1334
                