Posted to dev@pdfbox.apache.org by "Phil Varner (JIRA)" <ji...@apache.org> on 2010/01/05 21:57:54 UTC
[jira] Closed: (PDFBOX-590) PDFXrefStreamParser iterates when no elements are available

     [ https://issues.apache.org/jira/browse/PDFBOX-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phil Varner closed PDFBOX-590.
------------------------------

    Resolution: Fixed

Already fixed in trunk, but with no associated bug.

> PDFXrefStreamParser iterates when no elements are available
> -----------------------------------------------------------
>
>                 Key: PDFBOX-590
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-590
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: Phil Varner
>
> Exception:
> org.apache.pdfbox.exceptions.WrappedIOException
>         at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:237)
>         at mycompany....
>         at java.lang.Thread.run(Thread.java:534)
> Caused by: java.util.NoSuchElementException
>         at java.util.AbstractList$Itr.next(AbstractList.java:426)
>         at org.apache.pdfbox.pdfparser.PDFXrefStreamParser.parse(PDFXrefStreamParser.java:115)
>         at org.apache.pdfbox.cos.COSDocument.parseXrefStreams(COSDocument.java:538)
>         at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:203)
>         ... 11 more
> PDF file: www.oppenheim.pl/plpl/_download/09_05_11_Archiv.pdf
> This is happening in the PDFXrefStreamParser.parse() method because there is no objIter.hasNext() test to protect the objIter.next() call on line 115. This is an outright bug.
> Specifically, the current code looks like so:
> public void parse() throws IOException {
>     ...
>             Iterator objIter = objNums.iterator(); //<------- here we create the Iterator
>             /*
>              * Calculating the size of the line in bytes
>              */
>             int w0 = xrefFormat.getInt(0);
>             int w1 = xrefFormat.getInt(1);
>             int w2 = xrefFormat.getInt(2);
>             int lineSize = w0 + w1 + w2;
>             
>             while(pdfSource.available() > 0)
>             {
>                 byte[] currLine = new byte[lineSize];
>                 pdfSource.read(currLine);
>                 int type = 0;
>                 /*
>                  * Grabs the number of bytes specified for the first column in
>                  * the W array and stores it.
>                  */
>                 for(int i = 0; i < w0; i++)
>                 {
>                     type += (currLine[i] & 0x00ff) << ((w0 - i - 1)* 8);
>                 }
>                 //Need to remember the current objID
>                 Integer objID = (Integer)objIter.next(); //<---- here we attempt to pull objects out of it.
>                 /*
>                  * 3 different types of entries.
>                  */
>                 switch(type)
>                 {
>                     // ... do stuff ...
>                 }
>             }
>     ...
> }
> The code seems to assume that whenever pdfSource.available() > 0, the object-number iterator still has another element. That assumption is vulnerable to corrupt streams. It also looks like a logic error, because the stream seems to contain lines of other types that are not processed as Xref objects. At least that is what a cursory step-through suggests.
> I think it should be
>             while(pdfSource.available() > 0 && objIter.hasNext())
> instead, so that next() is only called while another object number is actually available.
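
The proposed guard can be tried in isolation with a minimal sketch. It is not the PDFBox code: the class XrefGuardSketch, the helpers readField and pairEntries, and the sample bytes are all hypothetical and only mirror the shape of the loop quoted above. A ByteArrayInputStream stands in for pdfSource, and the stream deliberately holds three fixed-width lines while only two object numbers are known.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the loop in PDFXrefStreamParser.parse(); not PDFBox code.
public class XrefGuardSketch {

    /** Decodes a big-endian unsigned field, the same way the loop over w0 does. */
    public static int readField(byte[] line, int offset, int width) {
        int value = 0;
        for (int i = 0; i < width; i++) {
            value += (line[offset + i] & 0xff) << ((width - i - 1) * 8);
        }
        return value;
    }

    /** Pairs each fixed-width line with an object number until either side runs out. */
    public static List<String> pairEntries(InputStream source, List<Integer> objNums,
                                           int w0, int lineSize) throws IOException {
        List<String> entries = new ArrayList<>();
        Iterator<Integer> objIter = objNums.iterator();
        // hasNext() is the proposed guard. Without it, next() throws
        // NoSuchElementException once the stream has more lines than object numbers.
        while (source.available() > 0 && objIter.hasNext()) {
            byte[] currLine = new byte[lineSize];
            // Assumption of this sketch only: stop on a truncated final line.
            if (source.read(currLine) < lineSize) {
                break;
            }
            int type = readField(currLine, 0, w0);
            Integer objID = objIter.next();
            entries.add(objID + ":" + type);
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Three 4-byte lines, but only two object numbers.
        byte[] data = {1, 0, 10, 0, 2, 0, 20, 0, 1, 0, 30, 0};
        List<Integer> objNums = Arrays.asList(7, 8);
        // prints [7:1, 8:2]
        System.out.println(pairEntries(new ByteArrayInputStream(data), objNums, 1, 4));
    }
}
```

With the guard, the third line is simply left unread. Dropping the hasNext() check from the while condition makes the same input fail with java.util.NoSuchElementException, as in the stack trace above. Whether silently ignoring the surplus lines is the right behavior for a corrupt file is a separate design question that this sketch does not settle.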
