Posted to dev@pdfbox.apache.org by "Peter Costello (JIRA)" <ji...@apache.org> on 2013/08/14 02:23:47 UTC

[jira] [Created] (PDFBOX-1694) Bug in org.apache.pdfbox.io.Ascii85InputStream

Peter Costello created PDFBOX-1694:
--------------------------------------

             Summary: Bug in org.apache.pdfbox.io.Ascii85InputStream
                 Key: PDFBOX-1694
                 URL: https://issues.apache.org/jira/browse/PDFBOX-1694
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 1.7.1
         Environment: Any
            Reporter: Peter Costello


The method 'org.apache.pdfbox.io.Ascii85InputStream.read()' has a bug when decoding a final group of characters that is incomplete, i.e. when the decoded length is not a multiple of 4 bytes.
Test file: "www.mzweb.com.br/grupobimbo/web/arquivos/Bimbo_Historia_20070409_Esp.pdf".
On page 0 there is a dictionary "323 0 obj << /Length 1492 /Filter [/Ascii85Decode /FlateDecode]>>".
The last group of characters to decode is "%f", i.e. the bytes 0x25, 0x66.
Ascii85InputStream pads this to "%f~!!", which does generate the correct final byte, 0x0f.
However, including the '~' end-of-data character in the padding is a major bug, since '~' is not a valid Ascii85 digit (the valid digits are '!' through 'u').
If the final padding were "%f!!!", the decoded final byte would be 0x0e, which is wrong.
The correct padding character is 'u', giving "%fuuu", which also decodes to 0x0f (see http://en.wikipedia.org/wiki/Ascii85).
This should be a quick fix.
The PDF files for corporate website "Grupo Bimbo" include lots of examples using Ascii85Decode/
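
The padding arithmetic above can be checked with a small standalone sketch. This is not the PDFBox code and not a proposed patch; the class name Ascii85Tail and the method decodeTail are hypothetical, introduced here only for illustration. It assumes the standard Ascii85 rule that a final group of n characters (2 to 4) is padded to 5 characters and yields n - 1 bytes, and it does no validation of the input characters.

```java
public final class Ascii85Tail {

    /**
     * Decodes a final, incomplete Ascii85 group of 2 to 4 characters by
     * padding it to 5 characters with the given pad character.
     * A group of n characters yields n - 1 bytes.
     * (Hypothetical helper for illustration; not part of PDFBox.)
     */
    static byte[] decodeTail(String tail, char pad) {
        int n = tail.length();
        if (n < 2 || n > 4) {
            throw new IllegalArgumentException("partial group must have 2 to 4 characters");
        }
        long value = 0;
        for (int i = 0; i < 5; i++) {
            char c = i < n ? tail.charAt(i) : pad;
            // Each character is a base-85 digit, offset by '!' (0x21).
            value = value * 85 + (c - '!');
        }
        // Keep only the n - 1 high-order bytes of the 32-bit value.
        byte[] out = new byte[n - 1];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) (value >>> (24 - 8 * i));
        }
        return out;
    }

    public static void main(String[] args) {
        // Padding with 'u' (the highest digit) gives the expected byte.
        System.out.printf("pad 'u': 0x%02x%n", decodeTail("%f", 'u')[0]); // 0x0f
        // Padding with '!' (the zero digit) truncates to the wrong byte.
        System.out.printf("pad '!': 0x%02x%n", decodeTail("%f", '!')[0]); // 0x0e
    }
}
```

Padding with 'u' fills the missing digits with their maximum value, so discarding the low-order bytes rounds back to the originally encoded byte; padding with '!' fills them with zeros and can leave the last kept byte one too low, as in the "%f" case here.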
