You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2014/08/04 21:40:11 UTC

[jira] [Resolved] (PDFBOX-1268) OutOfMemory Error because of huge colors

     [ https://issues.apache.org/jira/browse/PDFBOX-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Hewson resolved PDFBOX-1268.
---------------------------------

       Resolution: Fixed
    Fix Version/s: 2.0.0

The most colors in a colorspace is a DeviceN which has an implementation limit in Acrobat of 32 colors. I've added a sanity check for the colors as follows:

{code}
colors = Math.min(dict.getInt(COSName.COLORS, 32);
{code}

> OutOfMemory Error because of huge colors
> ----------------------------------------
>
>                 Key: PDFBOX-1268
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1268
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.6.0
>            Reporter: Christophe Vandeplas
>             Fix For: 2.0.0
>
>         Attachments: CVE-2009-3957 PDF 2009-09-21_RANDInfo.pdf
>
>
> Hi,
> Am 26.03.2012 07:42, schrieb Christophe Vandeplas:
>     Hello List,
>     I'm working on a PDF scanning tool and with a specific (malicious) PDF
>     I always get OutOfMemory Errors.
>     The backtrace is:
>     Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>            at org.apache.pdfbox.filter.FlateFilter.decodePredictor(FlateFilter.java:218)
>            at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:170)
>            at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:279)
>            at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:221)
>            at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:156)
>            at ScanPdf.checkCOSBaseObject(ScanPdf.java:199)
>             ...
>     When looking in the PDFBox code FlateFilter.java:218 is
>     byte[] lastline = new byte[rowlength];
>     In that contact rowlength = 1073741838   =>  seems rather big, no?
>     Looking back in the code it seems that it's colors who is so big.
>     Colors seems to be extracted from the dict in FlateFilter.java:96:
>     colors = dict.getInt(COSName.COLORS);
>     The (malicious) PDF has indeed the definition :    /Colors 1073741838
> Hmm, that sounds quite large, but the pdf spec describes the colors value as follows:
> "(May be used only if Predictor is greater than 1) The number of interleaved colour components per sample. Valid values are 1 to 4 (PDF 1.0) and 1 or greater (PDF 1.3). Default value: 1."
>     So my question is now:
>     Is this something I need to catch in my own code, or should PDFBox be
>     patched to catch such issues? (like the catched OutOfMemoryError in
>     FlateFilter:124)
> PDFBox should handle that. Please create an issue on JIRA [1] and attach the pdf in question.
>     Thanks for your expertise
>     Christophe
> BR
> Andreas Lehmkühler



--
This message was sent by Atlassian JIRA
(v6.2#6252)