Posted to dev@pdfbox.apache.org by "Carl Grundstrom (Jira)" <ji...@apache.org> on 2020/06/19 21:57:00 UTC

[jira] [Created] (PDFBOX-4894) Invalid file offsets for PDF files larger than 2G

Carl Grundstrom created PDFBOX-4894:
---------------------------------------

             Summary: Invalid file offsets for PDF files larger than 2G
                 Key: PDFBOX-4894
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4894
             Project: PDFBox
          Issue Type: Bug
          Components: Parsing
    Affects Versions: 2.0.20
         Environment: Linux
            Reporter: Carl Grundstrom
             Fix For: 2.0.21


A 32-bit int is being used to calculate file offsets for COS objects. This works fine for small PDF files, but breaks when the PDF file is larger than 2 GB. For many large files (136 out of 216 in my sample set), integer overflow produces negative file offsets for some of the COS objects. This results in an IOException being thrown in COSParser.java at line 728. Note that these negative offsets are not valid object stream references.

I have fixed the problem by modifying code in PDFXrefStreamParser.java starting at line 158.

Current code:

int offset = 0;
for(int i = 0; i < w1; i++)
{
  offset += (currLine[i + w0] & 0x00ff) << ((w1 - i - 1) * 8);
}

New code:

long offset = 0;
for(int i = 0; i < w1; i++)
{
  offset += ((long)(currLine[i + w0] & 0x00ff)) << ((w1 - i - 1) * 8);
}
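
For illustration, here is a minimal standalone sketch of the difference between the two loops. The class name XrefOffsetDemo, the method names, and the sample byte arrays are hypothetical and not part of PDFBox; only the loop bodies are taken from the snippets above. It assumes an xref stream entry with a 1-byte type field (w0 = 1) followed by a big-endian offset field of w1 bytes.

```java
// Hypothetical demo class, not part of PDFBox. Only the two loop bodies
// are copied from the current and proposed code above.
final class XrefOffsetDemo
{
    // Current behaviour: accumulator and shifted operand are both int.
    static int decodeAsInt(byte[] currLine, int w0, int w1)
    {
        int offset = 0;
        for (int i = 0; i < w1; i++)
        {
            offset += (currLine[i + w0] & 0x00ff) << ((w1 - i - 1) * 8);
        }
        return offset;
    }

    // Proposed behaviour: widen each byte to long before shifting.
    static long decodeAsLong(byte[] currLine, int w0, int w1)
    {
        long offset = 0;
        for (int i = 0; i < w1; i++)
        {
            offset += ((long) (currLine[i + w0] & 0x00ff)) << ((w1 - i - 1) * 8);
        }
        return offset;
    }

    public static void main(String[] args)
    {
        // Made-up entry: type byte, then a 4-byte offset of 2^31 (just past 2 GB).
        byte[] fourByte = {0x01, (byte) 0x80, 0x00, 0x00, 0x00};
        System.out.println(decodeAsInt(fourByte, 1, 4));   // prints -2147483648
        System.out.println(decodeAsLong(fourByte, 1, 4));  // prints 2147483648

        // Made-up entry: type byte, then a 5-byte offset of 2^32.
        // An int shift only uses the low 5 bits of the shift distance,
        // so "<< 32" behaves like "<< 0" in the int version.
        byte[] fiveByte = {0x01, 0x01, 0x00, 0x00, 0x00, 0x00};
        System.out.println(decodeAsInt(fiveByte, 1, 5));   // prints 1
        System.out.println(decodeAsLong(fiveByte, 1, 5));  // prints 4294967296
    }
}
```

The second case shows why the cast has to be applied before the shift rather than only changing the type of the accumulator: with an int operand, the shift itself already loses the high bits.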

I can submit a sample PDF file if desired (it will be more than 2 GB in size).


