Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2014/10/23 09:27:34 UTC

[jira] [Comment Edited] (PDFBOX-2441) Improve XRef self healing mechanism when more than one xref table

    [ https://issues.apache.org/jira/browse/PDFBOX-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180256#comment-14180256 ] 

Tilman Hausherr edited comment on PDFBOX-2441 at 10/23/14 7:26 AM:
-------------------------------------------------------------------

{quote}
I got a DataFormatException while parsing the object stream 69 0 R. Any ideas?
{quote}
Sorry, I just noticed that the file I attached doesn't display properly in Adobe Reader (AR).

Recently I rechecked many of the files that had such offset problems, and a few of them had new (different) problems, also in AR.

Here's what probably happened: these files were uploaded in ASCII mode. That caused no trouble for some older PDF files whose streams were encoded with ASCII85, but it does cause trouble for files like this one, which use FlateDecode.
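
To illustrate why a text-mode transfer would hurt Flate data, here is a minimal sketch. It assumes the transfer rewrote every LF as CR LF (the actual direction of the conversion is not known), and the class and method names are made up for this example; none of it is PDFBox code. java.util.zip.Inflater is the JDK class that raises DataFormatException.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class, not part of PDFBox.
class AsciiModeDemo {

    /** Compresses data into the zlib format that a /FlateDecode stream carries. */
    public static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Assumed stand-in for an ASCII-mode transfer: every LF becomes CR LF. */
    public static byte[] lfToCrLf(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : data) {
            if (b == '\n') {
                out.write('\r');
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    /** Returns true only if the bytes inflate back to exactly the expected data. */
    public static boolean inflatesTo(byte[] compressed, byte[] expected) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    return false; // stream ended early or is unusable
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            return false; // the kind of failure quoted above
        } finally {
            inflater.end();
        }
        return Arrays.equals(out.toByteArray(), expected);
    }

    public static void main(String[] args) {
        byte[] payload = new byte[4096];
        new Random(42).nextBytes(payload); // deterministic sample data
        byte[] compressed = deflate(payload);
        byte[] mangled = lfToCrLf(compressed);
        System.out.println("bytes added by the transfer: " + (mangled.length - compressed.length));
        System.out.println("original inflates: " + inflatesTo(compressed, payload));
        System.out.println("mangled inflates:  " + inflatesTo(mangled, payload));
    }
}
```

A single 0x0A byte anywhere in the compressed data is enough for the conversion to break the stream. ASCII85 output consists of printable characters only, and decoders are expected to skip whitespace, so the same conversion leaves such streams readable even though it still shifts the byte offsets.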


> Improve XRef self healing mechanism when more than one xref table
> -----------------------------------------------------------------
>
>                 Key: PDFBOX-2441
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2441
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.8.7, 1.8.8, 2.0.0
>            Reporter: Tilman Hausherr
>            Assignee: Andreas Lehmkühler
>             Fix For: 1.8.8, 2.0.0
>
>         Attachments: 260105.pdf
>
>
> This is a follow-up issue to PDFBOX-2250:
> {quote}
> the xref repair algorithm simply searches for the nearest offset, which may fail if more than one xref table is present
> ...
> Once we have a sample pdf which can't be parsed with the simple algorithm, we can open a new issue.
> {quote}
> And here's one:
> {code}
> Exception in thread "main" java.io.IOException: Error: Expected a long type at offset 1180, instead got '50/Filter/FlateDecode/DecodeParms'
>         at org.apache.pdfbox.pdfparser.BaseParser.readLong(BaseParser.java:1690)
> {code}
> That file does have more than one xref table.
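
For readers unfamiliar with the "nearest offset" idea quoted in the description, a rough sketch follows. It is only an illustration of the approach as described there, not PDFBox's actual repair code; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of PDFBox.
class NearestXrefDemo {

    /** Collects the offset of every standalone "xref" keyword, skipping "startxref". */
    public static List<Integer> findXrefOffsets(byte[] pdf) {
        // ISO-8859-1 maps each byte to exactly one char, so indexes are byte offsets.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        List<Integer> offsets = new ArrayList<>();
        int pos = text.indexOf("xref");
        while (pos >= 0) {
            boolean partOfStartxref = pos >= 5 && text.startsWith("start", pos - 5);
            if (!partOfStartxref) {
                offsets.add(pos);
            }
            pos = text.indexOf("xref", pos + 1);
        }
        return offsets;
    }

    /** Picks the candidate closest to the (possibly wrong) declared offset, or -1 if none. */
    public static int nearestOffset(List<Integer> candidates, int declared) {
        int best = -1;
        for (int c : candidates) {
            if (best < 0 || Math.abs(c - declared) < Math.abs(best - declared)) {
                best = c;
            }
        }
        return best;
    }
}
```

With a single table the nearest candidate is the only candidate, so the guess is safe. With two or more tables, a declared offset that has drifted far enough can land closer to the wrong table, and the parser then starts reading from a position that does not hold what it expects.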


