Posted to dev@pdfbox.apache.org by "Justin LeFebvre (JIRA)" <ji...@apache.org> on 2009/03/05 18:13:57 UTC
[jira] Commented: (PDFBOX-435) Handling of trailers
[ https://issues.apache.org/jira/browse/PDFBOX-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679250#action_12679250 ]
Justin LeFebvre commented on PDFBOX-435:
----------------------------------------
I tried the proposed fix on the file in question and it works. I also ran the regression test with the fix applied, and it passed.
> Handling of trailers
> --------------------
>
> Key: PDFBOX-435
> URL: https://issues.apache.org/jira/browse/PDFBOX-435
> Project: PDFBox
> Issue Type: Improvement
> Components: Parsing
> Affects Versions: 0.8.0-incubator
> Reporter: Anonymous
> Priority: Minor
> Attachments: trailerNoEOL.pdf
>
>
> Some PDF-generating tools seem to produce invalid trailers, yet the resulting files can still be displayed, e.g. by Acrobat Reader.
> Therefore, it would be nice if PDFBox could also process these documents.
> Example 1 (no EOL after "trailer", as generated by "ScanSoft PDF Create! 4"; an example file is attached):
> trailer<</Root 4 0 R/Info 1 0 R/Size 10/Prev 2979/ID[<00000000000000000000000000000000><215eab4c095713feb4cdbb15a9eba968>]>>
> Example 2 (no EOL, just a blank after "trailer"; I cannot publish my example):
> trailer <<
> /Size 26
> /Root 24 0 R
> /Info 25 0 R
> /ID[<98fc28410100000042090000d1d1a606><98fc28410100000042090000d1d1a606>]
> >>:
> Here is a fix proposal:
> private boolean parseTrailer() throws IOException
> {
>     if(pdfSource.peek() != 't'){
>         return false;
>     }
>     //read "trailer"
>     String nextLine = readLine();
>     if( !nextLine.equals( "trailer" ) ) {
>         // fix for example no 1 and no 2
>         // in some cases the EOL is missing and the trailer immediately continues with "<<" or with a blank character
>         // even if this does not comply with PDF reference we want to support as many PDFs as possible
>         // Acrobat reader can also deal with this.
>         if (nextLine.startsWith("trailer"))
>         {
>             byte[] b = nextLine.getBytes();
>             int len = "trailer".length();
>             pdfSource.unread(b, len, b.length-len);
>         }
>         else
>         {
>             return false;
>         }
>     }
>     // fix for example no2
>     // in some cases the EOL is missing and the trailer continues with " <<"
>     // even if this does not comply with PDF reference we want to support as many PDFs as possible
>     // Acrobat reader can also deal with this.
>     skipSpaces();
>
>     COSDictionary parsedTrailer = parseCOSDictionary();
>     COSDictionary docTrailer = document.getTrailer();
>     if( docTrailer == null )
>     {
>         document.setTrailer( parsedTrailer );
>     }
>     else
>     {
>         docTrailer.addAll( parsedTrailer );
>     }
>     skipSpaces();
>     return true;
> }
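
For anyone who wants to try the push-back idea from the proposal outside of PDFBox, here is a minimal standalone sketch. It is not the PDFBox parser: the class TrailerKeywordSketch and its methods readLine, consumeTrailerKeyword and remainderAfterKeyword are hypothetical names made up for illustration, and the sketch assumes the source is a plain java.io.PushbackInputStream whose push-back buffer is large enough to hold the rest of the line. Unlike the quoted proposal, it converts the line back to bytes with an explicit ISO-8859-1 charset instead of the platform default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical standalone sketch; none of these names come from PDFBox.
final class TrailerKeywordSketch {

    private static final String KEYWORD = "trailer";

    // Simplified stand-in for a parser's readLine(): reads up to the next
    // CR, LF or CRLF, consumes the line ending, and returns the line.
    static String readLine(PushbackInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n' && c != '\r') {
            sb.append((char) c); // bytes 0..255 map 1:1 to ISO-8859-1 chars
        }
        if (c == '\r') {
            int next = in.read();
            if (next != -1 && next != '\n') {
                in.unread(next); // lone CR: the next byte belongs to the next line
            }
        }
        return sb.toString();
    }

    // Returns true if the line starts with "trailer". Anything following the
    // keyword on the same line is pushed back so the caller can parse it.
    static boolean consumeTrailerKeyword(PushbackInputStream in) throws IOException {
        String line = readLine(in);
        if (line.equals(KEYWORD)) {
            return true; // well-formed case: keyword alone on its line
        }
        if (!line.startsWith(KEYWORD)) {
            return false; // note: this sketch does not restore the consumed line
        }
        byte[] b = line.getBytes(StandardCharsets.ISO_8859_1);
        int len = KEYWORD.length();
        in.unread(b, len, b.length - len); // push back "<<..." or " <<..."
        return true;
    }

    // Test helper: returns what a dictionary parser would see next,
    // or null if the input does not start with the trailer keyword.
    static String remainderAfterKeyword(String fragment) {
        try {
            byte[] data = fragment.getBytes(StandardCharsets.ISO_8859_1);
            PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(data), data.length + 1);
            if (!consumeTrailerKeyword(in)) {
                return null;
            }
            int c;
            while ((c = in.read()) != -1 && Character.isWhitespace(c)) {
                // skip blanks and line breaks before the dictionary
            }
            ByteArrayOutputStream rest = new ByteArrayOutputStream();
            for (; c != -1; c = in.read()) {
                rest.write(c);
            }
            return new String(rest.toByteArray(), StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(remainderAfterKeyword("trailer\n<</Size 10>>"));
        System.out.println(remainderAfterKeyword("trailer<</Size 10>>\n"));
        System.out.println(remainderAfterKeyword("trailer <<\n/Size 26\n>>"));
    }
}
```

One thing the sketch makes visible: because readLine() consumes the line ending before the remainder is pushed back, the EOL that followed "<<" in example 2 is not restored. That should be harmless when whitespace between dictionary tokens is insignificant, but a very long single-line trailer needs a push-back buffer at least as large as the rest of the line, otherwise unread() throws an IOException.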
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.