Posted to dev@pdfbox.apache.org by "Justin LeFebvre (JIRA)" <ji...@apache.org> on 2009/03/05 18:13:57 UTC

[jira] Commented: (PDFBOX-435) Handling of trailers

    [ https://issues.apache.org/jira/browse/PDFBOX-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679250#action_12679250 ] 

Justin LeFebvre commented on PDFBOX-435:
----------------------------------------

I tried the proposed fix on the file in question, and it works. I also ran the regression tests with the fix applied, and they passed as well.

> Handling of trailers
> --------------------
>
>                 Key: PDFBOX-435
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-435
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>            Reporter: Anonymous
>            Priority: Minor
>         Attachments: trailerNoEOL.pdf
>
>
> Some PDF-generating tools seem to produce invalid trailers, yet the resulting documents can still be displayed, e.g. by Acrobat Reader.
> It would therefore be nice if PDFBox could process these documents as well.
> Example 1 (no EOL after "trailer", as generated by "ScanSoft PDF Create! 4"; an example file is attached):
> trailer<</Root 4 0 R/Info 1 0 R/Size 10/Prev 2979/ID[<00000000000000000000000000000000><215eab4c095713feb4cdbb15a9eba968>]>>
> Example 2 (no EOL, just a blank after "trailer"; I cannot publish my example file):
> trailer <<
> /Size 26
> /Root 24 0 R
> /Info 25 0 R
> /ID[<98fc28410100000042090000d1d1a606><98fc28410100000042090000d1d1a606>]
> >>
> Here is a fix proposal:
>     private boolean parseTrailer() throws IOException
>     {
>         if( pdfSource.peek() != 't' )
>         {
>             return false;
>         }
>         // read "trailer"
>         String nextLine = readLine();
>         if( !nextLine.equals( "trailer" ) )
>         {
>             // Fix for examples 1 and 2:
>             // in some cases the EOL is missing and the trailer keyword is immediately
>             // followed by "<<" or by a blank character. This does not comply with the
>             // PDF reference, but we want to support as many PDFs as possible, and
>             // Acrobat Reader can deal with it too.
>             if( nextLine.startsWith( "trailer" ) )
>             {
>                 // Push everything after the keyword back so the dictionary parser sees it.
>                 // An explicit single-byte charset keeps the result independent of the
>                 // platform default encoding.
>                 byte[] b = nextLine.getBytes( "ISO-8859-1" );
>                 int len = "trailer".length();
>                 pdfSource.unread( b, len, b.length - len );
>             }
>             else
>             {
>                 return false;
>             }
>         }
>         // Fix for example 2: skip the blank left between "trailer" and "<<".
>         skipSpaces();
>
>         COSDictionary parsedTrailer = parseCOSDictionary();
>         COSDictionary docTrailer = document.getTrailer();
>         if( docTrailer == null )
>         {
>             document.setTrailer( parsedTrailer );
>         }
>         else
>         {
>             // merge this trailer into the one already stored on the document
>             docTrailer.addAll( parsedTrailer );
>         }
>         skipSpaces();
>         return true;
>     }
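
For anyone who wants to try the idea outside of PDFBox, here is a minimal stand-alone sketch of the same push-back technique. It is not PDFBox code: the class LenientTrailerSketch and all of its methods are hypothetical names made up for illustration, and java.io.PushbackInputStream stands in for pdfSource. As in the proposal above, the line break consumed by readLine is not restored, which should be harmless as long as the next token is the "<<" delimiter.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch of lenient "trailer" keyword handling; not PDFBox code.
class LenientTrailerSketch {

    private static final String KEYWORD = "trailer";

    // Reads up to the next CR, LF or CRLF (consuming it) and returns the bytes as Latin-1 text.
    static String readLine(PushbackInputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\r' && c != '\n') {
            line.append((char) c);
        }
        if (c == '\r') {
            int next = in.read();
            if (next != -1 && next != '\n') {
                in.unread(next); // lone CR: the byte after it belongs to the next line
            }
        }
        return line.toString();
    }

    // Skips blanks, tabs and line breaks and stops in front of the first other byte.
    static void skipSpaces(PushbackInputStream in) throws IOException {
        int c = in.read();
        while (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
            c = in.read();
        }
        if (c != -1) {
            in.unread(c);
        }
    }

    // Consumes the keyword line. Returns false if the line does not start with "trailer"
    // (the line stays consumed in that case, as in the proposal). Otherwise whatever
    // followed the keyword on the same line is pushed back and leading whitespace skipped.
    static boolean consumeTrailerKeyword(PushbackInputStream in) throws IOException {
        String line = readLine(in);
        if (!line.startsWith(KEYWORD)) {
            return false;
        }
        byte[] rest = line.substring(KEYWORD.length()).getBytes(StandardCharsets.ISO_8859_1);
        in.unread(rest);
        skipSpaces(in);
        return true;
    }

    // Demo helper: returns what a dictionary parser would see next, or null if no trailer keyword.
    static String restAfterTrailerKeyword(String pdfFragment) {
        byte[] data = pdfFragment.getBytes(StandardCharsets.ISO_8859_1);
        // The push-back buffer must be able to hold the longest line we may unread.
        int capacity = Math.max(1, data.length);
        try (PushbackInputStream in =
                 new PushbackInputStream(new ByteArrayInputStream(data), capacity)) {
            if (!consumeTrailerKeyword(in)) {
                return null;
            }
            StringBuilder rest = new StringBuilder();
            for (int c = in.read(); c != -1; c = in.read()) {
                rest.append((char) c);
            }
            return rest.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling restAfterTrailerKeyword with a fragment shaped like example 1 or example 2 should leave the stream positioned at "<<", the same as for a conforming trailer where the keyword sits on its own line. One thing the sketch makes visible is that the push-back buffer has to be large enough for the remainder of the line; whether pdfSource in PDFBox guarantees that for long single-line trailers is something I have not checked.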

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.