Posted to dev@pdfbox.apache.org by "Brian Carrier (JIRA)" <ji...@apache.org> on 2009/04/08 15:43:13 UTC

[jira] Resolved: (PDFBOX-435) Handling of trailers

     [ https://issues.apache.org/jira/browse/PDFBOX-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Carrier resolved PDFBOX-435.
----------------------------------

    Resolution: Fixed

Checked into trunk.

Sending        trunk/src/main/java/org/apache/pdfbox/pdfparser/PDFParser.java
Transmitting file data .
Committed revision 763243.

> Handling of trailers
> --------------------
>
>                 Key: PDFBOX-435
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-435
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>            Reporter: Anonymous
>            Priority: Minor
>         Attachments: trailerNoEOL.pdf
>
>
> Some PDF-generating tools produce invalid trailers, yet the resulting documents can still be displayed, e.g. by Acrobat Reader.
> It would therefore be nice if PDFBox could process these documents as well.
> Example 1 (no EOL after "trailer", as generated by "ScanSoft PDF Create! 4"; an example file is attached):
> trailer<</Root 4 0 R/Info 1 0 R/Size 10/Prev 2979/ID[<00000000000000000000000000000000><215eab4c095713feb4cdbb15a9eba968>]>>
> Example 2 (no EOL, just a blank after "trailer"; I cannot publish my example file):
> trailer <<
> /Size 26
> /Root 24 0 R
> /Info 25 0 R
> /ID[<98fc28410100000042090000d1d1a606><98fc28410100000042090000d1d1a606>]
> >>
> Here is a fix proposal:
>     private boolean parseTrailer() throws IOException
>     {
>         if( pdfSource.peek() != 't' )
>         {
>             return false;
>         }
>         // read "trailer"
>         String nextLine = readLine();
>         if( !nextLine.equals( "trailer" ) )
>         {
>             // Fix for examples 1 and 2: in some cases the EOL is missing and the
>             // trailer immediately continues with "<<" or with a blank character.
>             // This does not comply with the PDF reference, but we want to support
>             // as many PDFs as possible, and Acrobat Reader can deal with it too.
>             if( nextLine.startsWith( "trailer" ) )
>             {
>                 // Push everything after the keyword back so the dictionary can be parsed.
>                 // An explicit ISO-8859-1 encoding avoids depending on the platform default charset.
>                 byte[] b = nextLine.getBytes( "ISO-8859-1" );
>                 int len = "trailer".length();
>                 pdfSource.unread( b, len, b.length - len );
>             }
>             else
>             {
>                 return false;
>             }
>         }
>         // Fix for example 2: after the unread above, the input may start with
>         // " <<", so skip the blank before parsing the dictionary.
>         skipSpaces();
>         
>         COSDictionary parsedTrailer = parseCOSDictionary();
>         COSDictionary docTrailer = document.getTrailer();
>         if( docTrailer == null )
>         {
>             document.setTrailer( parsedTrailer );
>         }
>         else
>         {
>             docTrailer.addAll( parsedTrailer );
>         }
>         skipSpaces();
>         return true;
>     }
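
For readers without a PDFBox checkout, the keyword handling described in the proposal can be tried in isolation. The following is only a minimal sketch that uses java.io.PushbackInputStream in place of PDFBox's pdfSource. The class LenientTrailerKeyword and its methods are hypothetical names made up for this illustration and are not part of PDFBox. It accepts "trailer" followed by an EOL, a blank, or "<<" directly, and leaves the stream positioned at the start of the dictionary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of PDFBox.
class LenientTrailerKeyword {
    private static final byte[] KEYWORD = "trailer".getBytes(StandardCharsets.ISO_8859_1);

    // White-space characters as listed in the PDF reference.
    static boolean isPdfWhitespace(int c) {
        return c == 0x00 || c == 0x09 || c == 0x0A || c == 0x0C || c == 0x0D || c == 0x20;
    }

    // Consumes "trailer" and any white space after it. If the keyword is not
    // there, the bytes read so far are pushed back and false is returned.
    // The stream needs a pushback buffer of at least KEYWORD.length bytes.
    static boolean skipTrailerKeyword(PushbackInputStream in) throws IOException {
        byte[] buf = new byte[KEYWORD.length];
        int n = 0;
        while (n < buf.length) {
            int r = in.read(buf, n, buf.length - n);
            if (r < 0) {
                break; // end of input before the keyword was complete
            }
            n += r;
        }
        if (n != KEYWORD.length || !Arrays.equals(buf, KEYWORD)) {
            in.unread(buf, 0, n);
            return false;
        }
        // Tolerate EOL, blanks, or nothing at all between keyword and "<<".
        int c;
        while ((c = in.read()) != -1) {
            if (!isPdfWhitespace(c)) {
                in.unread(c);
                break;
            }
        }
        return true;
    }

    // Convenience wrapper: returns what follows the keyword, or null if absent.
    static String remainder(String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        PushbackInputStream in =
            new PushbackInputStream(new ByteArrayInputStream(bytes), KEYWORD.length);
        if (!skipTrailerKeyword(in)) {
            return null;
        }
        ByteArrayOutputStream rest = new ByteArrayOutputStream();
        int c;
        while ((c = in.read()) != -1) {
            rest.write(c);
        }
        return new String(rest.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(remainder("trailer<</Root 4 0 R/Size 10>>"));
        System.out.println(remainder("trailer <<\n/Size 26\n>>"));
    }
}
```

Unlike the readLine()/unread() approach in the proposal, this sketch compares the keyword bytes directly, so the EOL after the rest of the line is never consumed and no string-to-byte conversion is needed.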
