Posted to dev@pdfbox.apache.org by "Brian Carrier (JIRA)" <ji...@apache.org> on 2009/04/08 15:43:13 UTC
[jira] Resolved: (PDFBOX-435) Handling of trailers
[ https://issues.apache.org/jira/browse/PDFBOX-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Carrier resolved PDFBOX-435.
----------------------------------
Resolution: Fixed
Checked into trunk.
Sending trunk/src/main/java/org/apache/pdfbox/pdfparser/PDFParser.java
Transmitting file data .
Committed revision 763243.
> Handling of trailers
> --------------------
>
> Key: PDFBOX-435
> URL: https://issues.apache.org/jira/browse/PDFBOX-435
> Project: PDFBox
> Issue Type: Improvement
> Components: Parsing
> Affects Versions: 0.8.0-incubator
> Reporter: Anonymous
> Priority: Minor
> Attachments: trailerNoEOL.pdf
>
>
> Some PDF-generating tools produce invalid trailers, yet the resulting documents can still be displayed, e.g. by Acrobat Reader.
> It would therefore be nice if PDFBox could process these documents as well.
> Example 1 (no EOL after "trailer", as generated by "ScanSoft PDF Create! 4"; an example file is attached):
> trailer<</Root 4 0 R/Info 1 0 R/Size 10/Prev 2979/ID[<00000000000000000000000000000000><215eab4c095713feb4cdbb15a9eba968>]>>
> Example 2 (no EOL, just a space after "trailer"; I cannot publish my example file):
> trailer <<
> /Size 26
> /Root 24 0 R
> /Info 25 0 R
> /ID[<98fc28410100000042090000d1d1a606><98fc28410100000042090000d1d1a606>]
> >>:
> Here is a fix proposal:
> private boolean parseTrailer() throws IOException
> {
>     if( pdfSource.peek() != 't' )
>     {
>         return false;
>     }
>     // read "trailer"
>     String nextLine = readLine();
>     if( !nextLine.equals( "trailer" ) )
>     {
>         // Fix for examples 1 and 2:
>         // in some cases the EOL is missing and the trailer continues immediately
>         // with "<<" or with a space character. This does not comply with the
>         // PDF reference, but we want to support as many PDFs as possible, and
>         // Acrobat Reader can also deal with it.
>         if( nextLine.startsWith( "trailer" ) )
>         {
>             // Push everything after the keyword back onto the stream.
>             // ISO-8859-1 maps each char to exactly one byte; the platform
>             // default charset might not.
>             byte[] b = nextLine.getBytes( "ISO-8859-1" );
>             int len = "trailer".length();
>             pdfSource.unread( b, len, b.length - len );
>         }
>         else
>         {
>             return false;
>         }
>     }
>     // Fix for example 2:
>     // the pushed-back remainder may start with a space (" <<"), so skip it
>     // before parsing the dictionary.
>     skipSpaces();
>
>     COSDictionary parsedTrailer = parseCOSDictionary();
>     COSDictionary docTrailer = document.getTrailer();
>     if( docTrailer == null )
>     {
>         document.setTrailer( parsedTrailer );
>     }
>     else
>     {
>         docTrailer.addAll( parsedTrailer );
>     }
>     skipSpaces();
>     return true;
> }
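The push-back step in the proposal above can be tried in isolation. The following is only a minimal sketch, not the code committed to PDFParser.java: the class TrailerSketch and its methods readLine, consumeTrailerKeyword and remainderAfterTrailer are hypothetical names, and java.io.PushbackInputStream stands in for the parser's pdfSource. It assumes the push-back buffer is large enough to hold the rest of the line and that readLine consumes a single EOL byte.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch of the tolerant "trailer" keyword handling.
class TrailerSketch {
    private static final String KEYWORD = "trailer";

    // Reads up to the next CR or LF (consuming that one byte) or to end of stream.
    static String readLine(PushbackInputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int c;
        while ((c = in.read()) != -1 && c != '\n' && c != '\r') {
            buf.write(c);
        }
        return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Returns true if the stream starts with "trailer"; afterwards the stream is
    // positioned at the first non-whitespace byte following the keyword.
    static boolean consumeTrailerKeyword(PushbackInputStream in) throws IOException {
        int first = in.read();
        if (first == -1) {
            return false;
        }
        in.unread(first); // emulate peek()
        if (first != 't') {
            return false;
        }
        String line = readLine(in);
        if (!line.startsWith(KEYWORD)) {
            return false; // note: the line has been consumed at this point
        }
        if (line.length() > KEYWORD.length()) {
            // EOL missing after the keyword: push the rest of the line back.
            byte[] b = line.getBytes(StandardCharsets.ISO_8859_1);
            in.unread(b, KEYWORD.length(), b.length - KEYWORD.length());
        }
        // Skip spaces, tabs and EOL bytes before the dictionary.
        int c;
        do {
            c = in.read();
        } while (c == ' ' || c == '\t' || c == '\r' || c == '\n');
        if (c != -1) {
            in.unread(c);
        }
        return true;
    }

    // Convenience wrapper: returns what follows the keyword, or null if absent.
    static String remainderAfterTrailer(String input) {
        byte[] bytes = input.getBytes(StandardCharsets.ISO_8859_1);
        PushbackInputStream in = new PushbackInputStream(
                new ByteArrayInputStream(bytes), Math.max(1, bytes.length));
        try {
            if (!consumeTrailerKeyword(in)) {
                return null;
            }
            ByteArrayOutputStream rest = new ByteArrayOutputStream();
            int c;
            while ((c = in.read()) != -1) {
                rest.write(c);
            }
            return new String(rest.toByteArray(), StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(remainderAfterTrailer("trailer<</Size 10>>\n"));
        System.out.println(remainderAfterTrailer("trailer <<\n/Size 26\n>>\n"));
        System.out.println(remainderAfterTrailer("xref\n"));
    }
}
```

Because readLine has already consumed the EOL, the pushed-back remainder no longer contains it. That is harmless here, since the dictionary parser only needs the bytes starting at "<<".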
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.