Posted to users@pdfbox.apache.org by Johan G <jo...@comcast.net> on 2012/12/05 16:08:38 UTC

Re: How to get: isOptimized, hasCompressedStreamObjects, allowsCommenting ?

Hi, I am still hoping for help on whether PDFBox can determine these attributes. Thanks again.
----- Original Message -----
| Hi All,
| How can I determine in Java if a PDF file matches these requirements?
| • The file is optimized for Fast Web Viewing.
| • The file does not contain compressed stream objects.
| • The Document restrictions allow Commenting. // Am using
| ap.canModifyAnnotations()

Re: How to get: isOptimized, hasCompressedStreamObjects, allowsCommenting ?

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Hi,


On 11.12.2012 at 20:14, Johan G wrote:
> Oops, trying again but with Q's to start my questions just in case. --- Maruan, thank you. Re:
>
> a) Fast Web View / Linearization .... you can look up the existence of a linearization parameter dictionary which must exist within
> the first 1024 bytes of the pdf. Q: Any further help for how to code that would be deeply appreciated.
>
> Re: b) Compressed stream objects ... PDFBox does deal with compressed stream objects just fine.. (no indicator is set) if a
> compressed stream object is encountered. Q: And for this too please, anyone? Can you somehow parse and then set your own flag if
> you encounter a compressed stream object during that parsing? I have spent time looking for code examples and have also tried
> some code snippets. But have not yet been able to accomplish either of the above and may be well off onto the wrong track.
>
> Thanks again for any consideration. Johan in Seattle
> ----- Original Message -----
> | From: "Andreas Lehmkuehler"
> | Hi,
> | your reply was hard to read as any formatting was missing ...

Your reply still doesn't contain any line breaks, which makes it hard to read.

Here is my former answer:

For every compressed stream, one or more filters (FlateFilter, DCTFilter, etc.) are
defined, see [1]. [2] demonstrates how to remove all filters.

BR
Andreas Lehmkühler

[1] 
http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/common/PDStream.java
[2] 
http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/WriteDecodedDoc.java 


Re: How to get: isOptimized, hasCompressedStreamObjects, allowsCommenting ?

Posted by Johan G <jo...@comcast.net>.
Oops, trying again but with Q's to start my questions just in case. --- Maruan, thank you. Re: a) Fast Web View / Linearization .... you can look up the existence of a linearization parameter dictionary which must exist within the first 1024 bytes of the pdf. Q: Any further help for how to code that would be deeply appreciated. Re: b) Compressed stream objects ... PDFBox does deal with compressed stream objects just fine.. (no indicator is set) if a compressed stream object is encountered. Q: And for this too please, anyone? Can you somehow parse and then set your own flag if you encounter a compressed stream object during that parsing? I have spent time looking for code examples and have also tried some code snippets. But have not yet been able to accomplish either of the above and may be well off onto the wrong track. Thanks again for any consideration. Johan in Seattle ----- Original Message -----
| From: "Andreas Lehmkuehler"
| Hi,
| your reply was hard to read as any formatting was missing ...

Re: How to get: isOptimized, hasCompressedStreamObjects, allowsCommenting ?

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Hi,

Your reply was hard to read, as all formatting was missing ...

On 11.12.2012 at 18:06, Johan G wrote:
> Maruan, thank you. Re:

> a) Fast Web View / Linearization .... you can look up the existence of a linearization parameter dictionary which must exist within
> the first 1024 bytes of the pdf. Any further help for how to code that would be deeply appreciated .

> Re: b) Compressed stream objects ... PDFBox does deal with compressed stream objects just fine.. (no indicator is set) if a
> compressed stream object is encountered. And for this too please, anyone? Can you somehow parse and then set your own flag if
> you encounter a compressed stream object during that parsing? I have spent time looking for code examples and have also tried
> some code snippets. But have not yet been able to accomplish either of the above and may be well off onto the wrong track.
For every compressed stream, one or more filters (FlateFilter, DCTFilter, etc.) are
defined, see [1]. [2] demonstrates how to remove all filters.


> Thanks again for any consideration. Johan in Seattle ----- Original Message -----
 > SNIP

BR
Andreas Lehmkühler

[1] 
http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/common/PDStream.java
[2] 
http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/WriteDecodedDoc.java

Re: How to get: isOptimized, hasCompressedStreamObjects, allowsCommenting ?

Posted by Johan G <jo...@comcast.net>.
Maruan, thank you. Re: a) Fast Web View / Linearization .... you can look up the existence of a linearization parameter dictionary which must exist within the first 1024 bytes of the pdf. Any further help for how to code that would be deeply appreciated . Re: b) Compressed stream objects ... PDFBox does deal with compressed stream objects just fine.. (no indicator is set) if a compressed stream object is encountered. And for this too please, anyone? Can you somehow parse and then set your own flag if you encounter a compressed stream object during that parsing? I have spent time looking for code examples and have also tried some code snippets. But have not yet been able to accomplish either of the above and may be well off onto the wrong track. Thanks again for any consideration. Johan in Seattle ----- Original Message -----
| From: "Maruan Sahyoun" <sa...@fileaffairs.de>
| To: users@pdfbox.apache.org
| Sent: Wednesday, December 5, 2012 7:35:25 AM
| Subject: Re: How to get: isOptimized, hasCompressedStreamObjects,
| allowsCommenting ?
| Hi Johan,
| to the best of my knowledge
| a) Fast Web View / Linearization
| PDFBox doesn't have specific code for linearization so something like
| isLinearized() or isFastWebView() is not available. In order to find
| out for yourself if a pdf is linearized you can look up the existence
| of a linearization parameter dictionary which must exist within the
| first 1024 bytes of the pdf. The layout is like this:
| << /Linearized 1.0 % Version
| /L 54567 % File length
| /H [475 598] % Primary hint stream offset and length (part 5)
| /O 45 % Object number of first page’s page object (part 6)
| /E 5437 % Offset of end of first page
| /N 11 % Number of pages in document
| /T 52786 % Offset of first entry in main cross-reference table (part 11)
| >>
| The complete spec can be found in Annex F of ISO-32000
| b) Compressed stream objects
| PDFBox does deal with compressed stream objects just fine.
| Unfortunately there is currently no document info I know of which is
| set if a compressed stream object is encountered.
| c) Commenting
| Please review
| http://pdfbox.apache.org/apidocs/org/apache/pdfbox/pdmodel/PDDocument.html#getCurrentAccessPermission%28%29
| and
| http://pdfbox.apache.org/apidocs/org/apache/pdfbox/pdmodel/encryption/AccessPermission.html
| With kind regards
| Maruan Sahyoun
| > Hi, am still hoping for help on if PDFBox can determine these
| > attributes. Thanks again ----- Original Message -----
| > | Hi All,
| > | How can I determine in Java if a PDF file matches these
| > | requirements?
| > | • The file is optimized for Fast Web Viewing.
| > | • The file does not contain compressed stream objects.
| > | • The Document restrictions allow Commenting. // Am using
| > | ap.canModifyAnnotations()

Re: How to get: isOptimized, hasCompressedStreamObjects, allowsCommenting ?

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi Johan,

to the best of my knowledge

a) Fast Web View / Linearization
PDFBox doesn't have specific code for linearization, so a method like isLinearized() or isFastWebView() is not available. To find out for yourself whether a PDF is linearized, you can check for a linearization parameter dictionary, which must appear within the first 1024 bytes of the file. The layout is like this:

<< /Linearized 1.0 % Version
	/L 54567 % File length
	/H [475 598]  % Primary hint stream offset and length (part 5)
	/O 45 % Object number of first page’s page object (part 6)
	/E 5437 % Offset of end of first page
	/N 11 % Number of pages in document 
	/T 52786 % Offset of first entry in main cross-reference table (part 11) 
>> 

The complete specification can be found in Annex F of ISO 32000.
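
The lookup described above can be sketched in plain Java, without PDFBox. This is a minimal sketch, not a full validation: it only tests whether the /Linearized key appears in the first 1024 bytes. The class and method names (LinearizationCheck, looksLinearized) are hypothetical and not part of any library. As far as I understand Annex F, a file whose /L value no longer matches the actual file length (for example after an incremental update) should not be treated as linearized, and this sketch does not check that.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper, not part of PDFBox. */
final class LinearizationCheck {

    /** Size of the window in which the linearization dictionary must appear. */
    private static final int HEADER_WINDOW = 1024;

    /** Returns true if the /Linearized key occurs in the first 1024 bytes. */
    static boolean looksLinearized(byte[] fileStart) {
        int limit = Math.min(fileStart.length, HEADER_WINDOW);
        // ISO-8859-1 maps every byte to exactly one char, so binary data is harmless here.
        String head = new String(fileStart, 0, limit, StandardCharsets.ISO_8859_1);
        return head.contains("/Linearized");
    }

    /** Reads at most the first 1024 bytes of the file and applies the check (Java 9+). */
    static boolean looksLinearized(Path pdf) throws IOException {
        try (InputStream in = Files.newInputStream(pdf)) {
            byte[] buf = new byte[HEADER_WINDOW];
            int n = in.readNBytes(buf, 0, buf.length);
            return looksLinearized(Arrays.copyOf(buf, n));
        }
    }
}
```

Usage would be something like LinearizationCheck.looksLinearized(java.nio.file.Paths.get("sample.pdf")), where sample.pdf is a placeholder file name.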


b) Compressed stream objects
PDFBox handles compressed stream objects just fine. Unfortunately, as far as I know, no document-level flag is currently set when a compressed stream object is encountered.
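
Since no flag is available, one rough workaround is to scan the raw file bytes yourself. The sketch below assumes that "compressed stream objects" means object streams in the PDF 1.5 sense, whose dictionaries carry /Type /ObjStm; if the question is about streams with filters, this does not apply. It is a heuristic only: the byte sequence could in principle also occur inside unrelated data, and a name written with hex escapes would be missed. The names ObjectStreamCheck and mentionsObjectStream are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of PDFBox. */
final class ObjectStreamCheck {

    /** Name used as the /Type value of object stream dictionaries. */
    private static final byte[] MARKER = "/ObjStm".getBytes(StandardCharsets.ISO_8859_1);

    /** Returns true if the byte sequence "/ObjStm" occurs anywhere in the file content. */
    static boolean mentionsObjectStream(byte[] pdfBytes) {
        outer:
        for (int i = 0; i <= pdfBytes.length - MARKER.length; i++) {
            for (int j = 0; j < MARKER.length; j++) {
                if (pdfBytes[i + j] != MARKER[j]) {
                    continue outer; // mismatch, try the next start position
                }
            }
            return true; // all marker bytes matched at position i
        }
        return false;
    }
}
```

The input can be obtained with java.nio.file.Files.readAllBytes(path); for very large files a streaming scan would be the better choice.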

c) Commenting
Please review
http://pdfbox.apache.org/apidocs/org/apache/pdfbox/pdmodel/PDDocument.html#getCurrentAccessPermission%28%29 and 
http://pdfbox.apache.org/apidocs/org/apache/pdfbox/pdmodel/encryption/AccessPermission.html
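
For background, the commenting permission comes down to a single bit in the /P entry of the encryption dictionary: in the standard security handler, bit 6 (counting from 1 at the low-order end) covers adding or modifying annotations. The sketch below shows that bit test in plain Java. It assumes that AccessPermission.canModifyAnnotations() reflects this same bit, and it is not PDFBox source code; the names PermissionBits and allowsCommenting are hypothetical. A document without encryption has no /P entry and therefore no such restriction.

```java
/** Hypothetical helper illustrating the permission-bit test; not PDFBox source code. */
final class PermissionBits {

    /** 1-based position of the "add or modify annotations" bit in the /P value. */
    private static final int MODIFY_ANNOTATIONS_BIT = 6;

    /** Returns true if the given /P value permits adding or modifying annotations. */
    static boolean allowsCommenting(int permissions) {
        int mask = 1 << (MODIFY_ANNOTATIONS_BIT - 1); // bit 6 has the value 32
        return (permissions & mask) != 0;
    }
}
```

For example, a /P value of -1 has every bit set and therefore allows commenting, while -33 has only bit 6 cleared and does not.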

With kind regards

Maruan Sahyoun


> Hi, am still hoping for help on if PDFBox can determine these attributes. Thanks again ----- Original Message -----
> | Hi All,
> | How can I determine in Java if a PDF file matches these requirements?
> | • The file is optimized for Fast Web Viewing.
> | • The file does not contain compressed stream objects.
> | • The Document restrictions allow Commenting. // Am using
> | ap.canModifyAnnotations()