Posted to dev@pdfbox.apache.org by "Peter Nordquist (JIRA)" <ji...@apache.org> on 2011/01/31 18:44:28 UTC
[jira] Created: (PDFBOX-953) PDFBox fails to ExtractText from Adobe
Acrobat X 256-bit AES encrypted documents
PDFBox fails to ExtractText from Adobe Acrobat X 256-bit AES encrypted documents
--------------------------------------------------------------------------------
Key: PDFBOX-953
URL: https://issues.apache.org/jira/browse/PDFBOX-953
Project: PDFBox
Issue Type: Bug
Affects Versions: 1.4.0, 1.3.1
Environment: Java: jdk1.6.0_20
OS: Windows 7, RHEL 5.5
Reporter: Peter Nordquist
From the command-line version of PDFBox, this exception is printed:
ExtractText failed with the following exception:
java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.computeEncryptedKey(StandardSecurityHandler.java:571)
at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.computeUserPassword(StandardSecurityHandler.java:608)
at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.isUserPassword(StandardSecurityHandler.java:792)
at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.decryptDocument(StandardSecurityHandler.java:189)
at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1091)
at org.apache.pdfbox.ExtractText.main(ExtractText.java:190)
at org.apache.pdfbox.PDFBox.main(PDFBox.java:42)
The document I was using was encrypted using Adobe Acrobat X Pro and has only Page Extraction disabled inside of it. It was encrypted only with a permissions password.
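One plausible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the key derivation hashes its input with MD5, which always yields 16 bytes, and then copies as many bytes as the declared key length. For a 256-bit key that would be 32 bytes, more than the digest holds, so System.arraycopy fails. The sketch below reproduces only that failure mode. The class KeyCopySketch and the method truncateDigest are hypothetical names and are not PDFBox code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class KeyCopySketch {

    // Hypothetical helper: hash the input with MD5 and keep the first
    // keyLengthBytes bytes of the digest as the key.
    static byte[] truncateDigest(byte[] input, int keyLengthBytes) {
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("MD5").digest(input); // always 16 bytes
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
        byte[] key = new byte[keyLengthBytes];
        // Throws when keyLengthBytes is larger than the 16-byte digest.
        System.arraycopy(digest, 0, key, 0, keyLengthBytes);
        return key;
    }

    public static void main(String[] args) {
        byte[] password = "changeit".getBytes(StandardCharsets.ISO_8859_1);

        // A 128-bit key (16 bytes) fits inside the MD5 digest.
        System.out.println("128-bit key length: " + truncateDigest(password, 16).length);

        // A 256-bit key (32 bytes) does not, so the copy fails.
        try {
            truncateDigest(password, 32);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("256-bit copy failed: " + e.getClass().getName());
        }
    }
}
```

If this reading is right, the exception is a symptom of the handler treating the document as an older revision, not of a corrupt file.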
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (PDFBOX-953) PDFBox fails to ExtractText from
Adobe Acrobat X 256-bit AES encrypted documents
Posted by "Martijn Brinkers (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997168#comment-12997168 ]
Martijn Brinkers commented on PDFBOX-953:
-----------------------------------------
The encryption revision of the document is 6. According to this posting (http://forums.adobe.com/thread/763902?tstart=0), this revision is not yet documented, at least not publicly. We will have to wait until it has been documented.
[jira] [Commented] (PDFBOX-953) PDFBox fails to ExtractText from
Adobe Acrobat X 256-bit AES encrypted documents
Posted by "F. Schmitt (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221921#comment-13221921 ]
F. Schmitt commented on PDFBOX-953:
-----------------------------------
The documentation can be found here:
http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/adobe_supplement_iso32000.pdf
(Chapter 3.5 - Encryption)
The revision of the standard security handler was extended to number 5.
I would also like to see this feature, or at least an exception stating that revision 5 is not yet supported.
Thanks.
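A minimal sketch of the kind of check requested here, assuming the handler already knows the revision number from the encryption dictionary. The class RevisionGuardSketch, the method checkRevision, and the constant MAX_SUPPORTED_REVISION (set to 4 purely for illustration) are hypothetical and do not reflect the actual PDFBox API.

```java
public class RevisionGuardSketch {

    // Highest security handler revision this hypothetical handler implements.
    // The value 4 is an assumption made for illustration only.
    static final int MAX_SUPPORTED_REVISION = 4;

    // Fail early with a descriptive message instead of letting the key
    // derivation run into an array bounds error later on.
    static void checkRevision(int revision) {
        if (revision > MAX_SUPPORTED_REVISION) {
            throw new UnsupportedOperationException(
                "Standard security handler revision " + revision
                + " is not supported (highest supported revision: "
                + MAX_SUPPORTED_REVISION + ")");
        }
    }

    public static void main(String[] args) {
        checkRevision(4); // accepted, returns normally
        try {
            checkRevision(5);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

An unchecked exception keeps the sketch short. A real handler would more likely raise whatever checked exception its decryption path already declares.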
[jira] [Commented] (PDFBOX-953) PDFBox fails to ExtractText from
Adobe Acrobat X 256-bit AES encrypted documents
Posted by "Ralf Hauser (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502595#comment-13502595 ]
Ralf Hauser commented on PDFBOX-953:
------------------------------------
See also PDFBOX-135 and PDFBOX-1450.
[jira] Updated: (PDFBOX-953) PDFBox fails to ExtractText from Adobe
Acrobat X 256-bit AES encrypted documents
Posted by "Peter Nordquist (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Nordquist updated PDFBOX-953:
-----------------------------------
Attachment: lorem-ipsum-256AES.pdf
Attached an example PDF containing generated text. The permissions password is 'changeit'.
[jira] Updated: (PDFBOX-953) PDFBox fails to ExtractText from Adobe
Acrobat X 256-bit AES encrypted documents
Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andreas Lehmkühler updated PDFBOX-953:
--------------------------------------
Issue Type: New Feature (was: Bug)
It's not a bug; it's a missing feature.
[jira] Commented: (PDFBOX-953) PDFBox fails to ExtractText from
Adobe Acrobat X 256-bit AES encrypted documents
Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988833#comment-12988833 ]
Andreas Lehmkühler commented on PDFBOX-953:
-------------------------------------------
That feature seems to be so new that I can't even open the document using Acrobat Reader 9.4.1.
[jira] Issue Comment Edited: (PDFBOX-953) PDFBox fails to
ExtractText from Adobe Acrobat X 256-bit AES encrypted documents
Posted by "Peter Nordquist (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989293#comment-12989293 ]
Peter Nordquist edited comment on PDFBOX-953 at 2/1/11 6:02 PM:
----------------------------------------------------------------
Yes, sorry I didn't put that in the original description, but when securing this PDF via Adobe Acrobat X Pro it does say that it can only be opened by Adobe Acrobat X and later.
Tested with:
Mac OS X 10.6.6 with Adobe Reader X 10.0.0
Windows 7 Enterprise 64-bit with Adobe Acrobat X Pro 10.0.0
[jira] Commented: (PDFBOX-953) PDFBox fails to ExtractText from
Adobe Acrobat X 256-bit AES encrypted documents
Posted by "Peter Nordquist (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989293#comment-12989293 ]
Peter Nordquist commented on PDFBOX-953:
----------------------------------------
Yes, sorry I didn't put that in the original description, but when securing this PDF via Adobe Acrobat X Pro it does say that it can only be opened by Adobe Acrobat X and later.