Posted to dev@pdfbox.apache.org by "Adam Nichols (JIRA)" <ji...@apache.org> on 2009/09/01 02:23:32 UTC

[jira] Commented: (PDFBOX-510) Should be able to extract text from "Owner password" protected pdf file without specifying "owner password"?

    [ https://issues.apache.org/jira/browse/PDFBOX-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749700#action_12749700 ] 

Adam Nichols commented on PDFBOX-510:
-------------------------------------

OK, now I understand.  The security settings say you're not allowed to extract text, but there's no encryption or anything to actually stop you from doing so since it is intended to be readable by the user.  So it seems that when there's no user-password, the "security" (in terms of extracting text) is basically just the honor system.  It makes sense that if we can read it without a password, we can extract the text without a password.

On one hand, respecting the flag gives people a false sense of security; on the other, ignoring it probably deviates from the PDF spec.  So I'll leave the decision as to whether the security flag should be respected to someone else.
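
To make the "honor system" point concrete, here is a minimal sketch of the check a reader would perform. It assumes the file's permission integer (the /P entry of the standard security handler) has already been read, and it relies on the PDF spec's definition of bit 5 of that integer as the "copy or otherwise extract text and graphics" permission. The class PermissionBits, its methods, and the respectPermissions switch are hypothetical names for illustration, not PDFBox API, and the sample /P value is just an example with that bit cleared.

```java
// Hypothetical helper, NOT part of PDFBox: interprets the /P permission
// integer of a PDF's standard security handler.
final class PermissionBits {
    // Bit 5 (1-based) of /P, mask 0x10: copy or otherwise extract text and graphics.
    static final int EXTRACT_BIT = 1 << 4;

    /** Returns true if the /P flags grant text extraction. */
    static boolean canExtractText(int p) {
        return (p & EXTRACT_BIT) != 0;
    }

    /**
     * The policy question raised in this issue: a tool that respects the
     * flag refuses when the bit is cleared; a tool that ignores it
     * (for example, for auditing) extracts regardless.
     */
    static boolean shouldExtract(int p, boolean respectPermissions) {
        return !respectPermissions || canExtractText(p);
    }

    public static void main(String[] args) {
        int p = -3904; // example /P value with the extract bit cleared
        System.out.println("flag allows extraction: " + canExtractText(p));
        System.out.println("tool ignoring the flag extracts: " + shouldExtract(p, false));
        System.out.println("tool respecting the flag extracts: " + shouldExtract(p, true));
    }
}
```

Nothing in this check is enforced by the file itself; whether extraction happens depends entirely on which value the tool chooses for respectPermissions.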

> Should be able to extract text from "Owner password" protected pdf file without specifying "owner password"?
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-510
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-510
>             Project: PDFBox
>          Issue Type: Wish
>          Components: Text extraction
>    Affects Versions: 0.8.0-incubator
>            Reporter: Takashi Komatsubara
>         Attachments: Extract_Text_from_OwnerPasswordProtected.patch, PDF1.3-STD40BIT_ownerpass.pdf
>
>
> Hi team,
> Technically, we can extract text from an "owner password" protected PDF file
> without specifying the "owner" password, right?
> Should we be able to do that or not?
> The reason I'm asking is that I am using PDFBox to audit the content
> of PDF files.
> So, whether or not the user wants the "text extract" permission disabled,
> I need to look into the content of the "owner password" protected PDF file.
> The old PDFBox could do this.
> What do you think?
> Takashi
>  
