Posted to users@pdfbox.apache.org by Roberto Nibali <rn...@gmail.com> on 2015/08/27 00:58:57 UTC

Remove document information from PDF

Hi

How can I remove the document information part from a PDF programmatically?
Example:

22 0 obj
<< /Creator (sdjlkadld)
   /Producer (ldksa ls dkalksdj alksd as dlaks dlaskd laskd  sdlak dl)
   /Title (skls  lsdk sd lskd sld)
   /Author (iwej wej wek elwe)
   /Subject (ad lskd jlakd jlaksd l)
   /Keywords (saldkjas dlaks dlas dlaskd laskdlas dls dalsk dk)
>>
endobj
trailer
<< /Size 24
   /Root 23 0 R
   /Info 22 0 R
>>

The above only shows the relevant (obfuscated) portion of the PDF, for the
sake of clarity.

Best regards
Roberto
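
For readers unfamiliar with the structure quoted above: the trailer's /Info
entry is an indirect reference ("22 0 R") to the object that holds the
document information dictionary. The following is only a minimal sketch,
using plain Java and no PDF library, of how that reference could be located in
the text of a classic (non-xref-stream) trailer. The class and method names
(InfoRefFinder, findInfoRef) are hypothetical and not part of PDFBox; a real
tool should use a proper PDF parser instead.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds the object a classic PDF trailer's /Info entry points to. */
final class InfoRefFinder {
    // Matches an indirect reference such as "/Info 22 0 R".
    private static final Pattern INFO_REF =
            Pattern.compile("/Info\\s+(\\d+)\\s+(\\d+)\\s+R");

    /**
     * Returns {objectNumber, generation} of the first /Info reference found
     * after the last "trailer" keyword, or empty if there is none.
     * Assumption: the trailer is stored as plain text, not in an xref stream.
     */
    static Optional<int[]> findInfoRef(String pdfText) {
        int trailerAt = pdfText.lastIndexOf("trailer");
        if (trailerAt < 0) {
            return Optional.empty();
        }
        Matcher m = INFO_REF.matcher(pdfText);
        if (!m.find(trailerAt)) {
            return Optional.empty();
        }
        return Optional.of(new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2))
        });
    }

    public static void main(String[] args) {
        // Shortened version of the obfuscated example from the question.
        String sample = "22 0 obj\n<< /Creator (x) >>\nendobj\n"
                + "trailer\n<< /Size 24\n   /Root 23 0 R\n   /Info 22 0 R\n>>\n";
        findInfoRef(sample).ifPresent(ref ->
                System.out.println("Info dictionary is object " + ref[0] + " " + ref[1]));
        // prints "Info dictionary is object 22 0"
    }
}
```

The sketch only reads the reference. Clearing the entries themselves is
better left to a PDF library, since the file's cross-reference table has to
stay consistent when objects change.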

Re: Remove document information from PDF

Posted by Roberto Nibali <rn...@gmail.com>.

Hi Tilman

On Thu, Aug 27, 2015 at 4:19 PM, Tilman Hausherr <TH...@t-online.de>
wrote:

> On 27.08.2015 at 13:43, Roberto Nibali wrote:
>
>> Hi Tilman
>>
>>
>> On Thu, Aug 27, 2015 at 1:21 AM, Tilman Hausherr <TH...@t-online.de>
>> wrote:
>>
>>> See the AddMetadataFromDocInfo.java from the examples
>>>
>>>
>>> PDDocumentCatalog catalog = document.getDocumentCatalog();
>>> PDDocumentInformation info = document.getDocumentInformation();
>>>
>>>
>>> you can set stuff... and the example shows you how to do the same for the
>>> XMP meta data.
>>>
>>> See also the ExtractMetadata.java example.
>>>
>>>
>> Thanks for your valuable input. Last night I was puzzled by your answer;
>> after sleeping on it, I realized what you meant. I solved it as follows:
>>
>> private void stripInfo(PDDocument srcDoc) {
>>      PDDocumentInformation docInfo = srcDoc.getDocumentInformation();
>>      docInfo.setAuthor(null);
>>      docInfo.setCreationDate(null);
>>      docInfo.setCreator(null);
>>      docInfo.setKeywords(null);
>>      docInfo.setModificationDate(null);
>>      docInfo.setProducer(null);
>>      docInfo.setSubject(null);
>>      docInfo.setTitle(null);
>>      docInfo.setTrapped(null);
>> }
>>
>> This is almost how you would supposedly do it with iText:
>>
>> HashMap<String, String> info = super.reader.getInfo();
>> info.put("Title", null);
>> info.put("Author", null);
>> info.put("Subject", null);
>> info.put("Keywords", null);
>> info.put("Creator", null);
>> info.put("Producer", null);
>> info.put("CreationDate", null);
>> info.put("ModDate", null);
>> info.put("Trapped", null);
>> stamper.setMoreInfo(info);
>>
>>
>> Best regards
>> Roberto
>>
>>
>
> Yes but be aware that the XMP metadata (open the PDF with an editor and
> search for "XMP") may also have personal information.


Sure. I already wrote some code to detect it. The PDFs to deal with do not
have XMP sections.

Thanks for your continuous valuable help!

Cheers
Roberto
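
As an illustration of the "open the PDF with an editor and search for XMP"
advice, here is a rough sketch of an automated version of that search. It is
not the detection code mentioned above, which was not posted; the class and
method names (XmpSniffer, looksLikeItHasXmp) are made up for this example. It
scans the raw bytes for the usual XMP packet markers, so it can miss metadata
that is compressed, encrypted, or stored in another text encoding. Treat a
negative result as a hint, not proof.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper: crude check for XMP packet markers in raw PDF bytes. */
final class XmpSniffer {
    // Markers that normally open an XMP packet and its root element.
    private static final String[] MARKERS = { "<?xpacket", "<x:xmpmeta" };

    /** Returns true if any XMP marker appears in the uncompressed file bytes. */
    static boolean looksLikeItHasXmp(byte[] pdfBytes) {
        // ISO-8859-1 maps each byte to exactly one char, so binary
        // stream data cannot cause decoding errors.
        String text = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        for (String marker : MARKERS) {
            if (text.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java XmpSniffer.java <file.pdf>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(looksLikeItHasXmp(bytes)
                ? "XMP markers found"
                : "no XMP markers found in plain bytes");
    }
}
```

A library-based check, for example asking the document catalog whether it has
a metadata stream, would be more reliable because it also covers catalogs
stored in compressed object streams.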

Re: Remove document information from PDF

Posted by Tilman Hausherr <TH...@t-online.de>.

On 27.08.2015 at 13:43, Roberto Nibali wrote:
> Hi Tilman
>
>
> On Thu, Aug 27, 2015 at 1:21 AM, Tilman Hausherr <TH...@t-online.de>
> wrote:
>
>> See the AddMetadataFromDocInfo.java from the examples
>>
>>
>> PDDocumentCatalog catalog = document.getDocumentCatalog();
>> PDDocumentInformation info = document.getDocumentInformation();
>>
>>
>> you can set stuff... and the example shows you how to do the same for the
>> XMP meta data.
>>
>> See also the ExtractMetadata.java example.
>>
>>
> Thanks for your valuable input. Last night I was puzzled by your answer;
> after sleeping on it, I realized what you meant. I solved it as follows:
>
> private void stripInfo(PDDocument srcDoc) {
>      PDDocumentInformation docInfo = srcDoc.getDocumentInformation();
>      docInfo.setAuthor(null);
>      docInfo.setCreationDate(null);
>      docInfo.setCreator(null);
>      docInfo.setKeywords(null);
>      docInfo.setModificationDate(null);
>      docInfo.setProducer(null);
>      docInfo.setSubject(null);
>      docInfo.setTitle(null);
>      docInfo.setTrapped(null);
> }
>
> This is almost how you would supposedly do it with iText:
>
> HashMap<String, String> info = super.reader.getInfo();
> info.put("Title", null);
> info.put("Author", null);
> info.put("Subject", null);
> info.put("Keywords", null);
> info.put("Creator", null);
> info.put("Producer", null);
> info.put("CreationDate", null);
> info.put("ModDate", null);
> info.put("Trapped", null);
> stamper.setMoreInfo(info);
>
>
> Best regards
> Roberto
>


Yes but be aware that the XMP metadata (open the PDF with an editor and 
search for "XMP") may also have personal information.

Tilman



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org

Re: Remove document information from PDF

Posted by Roberto Nibali <rn...@gmail.com>.

Hi Tilman


On Thu, Aug 27, 2015 at 1:21 AM, Tilman Hausherr <TH...@t-online.de>
wrote:

> See the AddMetadataFromDocInfo.java from the examples
>
>
> PDDocumentCatalog catalog = document.getDocumentCatalog();
> PDDocumentInformation info = document.getDocumentInformation();
>
>
> you can set stuff... and the example shows you how to do the same for the
> XMP meta data.
>
> See also the ExtractMetadata.java example.
>
>
Thanks for your valuable input. Last night I was puzzled by your answer;
after sleeping on it, I realized what you meant. I solved it as follows:

private void stripInfo(PDDocument srcDoc) {
    PDDocumentInformation docInfo = srcDoc.getDocumentInformation();
    docInfo.setAuthor(null);
    docInfo.setCreationDate(null);
    docInfo.setCreator(null);
    docInfo.setKeywords(null);
    docInfo.setModificationDate(null);
    docInfo.setProducer(null);
    docInfo.setSubject(null);
    docInfo.setTitle(null);
    docInfo.setTrapped(null);
}

This is almost how you would supposedly do it with iText:

HashMap<String, String> info = super.reader.getInfo();
info.put("Title", null);
info.put("Author", null);
info.put("Subject", null);
info.put("Keywords", null);
info.put("Creator", null);
info.put("Producer", null);
info.put("CreationDate", null);
info.put("ModDate", null);
info.put("Trapped", null);
stamper.setMoreInfo(info);


Best regards
Roberto

Re: Remove document information from PDF

Posted by Tilman Hausherr <TH...@t-online.de>.

See the AddMetadataFromDocInfo.java from the examples


PDDocumentCatalog catalog = document.getDocumentCatalog();
PDDocumentInformation info = document.getDocumentInformation();


You can set the fields on that object, and the example shows you how to do
the same for the XMP metadata.

See also the ExtractMetadata.java example.

Tilman

On 27.08.2015 at 00:58, Roberto Nibali wrote:
> Hi
>
> How can I remove the document information part from a PDF programmatically?
> Example:
>
> 22 0 obj
> << /Creator (sdjlkadld)
>     /Producer (ldksa ls dkalksdj alksd as dlaks dlaskd laskd  sdlak dl)
>     /Title (skls  lsdk sd lskd sld)
>     /Author (iwej wej wek elwe)
>     /Subject (ad lskd jlakd jlaksd l)
>     /Keywords (saldkjas dlaks dlas dlaskd laskdlas dls dalsk dk)
> >>
> endobj
> trailer
> << /Size 24
>     /Root 23 0 R
>     /Info 22 0 R
> >>
>
> The above only shows the relevant (obfuscated) portion of the PDF, for the
> sake of clarity.
>
> Best regards
> Roberto
>

