Posted to dev@pdfbox.apache.org by "Thomas Chojecki (JIRA)" <ji...@apache.org> on 2014/01/08 11:41:50 UTC

[jira] [Updated] (PDFBOX-1614) Digitally sign PDFs without file system access

     [ https://issues.apache.org/jira/browse/PDFBOX-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Chojecki updated PDFBOX-1614:
------------------------------------

    Component/s: Signing

Last year I had a conversation with Ross Woolf, which I would like to append here to inform the watchers about upcoming changes related to this issue.

I think some people are waiting for this solution. The problem is that an InputStream can be read only once, or else it needs to be buffered in memory. For larger documents this isn't the best approach, so I plan to change the way PDFBox parses and signs documents so that it no longer needs hacks like the ones we currently use in the saveIncremental method.

So I will fix it in two patch waves. The first will change the FileInputStream to an InputStream and do the necessary caching inside PDFBox. The second will be the suggestion I mentioned in the mail at the bottom.
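
To illustrate the read-once limitation, here is a minimal sketch of buffering a stream in memory so that it can be read twice. It uses only the JDK; the class name StreamBufferSketch and the helper readFully are made up for this example and are not part of PDFBox, and this is not necessarily how the caching inside PDFBox will be implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamBufferSketch {

    /** Hypothetical helper: drains the stream into a byte array (the whole document ends up in memory). */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the incoming stream, e.g. from a web service.
        InputStream incoming = new ByteArrayInputStream(new byte[] {1, 2, 3, 4});

        byte[] cached = readFully(incoming);

        // Two independent views over the same cached bytes:
        // one could be used for parsing, one for the digest pass.
        InputStream forParsing = new ByteArrayInputStream(cached);
        InputStream forDigest = new ByteArrayInputStream(cached);

        System.out.println(forParsing.read() == forDigest.read());
    }
}
```

The cost is that memory use grows with the document size, which is why this is not ideal for larger documents.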

--------------------------------------------------------------------------------

> At present I have a web service where a document can be submitted to
> the system.  The web service uses MTOM as the document transport for
> the web services.  The code that fronts this for me hands me the
> received document as an InputStream.  Originally I passed this
> stream, along with other data supplied in the web service to my
> document management engine, which then persists the document within
> the DMS.  With the introduction of signing PDFs, once I get the
> InputStream from the web service, I now have to save the document to
> a temporary file on disk, do the signing, and then present it to the
> DMS engine as a stream so that it can now persist it within the DMS.

I have an idea how to solve this.

Parsing the document a second time is only needed for digest creation.
For this purpose, a DigestInputStream can be used as a wrapper when
parsing the document. After this, saveIncremental needs to be changed
to something like doc.saveIncremental(MessageDigest, OutputStream):

MessageDigest md = MessageDigest.getInstance("SHA-256");
InputStream is = null; // This is the incoming stream (dummy)
DigestInputStream dis = new DigestInputStream(is, md);
PDDocument doc = PDDocument.load(dis);
doc.saveIncremental(md, os); // os is the outgoing stream

The DigestInputStream uses the MessageDigest object to create the digest
for the stream. In saveIncremental the same MessageDigest will continue
digesting the bytes from the incremental update. So inside PDFBox,
md.digest() can be used to obtain the needed digest.
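
The digest continuation described here can be tried out with the JDK alone. The following is only a sketch under assumptions: the class DigestContinuationSketch and its methods are invented for illustration, plain byte arrays stand in for the original document and the incremental update, and no PDFBox API is involved. It also digests every byte, whereas a real PDF signature excludes the signature contents placeholder from the digested byte range, so an actual implementation would need more care.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class DigestContinuationSketch {

    /**
     * Copies `in` to `out` through a DigestInputStream, then appends `update`
     * and feeds it into the same MessageDigest before finishing the digest.
     */
    static byte[] digestWhileReading(InputStream in, byte[] update, OutputStream out)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        DigestInputStream dis = new DigestInputStream(in, md);
        byte[] buf = new byte[8192];
        int n;
        while ((n = dis.read(buf)) != -1) {
            out.write(buf, 0, n); // bytes read here have already been digested
        }
        md.update(update);        // continue with the "incremental update" bytes
        out.write(update);
        return md.digest();
    }

    /** Reference digest over the complete byte sequence in one call. */
    static byte[] digestInOnePass(byte[] all) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-256").digest(all);
    }

    public static void main(String[] args) throws Exception {
        byte[] original = "original document bytes".getBytes(StandardCharsets.US_ASCII);
        byte[] update = " plus incremental update".getBytes(StandardCharsets.US_ASCII);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] streamed = digestWhileReading(new ByteArrayInputStream(original), update, out);
        byte[] reference = digestInOnePass(out.toByteArray());

        // Both digests cover the same bytes, so they should be identical.
        System.out.println(Arrays.equals(streamed, reference));
    }
}
```

The point of the design is that the incoming stream is consumed exactly once: the digest is built up as a side effect of reading, so no second pass over the document is required.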

This change/improvement would be a larger one, but it would clean up the
code. I could try to implement it for the upcoming PDFBox 2.0.0 release.

> I would love to be able to omit the part of having to save the temp
> file during the signing and simply have my process consume the stream
> and then pass it on to the DMS engine.  I'm new to working with
> PDF/PDFBox, so I'm just starting to learn what I can and can not do
> with PDF files etc.  From your comment this might be problematic due
> to file sizes.  At least I have it working this way for now and I
> appreciate what you provided to the community to make it possible.

I know your problem, and the limitations of streams make it hard to
create an easy solution for PDF signing.

I plan to create a small GitHub project for a signing implementation
based on PDFBox and hope to get help creating something that everyone
can use for signing PDFs. At the moment every user needs to write their
own implementation or use the example to do it.

> Digitally sign PDFs without file system access
> ----------------------------------------------
>
>                 Key: PDFBOX-1614
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1614
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: PDModel, Signing, Writing
>    Affects Versions: 1.8.1
>            Reporter: Thierry Boschat
>            Assignee: Thomas Chojecki
>
> Hi, I'm using pdfbox-1.8.1 to digitally sign PDFs.
> I found the sample below to handle it.
> But in this example I have to use a FileInputStream; however, I want to do it only through streams (without any file system access). I tried to extend FileInputStream to deal with it, but I failed. Any tips for me about that problem?
> Thanks.
> File outputDocument = new File("resources/signed" + document.getName());
>     FileInputStream fis = new FileInputStream(document);
>     FileOutputStream fos = new FileOutputStream(outputDocument);
>     int c;
>     while ((c = fis.read(buffer)) != -1)
>     {
>       fos.write(buffer, 0, c);
>     }
>     fis.close();
>     fis = new FileInputStream(outputDocument);
>     // load document
>     PDDocument doc = PDDocument.load(document);
>     // create signature dictionary
>     PDSignature signature = new PDSignature();
>     signature.setFilter(PDSignature.FILTER_ADOBE_PPKLITE); // default filter
>     // subfilter for basic and PAdES Part 2 signatures
>     signature.setSubFilter(PDSignature.SUBFILTER_ADBE_PKCS7_DETACHED);
>     signature.setName("signer name");
>     signature.setLocation("signer location");
>     signature.setReason("reason for signature");
>     // the signing date, needed for valid signature
>     signature.setSignDate(Calendar.getInstance());
>     // register signature dictionary and sign interface
>     doc.addSignature(signature, this);
>     // write incremental (only for signing purpose)
>     doc.saveIncremental(fis, fos);



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)