Posted to dev@pdfbox.apache.org by "Cornelis Hoeflake (JIRA)" <ji...@apache.org> on 2016/03/21 18:25:25 UTC

[jira] [Created] (PDFBOX-3280) PDDocument.importPage does not deep clone source page

Cornelis Hoeflake created PDFBOX-3280:
-----------------------------------------

             Summary: PDDocument.importPage does not deep clone source page
                 Key: PDFBOX-3280
                 URL: https://issues.apache.org/jira/browse/PDFBOX-3280
             Project: PDFBox
          Issue Type: Bug
          Components: PDModel
    Affects Versions: 2.0.0
            Reporter: Cornelis Hoeflake


The method PDDocument.importPage does not deep clone the source page. This causes two issues. First, closing the source document BEFORE saving the target document throws an "already closed" exception.

Placing the close after saving the target document works fine. However, splitting a document into many small documents and then saving those documents from multiple threads causes random exceptions such as ArrayIndexOutOfBounds, "COSStream closed", etc.
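
To make the first issue concrete, here is a minimal sketch in plain Java of why a shallow import fails once the source is closed. It does not use PDFBox and is not a description of its internals: the classes Backing, Page and Doc and the methods importShallow and importDeep are hypothetical stand-ins that only model a page sharing a closable backing store with its source document.

```java
import java.util.ArrayList;
import java.util.List;

public class ShallowImportSketch {

    /** Hypothetical closable backing store (NOT a PDFBox class). */
    static final class Backing {
        private final String data;
        private boolean closed;

        Backing(String data) { this.data = data; }

        String read() {
            if (closed) {
                throw new IllegalStateException("backing store already closed");
            }
            return data;
        }

        void close() { closed = true; }
    }

    /** Hypothetical page that only references its backing store. */
    static final class Page {
        final Backing backing;

        Page(Backing backing) { this.backing = backing; }
    }

    /** Hypothetical document that closes the backing stores of its pages. */
    static final class Doc {
        final List<Page> pages = new ArrayList<>();

        // Shallow import: a new Page object that shares the source's backing store.
        void importShallow(Page p) { pages.add(new Page(p.backing)); }

        // Deep import: the content is copied into a backing store of its own.
        void importDeep(Page p) { pages.add(new Page(new Backing(p.backing.read()))); }

        // "Saving" reads every page, so it fails if any backing store is closed.
        String save() {
            StringBuilder sb = new StringBuilder();
            for (Page p : pages) {
                sb.append(p.backing.read());
            }
            return sb.toString();
        }

        void close() {
            for (Page p : pages) {
                p.backing.close();
            }
        }
    }

    /** Returns true if the document can be saved without an exception. */
    static boolean saveSucceeds(Doc d) {
        try {
            d.save();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Doc source = new Doc();
        source.pages.add(new Page(new Backing("page-0")));

        Doc shallow = new Doc();
        shallow.importShallow(source.pages.get(0));

        Doc deep = new Doc();
        deep.importDeep(source.pages.get(0));

        // Close the source BEFORE saving the targets.
        source.close();

        System.out.println("shallow save ok: " + saveSucceeds(shallow)); // false
        System.out.println("deep save ok: " + saveSucceeds(deep));       // true
    }
}
```

In this model only the deep copy survives closing the source, which matches the observation above that moving the close to after the save avoids the exception.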

See, for example, the following code. I have attached the source document used.

        PDDocument doc = new PDDocument();
        PDDocument load = PDDocument.load(new File(SOURCE_DOC));

        for (int p = 0; p<1000; p++) {
            doc.importPage(load.getPage(0));
        }

        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        doc.save(baos);
        doc.close();
        load.close();

        final PDDocument doc2 = PDDocument.load(baos.toByteArray());
        // OK, now we have a big document, loaded the way it normally would be.
        ExecutorService es = Executors.newFixedThreadPool(4);

        // Lists is Guava's helper; docs is final because the Runnable below captures it.
        final List<PDDocument> docs = Lists.newArrayList();
        for (int p = 0; p<doc2.getNumberOfPages(); p++) {
            final PDDocument newDoc = new PDDocument();
            newDoc.importPage(doc2.getPage(p));
            docs.add(newDoc);
        }
        for (int p = 0; p<doc2.getNumberOfPages(); p++) {
            final int page = p;
            es.submit(new Runnable() {
                @Override
                public void run() {
                    try {
                        PDDocument newDoc = docs.get(page);
                        newDoc.save(new ByteArrayOutputStream());
                        newDoc.close();
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            });

        }
        es.shutdown();
    }






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
