Posted to dev@pdfbox.apache.org by "Viktor Merkel (Jira)" <ji...@apache.org> on 2022/08/26 06:13:00 UTC

[jira] [Updated] (PDFBOX-5497) OutOfMemoryError - PDFMergerUtility

     [ https://issues.apache.org/jira/browse/PDFBOX-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viktor Merkel updated PDFBOX-5497:
----------------------------------
    Description: 
Merging a lot of small PDFs by using the PDFMergerUtility results in a java.lang.OutOfMemoryError.

Code to reproduce:
{code:java}
package test.pdfbox.merge;


import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;
import org.apache.pdfbox.multipdf.PDFMergerUtility.DocumentMergeMode;

public class PDFMergeTest {

    public static void main(String[] args) throws IOException {

        if (args.length == 0) {
            System.out.println("No source folder set");
            return;
        }


        File folder = new File(args[0]);

        PDFMergerUtility merger = new PDFMergerUtility();

        merger.setDestinationFileName(folder.toString() + "\\merged.pdf");
        merger.setDocumentMergeMode(DocumentMergeMode.OPTIMIZE_RESOURCES_MODE); // has no effect

        //adding the source files (except merged.pdf)
        for (File file : folder.listFiles()) {
            if (!file.getName().equals("merged.pdf")) {
                merger.addSource(file);
            }
        }

        merger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
        System.out.println("DONE");
    }
}

{code}
Create a new source folder, put a simple PDF document in it (for example the one from [https://www.adobe.com/support/products/enterprise/knowledgecenter/media/c4611_sample_explain.pdf]), make 1000 copies of it, and run the code above with the newly created folder as the argument and the following Java VM args:
{noformat}
-Xms128M -Xmx128M{noformat}
This results in an OutOfMemoryError. Setting the document merge mode to OPTIMIZE_RESOURCES_MODE does not fix the problem.
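
The 1000 copies mentioned above can be produced with a few lines of plain Java. This is only a minimal sketch: the class name PdfCopyHelper, the method makeCopies and the copy-NNNN.pdf naming are hypothetical and not part of PDFBox or of the original report; it uses only java.nio.file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of PDFBox or of the original report.
final class PdfCopyHelper {

    // Copies 'source' into 'targetDir' 'count' times as copy-0001.pdf, copy-0002.pdf, ...
    // Creates the target folder if needed and returns the number of files written.
    static int makeCopies(Path source, Path targetDir, int count) throws IOException {
        Files.createDirectories(targetDir);
        for (int i = 1; i <= count; i++) {
            Path target = targetDir.resolve(String.format("copy-%04d.pdf", i));
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("Usage: PdfCopyHelper <sample.pdf> <target folder>");
            return;
        }
        int written = makeCopies(Paths.get(args[0]), Paths.get(args[1]), 1000);
        System.out.println("Wrote " + written + " copies");
    }
}
```

Run it with the sample PDF and the new folder as arguments, then pass the same folder to PDFMergeTest.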

Does PDFBox not close the PDF sources properly?
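
To check that the 128M limit is actually in effect, and to see how much heap is in use just before the merge call, the figures can be printed with java.lang.Runtime. Again only a sketch: HeapReport and its methods are hypothetical names, and the numbers are approximate because they depend on when the garbage collector last ran.

```java
// Hypothetical diagnostic helper, not part of PDFBox or of the original report.
final class HeapReport {

    // Heap currently in use, in bytes (allocated heap minus its free part).
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Maximum heap the JVM will attempt to use, in bytes (governed by -Xmx).
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // One-line summary in MiB, suitable for System.out.println.
    static String summary() {
        long mib = 1024L * 1024L;
        return "heap used " + (usedBytes() / mib) + " MiB of max " + (maxBytes() / mib) + " MiB";
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

Calling System.out.println(HeapReport.summary()) before merger.mergeDocuments(...) and again in a catch block for OutOfMemoryError gives a rough before/after comparison.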

 

Increasing the memory is not an option because this problem relates to the sizes of the source documents. Is there any option to directly write the new PDF to an output stream without holding it in memory?

  was:
Merging a lot of small PDFs by using the PDFMergerUtility results in a java.lang.OutOfMemoryError.

Code to reproduce:
{code:java}
package test.pdfbox.merge;


import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;
import org.apache.pdfbox.multipdf.PDFMergerUtility.DocumentMergeMode;

public class PDFMergeTest {

    public static void main(String[] args) throws IOException {

        if (args.length == 0) {
            System.out.println("No source folder set");
            return;
        }


        File folder = new File(args[0]);

        PDFMergerUtility merger = new PDFMergerUtility();

        merger.setDestinationFileName(folder.toString() + "\\merged.pdf");
        merger.setDocumentMergeMode(DocumentMergeMode.OPTIMIZE_RESOURCES_MODE); // has no effect

        //adding the source files (except merged.pdf)
        for (File file : folder.listFiles()) {
            if (!file.getName().equals("merged.pdf")) {
                merger.addSource(file);
            }
        }

        merger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
        System.out.println("DONE");
    }
}

{code}
Create a new source folder, put a simple PDF document in it (for example the one from [https://www.adobe.com/support/products/enterprise/knowledgecenter/media/c4611_sample_explain.pdf]), make 1000 copies of it, and run the code above with the newly created folder as the argument and the following Java VM args:
{noformat}
-Xms128M -Xmx128M{noformat}
This results in an OutOfMemoryError. Setting the document merge mode to OPTIMIZE_RESOURCES_MODE does not fix the problem.

Does PDFBox not close the PDF sources properly?

 

Increasing the memory is not an option because this problem relates to the sizes of the source documents. Is there any option to directly stream the content to an OutputStream without consuming the final PDF in memory?


> OutOfMemoryError - PDFMergerUtility
> -----------------------------------
>
>                 Key: PDFBOX-5497
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5497
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 2.0.26
>            Reporter: Viktor Merkel
>            Priority: Major
>
> Merging a lot of small PDFs by using the PDFMergerUtility results in a java.lang.OutOfMemoryError.
> Code to reproduce:
> {code:java}
> package test.pdfbox.merge;
> import java.io.File;
> import java.io.IOException;
> import org.apache.pdfbox.io.MemoryUsageSetting;
> import org.apache.pdfbox.multipdf.PDFMergerUtility;
> import org.apache.pdfbox.multipdf.PDFMergerUtility.DocumentMergeMode;
> public class PDFMergeTest {
>     public static void main(String[] args) throws IOException {
>         if (args.length == 0) {
>             System.out.println("No source folder set");
>             return;
>         }
>         File folder = new File(args[0]);
>         PDFMergerUtility merger = new PDFMergerUtility();
>         merger.setDestinationFileName(folder.toString() + "\\merged.pdf");
>         merger.setDocumentMergeMode(DocumentMergeMode.OPTIMIZE_RESOURCES_MODE); // has no effect
>         //adding the source files (except merged.pdf)
>         for (File file : folder.listFiles()) {
>             if (!file.getName().equals("merged.pdf")) {
>                 merger.addSource(file);
>             }
>         }
>         merger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
>         System.out.println("DONE");
>     }
> }
> {code}
> Create a new source folder, put a simple PDF document in it (for example the one from [https://www.adobe.com/support/products/enterprise/knowledgecenter/media/c4611_sample_explain.pdf]), make 1000 copies of it, and run the code above with the newly created folder as the argument and the following Java VM args:
> {noformat}
> -Xms128M -Xmx128M{noformat}
> This results in an OutOfMemoryError. Setting the document merge mode to OPTIMIZE_RESOURCES_MODE does not fix the problem.
> Does PDFBox not close the PDF sources properly?
>  
> Increasing the memory is not an option because this problem relates to the sizes of the source documents. Is there any option to directly write the new PDF to an output stream without holding it in memory?


