Posted to dev@pdfbox.apache.org by "lanshiqin (Jira)" <ji...@apache.org> on 2021/06/16 06:59:00 UTC

[jira] [Created] (PDFBOX-5217) PDF to picture, take up too much memory,easy OOM

lanshiqin created PDFBOX-5217:
---------------------------------

             Summary: PDF to picture, take up too much memory,easy OOM
                 Key: PDFBOX-5217
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5217
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 3.0.0 JBIG2, 2.0.24
         Environment: Oracle JDK 1.8.0_291-b10
macOS Big Sur (CPU i5, RAM 8GB)
Windows 10 (CPU i7 2.80GHz, RAM 16GB)
Linux ()
            Reporter: lanshiqin


Converting a 20MB PDF file to images consumes more than 8GB of memory and takes 5 minutes, which is unacceptable.

Debugging showed that memory usage soared when the file stream was finally read.

This is my code:

 
{code:java}
// Load the PDF from a URL, buffering the document in temp files rather than main memory.
try (InputStream in = new URL(pdfFileUrl).openStream();
     PDDocument document = PDDocument.load(in, MemoryUsageSetting.setupTempFileOnly())) {
    // Intended to disable the resource cache so decoded resources are not retained.
    document.setResourceCache(null);
    PDFRenderer renderer = new PDFRenderer(document);
    List<String> imgUrlList = Lists.newArrayList();
    // Render each page to a PNG temp file, one page at a time.
    for (int i = 0; i < document.getNumberOfPages(); i++) {
        BufferedImage bufferedImage = renderer.renderImageWithDPI(i, DPI);
        File tempFile = new File(OFFICE_CONVERT_TEMP_DIR + fileName + "_" + i);
        try {
            ImageIO.write(bufferedImage, "png", tempFile);
            imgUrlList.add("upload to media center get url todo " + i);
        } finally {
            FileUtils.deleteQuietly(tempFile);
            // Note: getGraphics() returns a new Graphics object; disposing it does not free the image data.
            bufferedImage.getGraphics().dispose();
        }
    }
    return imgUrlList;
}
{code}
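
For context on the memory figures above, here is a rough sketch of how much heap a single rendered page needs for its raster alone. It is only an estimate under stated assumptions: 4 bytes per pixel, page size given in PDF points (72 per inch), and a rounding mode that may differ from what the renderer actually does. The class name RenderMemoryEstimate and the sample inputs (a 612 x 792 pt page at 300 DPI) are hypothetical and not taken from the report, which does not state the DPI value used.

```java
/**
 * Rough estimate of the heap needed for the raster of one rendered page.
 * Assumption: 4 bytes per pixel (for example an int-packed RGB image).
 */
final class RenderMemoryEstimate {

    private static final double POINTS_PER_INCH = 72.0;

    /** Pixel length of a page edge given in PDF points, at the given DPI. */
    static long pixels(double lengthPt, double dpi) {
        // Rounding mode is an assumption; a renderer may floor or ceil instead.
        return Math.max(1L, Math.round(lengthPt * dpi / POINTS_PER_INCH));
    }

    /** Estimated raster size in bytes for one page. */
    static long estimateBytes(double widthPt, double heightPt, double dpi, int bytesPerPixel) {
        return pixels(widthPt, dpi) * pixels(heightPt, dpi) * bytesPerPixel;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: a 612 x 792 pt page rendered at 300 DPI.
        long bytes = estimateBytes(612, 792, 300, 4);
        System.out.printf("%d bytes (~%.1f MiB)%n", bytes, bytes / (1024.0 * 1024.0));
    }
}
```

Under these assumptions the raster grows with the square of the DPI, so doubling the DPI quadruples the estimate. This counts only the output image; images and fonts decoded while rendering a page are not included, so it is a lower bound rather than an explanation of the 8GB figure.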
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)