Posted to dev@pdfbox.apache.org by "Anshuman (JIRA)" <ji...@apache.org> on 2019/01/22 11:20:00 UTC

[jira] [Commented] (PDFBOX-4441) Memory Leak Issue in case of Processing Large PDF File size > 8 MB

    [ https://issues.apache.org/jira/browse/PDFBOX-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748630#comment-16748630 ] 

Anshuman commented on PDFBOX-4441:
----------------------------------

*I am seeing a memory leak when parsing large PDFs. The leak shows up after the files have been processed, i.e. the heap size keeps increasing even when no processing is taking place.
I am attaching the sample file as well as a heap graph for the same.*

Here is my code 

try {
    PDDocument pdDocument = PDDocument.load(file, MemoryUsageSetting.setupTempFileOnly());
    PDFText2HTML pdfText2HTML = new PDFText2HTML();
    fileContent = pdfText2HTML.getText(pdDocument);
    pdDocument.close();
} catch (IOException e) {
    log.error("Exception while parsing the file:{} and the message is: ", file.getName(), e);
}
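
One thing worth noting about the snippet above: close() is only reached when getText() returns normally, so the document would stay open if extraction throws. Whether that has anything to do with the heap growth reported here is not established. The following is only a minimal sketch of the difference, using a hypothetical stand-in class (FakeDocument, with a made-up extractText method and an open-handle counter) instead of PDFBox itself, so none of the names below are PDFBox API.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class CloseOnFailureSketch {

    // Hypothetical stand-in for a document handle; NOT a PDFBox class.
    static final class FakeDocument implements Closeable {
        // Counts handles that were opened but not yet closed.
        static final AtomicInteger OPEN = new AtomicInteger();

        FakeDocument() {
            OPEN.incrementAndGet();
        }

        // Made-up method that simulates text extraction, optionally failing.
        String extractText(boolean fail) throws IOException {
            if (fail) {
                throw new IOException("simulated extraction failure");
            }
            return "text";
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    // Same shape as the reported code: close() is skipped when extraction throws.
    static String closeInsideTry(boolean fail) {
        try {
            FakeDocument doc = new FakeDocument();
            String text = doc.extractText(fail);
            doc.close();
            return text;
        } catch (IOException e) {
            return null;
        }
    }

    // try-with-resources closes the handle on the normal and the exceptional path.
    static String tryWithResources(boolean fail) {
        try (FakeDocument doc = new FakeDocument()) {
            return doc.extractText(fail);
        } catch (IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        closeInsideTry(true);
        System.out.println("open handles, close inside try: " + FakeDocument.OPEN.get());

        FakeDocument.OPEN.set(0);
        tryWithResources(true);
        System.out.println("open handles, try-with-resources: " + FakeDocument.OPEN.get());
    }
}
```

In this sketch the first variant leaves one handle open after a failure and the second leaves none. The same try-with-resources shape should be applicable to any resource that implements Closeable.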

> Memory Leak Issue in case of Processing Large PDF File size > 8 MB
> ------------------------------------------------------------------
>
>                 Key: PDFBOX-4441
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4441
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.8
>         Environment: Linux
>            Reporter: Anshuman
>            Priority: Major
>              Labels: newbie
>         Attachments: Screenshot 2019-01-22 08.19.53.png, sample.pdf
>
>
> *I am seeing a memory leak when parsing large PDFs. The leak shows up after the files have been processed, i.e. the heap size keeps increasing even when no processing is taking place.
>  I am attaching the sample file as well as a heap graph for the same.*
> *Here is my code:*
> try {
>  PDDocument pdDocument = PDDocument.load(file, MemoryUsageSetting.setupTempFileOnly());
>  PDFText2HTML pdfText2HTML = new PDFText2HTML();
>  fileContent = pdfText2HTML.getText(pdDocument);
>  pdDocument.close();
> } catch (IOException e) {
>  log.error("Exception while parsing the file:{} and the message is: ", file.getName(), e);
> }



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
