Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2015/03/23 06:56:11 UTC

[jira] [Reopened] (PDFBOX-2723) PDFBox*.tmp files not deleted by COSParser

     [ https://issues.apache.org/jira/browse/PDFBOX-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr reopened PDFBOX-2723:
-------------------------------------

Please don't close issues; in this project we close resolved issues only after a release.

> PDFBox*.tmp files not deleted by COSParser 
> -------------------------------------------
>
>                 Key: PDFBOX-2723
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2723
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.8.8, 1.8.9, 2.0.0
>         Environment: Windows and Linux, with issue being critical on Linux
>            Reporter: Pascal Essiembre
>            Assignee: Tilman Hausherr
>              Labels: patch
>             Fix For: 1.8.9, 2.0.0
>
>         Attachments: pdfbox.patch
>
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> When parsing PDFs, temporary files are created under the system temp directory (e.g. PDFBox6525369863339991063.tmp). All but one of the files created for each document are deleted, so every parsed document leaves behind a new tmp file that is never removed. This is most likely caused by a stream that is never closed. When processing many PDFs on Linux in the same JVM instance, we get the crashing error: "Too many files open". Raising the maximum number of file handles on the OS is not always an option.
> I was able to fix this by modifying the {{COSParser}} class to close a {{COSStream}} instance:
> {code:title=COSParser.java, starting on line 312|borderStyle=solid}
>     private long parseXrefObjStream(long objByteOffset, boolean isStandalone) throws IOException
>     {
>         // ---- parse indirect object head
>         readObjectNumber();
>         readGenerationNumber();
>         readExpectedString(OBJ_MARKER, true);
>         COSDictionary dict = parseCOSDictionary();
>         COSStream xrefStream = parseCOSStream(dict);
>         parseXrefStream(xrefStream, (int) objByteOffset, isStandalone);
>         xrefStream.close();  // <--- *** NEW LINE ***
>         return dict.getLong(COSName.PREV);
>     }
> {code}
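
For illustration only, the leak pattern described in the report can be reproduced outside PDFBox with a minimal sketch. Everything below is hypothetical: {{TempBackedStream}} and {{parseOnce}} are stand-ins invented for this example and are not PDFBox classes or methods. The assumption is simply that a stream object owns a temp file and an open file handle, and that both are released only in {{close()}}.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class TempFileCleanupSketch {

    /** Hypothetical stand-in for a stream backed by a temp file (not PDFBox's COSStream). */
    static final class TempBackedStream implements Closeable {
        private final File file;
        private final RandomAccessFile raf;

        TempBackedStream() throws IOException {
            // Each instance holds one temp file and one open file handle.
            file = File.createTempFile("Sketch", ".tmp");
            raf = new RandomAccessFile(file, "rw");
        }

        File file() {
            return file;
        }

        void write(byte[] data) throws IOException {
            raf.write(data);
        }

        @Override
        public void close() throws IOException {
            try {
                raf.close();   // release the file handle first
            } finally {
                file.delete(); // then remove the temp file (result ignored in this sketch)
            }
        }
    }

    /** Uses the stream and closes it on every path, including exceptions. */
    static File parseOnce() throws IOException {
        try (TempBackedStream stream = new TempBackedStream()) {
            stream.write(new byte[] {1, 2, 3});
            return stream.file();
        }
    }

    public static void main(String[] args) throws IOException {
        File leftover = parseOnce();
        System.out.println("temp file still exists: " + leftover.exists());
    }
}
```

The try-with-resources form needs Java 7 or later; on older runtimes a try/finally block that calls {{close()}} gives the same guarantee. Compared with a plain {{close()}} call at the end of the method, either form also releases the handle when parsing throws.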



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
