Posted to dev@pdfbox.apache.org by "Pascal Essiembre (JIRA)" <ji...@apache.org> on 2015/03/22 05:46:10 UTC
[jira] [Created] (PDFBOX-2723) PDFBox*.tmp files not deleted by COSParser
Pascal Essiembre created PDFBOX-2723:
----------------------------------------
Summary: PDFBox*.tmp files not deleted by COSParser
Key: PDFBOX-2723
URL: https://issues.apache.org/jira/browse/PDFBOX-2723
Project: PDFBox
Issue Type: Bug
Components: Parsing
Affects Versions: 2.0.0
Environment: Windows and Linux, with issue being critical on Linux
Reporter: Pascal Essiembre
Fix For: 2.0.0
When parsing PDFs, temporary files are created under the system temp directory (e.g. PDFBox6525369863339991063.tmp). All but one of the files created for each document are deleted, so every document parsed leaves behind a tmp file that is never removed. This is likely due to a stream that is never closed. When processing many PDFs on Linux in the same JVM instance, we get the fatal error: "Too many files open". Raising the maximum number of open file handles on the OS is not always an option.
I was able to fix this by modifying the {{COSParser}} class to close a {{COSStream}} instance:
{code:title=COSParser.java, starting on line 312|borderStyle=solid}
private long parseXrefObjStream(long objByteOffset, boolean isStandalone) throws IOException
{
    // ---- parse indirect object head
    readObjectNumber();
    readGenerationNumber();
    readExpectedString(OBJ_MARKER, true);

    COSDictionary dict = parseCOSDictionary();
    COSStream xrefStream = parseCOSStream(dict);
    parseXrefStream(xrefStream, (int) objByteOffset, isStandalone);
    xrefStream.close(); // <--- *** NEW LINE ***

    return dict.getLong(COSName.PREV);
}
{code}
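The leak pattern described above can be reproduced outside PDFBox with a small, self-contained sketch. Everything in it is hypothetical: {{ScratchStream}} and {{ScratchFileDemo}} are illustrative stand-ins, not PDFBox classes, and the only assumption carried over from the report is that a stream owns one PDFBox*.tmp file that is removed when the stream is closed. The sketch also uses try-with-resources so that the stream is closed even when parsing throws; whether that exception path matters in {{COSParser}} is an assumption, not something the report confirms.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a stream backed by a temp file; NOT a PDFBox class. */
final class ScratchStream implements Closeable {
    private final Path file;

    ScratchStream(Path dir) throws IOException {
        // Mirrors the file naming seen in the report: PDFBox<random>.tmp
        file = Files.createTempFile(dir, "PDFBox", ".tmp");
        Files.write(file, "xref data".getBytes(StandardCharsets.US_ASCII));
    }

    @Override
    public void close() throws IOException {
        // Closing the stream is the only thing that removes its temp file.
        Files.deleteIfExists(file);
    }
}

/** Hypothetical demo class contrasting a leaky parse with a closing parse. */
final class ScratchFileDemo {
    /** Leaky variant: the stream is never closed, so its temp file stays behind. */
    static void parseLeaky(Path dir) throws IOException {
        ScratchStream stream = new ScratchStream(dir);
        // ... parsing would happen here; close() is never called ...
    }

    /** Closing variant: try-with-resources closes the stream even if parsing throws. */
    static void parseClosing(Path dir, boolean fail) throws IOException {
        try (ScratchStream stream = new ScratchStream(dir)) {
            if (fail) {
                throw new IOException("simulated parse failure");
            }
        }
    }

    /** Counts leftover files matching PDFBox*.tmp in the given directory. */
    static int countTmpFiles(Path dir) throws IOException {
        int count = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir, "PDFBox*.tmp")) {
            for (Path ignored : entries) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("scratch-demo");
        for (int i = 0; i < 3; i++) {
            parseLeaky(dir);
        }
        System.out.println("after leaky parses: " + countTmpFiles(dir));

        parseClosing(dir, false);
        try {
            parseClosing(dir, true);
        } catch (IOException expected) {
            // The simulated failure is expected; the temp file is still cleaned up.
        }
        System.out.println("after closing parses: " + countTmpFiles(dir));
    }
}
```

In this sketch each leaky parse leaves one file behind, while the closing variant leaves none, including on the failure path. Counting PDFBox*.tmp files before and after a batch of documents is one simple way to check whether a fix like the one above has taken effect.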