Posted to dev@tika.apache.org by "Enrico Donelli (JIRA)" <ji...@apache.org> on 2011/05/05 09:32:03 UTC
[jira] [Created] (TIKA-654) Resources not properly closed
Resources not properly closed
-----------------------------
Key: TIKA-654
URL: https://issues.apache.org/jira/browse/TIKA-654
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.0
Environment: Tested on OSX and Linux debian
Reporter: Enrico Donelli
We have a thread which parses more than 200k files, and we always get a "too many open files" error from the operating system. Using lsof I noticed that apache-tika temp files (created by the TemporaryFiles class) are not really deleted by the operating system, even if the delete method returns true.
Searching the code, I found that the problem (which does not manifest with all files) is probably in the TikaInputStream#close method. There openContainer is set to null, but when openContainer is an instance of org.apache.poi.poifs.filesystem.NPOIFSFileSystem the problem disappears if I call close() on openContainer. I modified the NPOIFSFileSystem class to implement java.io.Closeable, and modified the TikaInputStream#close method to do the following:
    if (openContainer instanceof java.io.Closeable) {
        ((java.io.Closeable) openContainer).close();
    }
    openContainer = null;
I don't know if this is the best solution, but it seems to solve the problem for me.
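For readers who want to try the idea in isolation, here is a minimal, self-contained sketch of the pattern described above. It is not the actual Tika or POI code: FakeContainer and ContainerAwareStream are hypothetical stand-ins (for NPOIFSFileSystem and TikaInputStream respectively), and only the openContainer handling is modelled. One deliberate difference from the snippet above, flagged as an assumption: the reference is cleared in a finally block so it is dropped even if closing the container throws.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

public class OpenContainerCloseSketch {

    /** Hypothetical stand-in for a container such as NPOIFSFileSystem. */
    static class FakeContainer implements Closeable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true; // a real container would release its file handle here
        }
    }

    /** Hypothetical wrapper; mirrors only the openContainer handling. */
    static class ContainerAwareStream extends InputStream {
        private final InputStream in;
        private Object openContainer;

        ContainerAwareStream(InputStream in) {
            this.in = in;
        }

        void setOpenContainer(Object container) {
            this.openContainer = container;
        }

        Object getOpenContainer() {
            return openContainer;
        }

        @Override
        public int read() throws IOException {
            return in.read();
        }

        @Override
        public void close() throws IOException {
            try {
                // Close the container first if it knows how to be closed.
                if (openContainer instanceof Closeable) {
                    ((Closeable) openContainer).close();
                }
            } finally {
                // Assumption: drop the reference and close the wrapped
                // stream even when the container's close() fails.
                openContainer = null;
                in.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        FakeContainer container = new FakeContainer();
        ContainerAwareStream stream =
                new ContainerAwareStream(new ByteArrayInputStream(new byte[0]));
        stream.setOpenContainer(container);
        stream.close();
        System.out.println("container closed: " + container.closed);
        System.out.println("reference cleared: " + (stream.getOpenContainer() == null));
    }
}
```

The instanceof check keeps the wrapper independent of any particular container type, so containers that do not implement java.io.Closeable are simply dropped as before.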
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (TIKA-654) Resources not properly closed
Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Burch resolved TIKA-654.
-----------------------------
Resolution: Fixed
Fix Version/s: 1.0
Assignee: Nick Burch
[jira] [Commented] (TIKA-654) Resources not properly closed
Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029687#comment-13029687 ]
Nick Burch commented on TIKA-654:
---------------------------------
I've made NPOIFSFileSystem and OPCPackage Closeable in r1100013. That'll be in POI 3.8 beta 3.
In r1100015 I've made TikaInputStream close the open container as you suggest, thanks for that. For now you'll need to use a nightly build (or your custom build) of POI to see the effect of that, but it'll kick in properly when 3.8 beta 3 is out.