Posted to users@jena.apache.org by "Dr. André Lanka" <ma...@dr-lanka.de> on 2011/10/19 10:37:28 UTC
File handles remain open
Hi all, :)
We're using Jena 2.6.4, ARQ 2.8.8 and TDB 0.8.10. We use multiple TDB
database instances in parallel. Each of them holds 40-50 file handles
while in use.
To save resources we want to close rarely used databases and re-open
them when needed. Currently we do this (basically) with
GraphTriplesTDB graph = (GraphTriplesTDB) rdfStore.getGraph();
TDBMaker.releaseDataset(graph.getDataset());
rdfStore.close();
where rdfStore is a Model created by
rdfStore = TDBFactory.createModel(directory.getAbsolutePath());
Unfortunately, the file handles do not just remain open: each call to
TDBFactory.createModel opens an additional 40 file handles. This
eventually ends in an IOException due to too many open files.
Perhaps we are using the framework the wrong way... Does anyone have an
idea what the problem could be?
BTW: rdfStore.isClosed() gives true (in our case) even after a call to
rdfStore.close(). The same applies to graph.isClosed() and graph.close().
Thanks in advance
André
--
Dr. André Lanka * 0178 / 134 44 47 * http://dr-lanka.de
Re: File handles remain open
Posted by "Dr. André Lanka" <ma...@dr-lanka.de>.
Hi Paolo,
thanks for your prompt answer.
On 19.10.2011 11:43, Paolo Castagna wrote:
> which operating system are you using?
We can reproduce this on various Linux distributions and on Mac OS X 10.6.8.
Perhaps this helps:
If we additionally call rdfStore.getGraph().close(), some (but not all)
file handles are closed. In that case, however, we run into other issues
(after re-opening the database using createModel(...)).
Thanks
André
Re: File handles remain open
Posted by Paolo Castagna <ca...@googlemail.com>.
Dr. André Lanka wrote:
> Hi all, :)
>
> We're using Jena 2.6.4, ARQ 2.8.8 and TDB 0.8.10. We use multiple TDB
> database instances in parallel. Each of them holds 40-50 file handles
> while in use.
Hi André,
Which operating system are you using?
Paolo
Re: File handles remain open
Posted by "Dr. André Lanka" <ma...@dr-lanka.de>.
Hi Paolo,
On 26.10.2011 12:15, Paolo Castagna wrote:
> You can explicitly call System.gc(), but the JVM will run the GC anyway
> every once in a while... that will close the file handles (even without
> you explicitly calling System.gc()), won't it?
Yes, garbage collection takes place now and then. Unfortunately, I was
able to accumulate more than 2000 open file handles within one hour
(under moderate usage of our platform). I can imagine that we'll run
into problems with open file handles as soon as thousands of users use it.
Anyway, the framework is great and I'm pretty sure that people usually
do not experience any problems, especially when using one graph with
named subgraphs.
I just wanted to share the information for the small number of users
wondering about "Too many open files" exceptions even though they close
the graph.
> Have you tried not calling System.gc() and monitoring your app with a
> profiler or jconsole, which will show when the GC is executed, to check
> whether the file handles are closed?
Nope, I only saw that some of the file handles are not closed and remain
open for a very long time.
I think the reason is that the VM has enough memory, so a GC run is
rarely required (in our case).
Greetings
André
Re: File handles remain open
Posted by Paolo Castagna <ca...@googlemail.com>.
Dr. André Lanka wrote:
> Hello Jena users,
>
> perhaps you'll find the following information useful, perhaps not :)
>
> Problem:
> A few days ago I asked for ideas why TDB keeps file handles open even
> after the model has been closed. This is a crucial point if you are using
> plenty of graphs. A call to graph.getDataset().close() helped to close
> most of the handles, but some unused file handles still remain. In
> long-running instances (that re-open graphs often) this can be a problem.
>
> Cause:
> Today I dived into Jena's source code and found in BlockMgrMapped that
> the database files are accessed via memory-mapped buffers. As Java seems
> to lack a way to unmap a buffer directly, file handles
> remain open.
>
> Solution:
> The only possibility I found (without changing anything in the Jena
> framework) was to explicitly call System.gc() after dataSet.close().
Hi André,
thanks for sharing your findings on jena-users (even if they are known,
it is always good to share what we learn from experience with others).
(By the way, on Windows things are even worse. ;-))
You can explicitly call System.gc(), but the JVM will run the GC anyway
every once in a while... that will close the file handles (even without
you explicitly calling System.gc()), won't it?
Have you tried not calling System.gc() and monitoring your app with a
profiler or jconsole, which will show when the GC is executed, to check
whether the file handles are closed?
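As a rough complement to a profiler or jconsole, the open file descriptor count can also be logged from inside the application around the open/close calls. This is only a sketch: the class and method names are made up for illustration, it relies on the JDK-specific com.sun.management.UnixOperatingSystemMXBean (typically available on Linux and Mac OS X JVMs, not on Windows), and it is an assumption that handles kept alive by memory-mapped buffers are reflected in this number on a given platform.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

import com.sun.management.UnixOperatingSystemMXBean;

public class OpenFileHandles {

    /**
     * Returns the number of file descriptors the JVM process currently has
     * open, or -1 if the platform bean does not expose that figure.
     */
    static long openFileDescriptorCount() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1; // e.g. on Windows
    }

    public static void main(String[] args) {
        // Log the count before and after the code under suspicion,
        // e.g. around opening and closing a database.
        System.out.println("open file descriptors: " + openFileDescriptorCount());
    }
}
```

Logging this value before opening a database, after closing it, and again after a GC run would show whether the count drifts upwards over time.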
Paolo
> Usually this closes the file handles (because Jena nulled all the
> buffers beforehand [bless the Jena group for this :) ]).
> Of course, System.gc() is only a hint to the VM, but it seems to be
> successful.
>
>
> Thanks
> André
>
Re: File handles remain open
Posted by "Dr. André Lanka" <ma...@dr-lanka.de>.
Hello Jena users,
perhaps you'll find the following information useful, perhaps not :)
Problem:
A few days ago I asked for ideas why TDB keeps file handles open even
after the model has been closed. This is a crucial point if you are using
plenty of graphs. A call to graph.getDataset().close() helped to close
most of the handles, but some unused file handles still remain. In
long-running instances (that re-open graphs often) this can be a problem.
Cause:
Today I dived into Jena's source code and found in BlockMgrMapped that
the database files are accessed via memory-mapped buffers. As Java seems
to lack a way to unmap a buffer directly, file handles
remain open.
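The behaviour described here can be reproduced with the plain JDK, independently of Jena. The sketch below (class and method names are illustrative, not taken from TDB) maps a small temporary file, closes the channel, and then still reads through the mapping. According to the FileChannel.map documentation, a mapping does not depend on the channel that created it and stays valid until the buffer itself is garbage-collected.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedBufferDemo {

    /** Creates a temporary file with the given content (removed at JVM exit). */
    static Path writeTempFile(byte[] content) {
        try {
            Path file = Files.createTempFile("mapped-demo", ".dat");
            file.toFile().deleteOnExit();
            return Files.write(file, content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Maps the file, closes the channel, then reads byte 0 via the mapping. */
    static byte firstByteAfterChannelClose(Path file) {
        MappedByteBuffer buffer;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The channel is closed at this point, but the mapping is still
        // usable. There is no public unmap call; the region is released
        // only once the buffer object has been garbage-collected.
        return buffer.get(0);
    }

    public static void main(String[] args) {
        Path file = writeTempFile(new byte[] {42, 1, 2, 3});
        System.out.println(firstByteAfterChannelClose(file));
    }
}
```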
Solution:
The only possibility I found (without changing anything in the Jena
framework) was to explicitly call System.gc() after dataSet.close().
Usually this closes the file handles (because Jena nulled all the
buffers beforehand [bless the Jena group for this :) ]).
Of course, System.gc() is only a hint to the VM, but it seems to be
successful.
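The workaround above might be wrapped roughly as follows. This is a sketch with a hypothetical helper name, written against java.io.Closeable so it stays independent of Jena; in the real application the resource would be whatever exposes the dataset's close(). The GC call remains a hint, so the sketch does not guarantee that mappings are released.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

public class CloseThenHintGc {

    /**
     * Closes the resource and then asks the VM to run a garbage collection,
     * so that buffers no longer referenced after close() can be reclaimed.
     */
    static void closeThenHintGc(Closeable resource) {
        try {
            resource.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Only a request: the VM is free to ignore or delay it.
        System.gc();
    }

    public static void main(String[] args) {
        closeThenHintGc(() -> System.out.println("resource closed"));
    }
}
```

Calling System.gc() on every close can be costly on large heaps, so it may be worth triggering it only when many databases have been closed since the last run.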
Thanks
André
Re: File handles remain open
Posted by "Dr. André Lanka" <ma...@dr-lanka.de>.
Hi Andy,
thanks for the hints and suggestions.
On 19.10.2011 15:40, Andy Seaborne wrote:
> Is this one model per dataset?
Yes, we use a unique model for each dataset and a unique dataset for
each graph.
> You might try:
>
> graph.getDataset().close()
Yes, this helps. Thank you :)
> You might also consider putting all the models in one dataset as named
> graphs. The set of file handles is per-dataset.
That's a good point. We'll migrate to this soon.
Thanks again
André
Re: File handles remain open
Posted by Andy Seaborne <an...@apache.org>.
On 19/10/11 09:37, "Dr. André Lanka" wrote:
> Hi all, :)
Hi André,
>
> We're using Jena 2.6.4, ARQ 2.8.8 and TDB 0.8.10. We use multiple TDB
> database instances in parallel. Each of them holds 40-50 file handles
> while in use.
Is this one model per dataset?
> To save resources we want to close rarely used databases and re-open
> them when needed. Currently we do this (basically) with
>
> GraphTriplesTDB graph = (GraphTriplesTDB) rdfStore.getGraph();
> TDBMaker.releaseDataset(graph.getDataset());
> rdfStore.close();
>
> where rdfStore is a Model created by
>
> rdfStore = TDBFactory.createModel(directory.getAbsolutePath());
createModel() returns a model, i.e. a graph view of the dataset.
Datasets are the fundamental unit of persistent storage.
Closing a model does not close the dataset because, in general, there
can be other models over the same dataset (the system does not track
this - apps don't always close models, so tracking would not work).
You might try:
graph.getDataset().close()
which closes the dataset, if you know there is only one model per dataset.
DatasetGraphTDB.close does a TDBMaker.releaseDataset, so you would not
need to call that yourself.
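The relationship between a model and its dataset can be pictured with a toy example. The classes below are stand-ins invented for illustration and are not Jena's classes; the figure of 40 handles is simply borrowed from the numbers reported in this thread. The point is only that the view's close() leaves the owning dataset, and therefore its file handles, untouched.

```java
public class DatasetViewSketch {

    /** Stand-in for a dataset: the unit that owns the storage files. */
    static final class ToyDataset implements AutoCloseable {
        private boolean open = true;

        ToyView newView() {
            return new ToyView(this);
        }

        boolean isOpen() {
            return open;
        }

        /** Illustrative figure: handles are held per dataset while open. */
        int openHandles() {
            return open ? 40 : 0;
        }

        @Override
        public void close() {
            open = false; // a real store would release its files here
        }
    }

    /** Stand-in for a model: one of possibly many views over a dataset. */
    static final class ToyView implements AutoCloseable {
        private final ToyDataset dataset;
        private boolean closed = false;

        ToyView(ToyDataset dataset) {
            this.dataset = dataset;
        }

        ToyDataset getDataset() {
            return dataset;
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true; // other views may still use the dataset
        }
    }

    public static void main(String[] args) {
        ToyDataset dataset = new ToyDataset();
        ToyView view = dataset.newView();
        view.close();
        System.out.println("handles after view.close(): " + dataset.openHandles());
        view.getDataset().close();
        System.out.println("handles after dataset.close(): " + dataset.openHandles());
    }
}
```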
> Unfortunately, the file handles do not just remain open: each call to
> TDBFactory.createModel opens an additional 40 file handles. This
> eventually ends in an IOException due to too many open files.
>
> Perhaps we are using the framework the wrong way... Does anyone have an
> idea what the problem could be?
>
> BTW: rdfStore.isClosed() gives true (in our case) even after a call to
> rdfStore.close(). The same applies to graph.isClosed() and graph.close().
>
> Thanks in advance
> André
You might also consider putting all the models in one dataset as named
graphs. The set of file handles is per-dataset.
Andy