Posted to users@jena.apache.org by "Dr. André Lanka" <ma...@dr-lanka.de> on 2011/10/19 10:37:28 UTC

File handles remain open

Hi all, :)

We're using Jena 2.6.4, ARQ 2.8.8 and TDB 0.8.10. We use multiple TDB 
database instances in parallel. Each of them holds 40-50 file handles 
during usage.

To save resources we want to close rarely used databases and re-open them 
when needed. Currently we do this (basically) by

GraphTriplesTDB graph=(GraphTriplesTDB) rdfStore.getGraph();
TDBMaker.releaseDataset(graph.getDataset());
rdfStore.close();

where rdfStore is a Model created by a

rdfStore=TDBFactory.createModel(directory.getAbsolutePath());

Unfortunately, not only do the file handles remain open: each call to 
TDBFactory.createModel opens another 40 file handles. This ends 
in an IOException due to too many open files.

Perhaps we are using the framework the wrong way... Does anyone have an idea 
what the problem could be?

BTW: rdfStore.isClosed() gives true (in our case) even after a call to 
rdfStore.close(). The same applies to graph.isClosed() and graph.close().
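
The "close rarely used databases and re-open them when needed" pattern described above can be sketched independently of Jena. This is only an illustration under stated assumptions: OpenStoreCache, its opener function and the directory keys are hypothetical names, not part of the Jena or TDB API, and the stores are modelled as plain AutoCloseable objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: keeps at most maxOpen stores open and closes the least recently used one. */
final class OpenStoreCache<T extends AutoCloseable> {
    private final Function<String, T> opener;      // assumption: opens the store for a directory
    private final LinkedHashMap<String, T> open;

    public OpenStoreCache(final int maxOpen, Function<String, T> opener) {
        if (maxOpen < 1) {
            throw new IllegalArgumentException("maxOpen must be at least 1");
        }
        this.opener = opener;
        // accessOrder = true keeps the least recently used entry at the head of the map
        this.open = new LinkedHashMap<String, T>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, T> eldest) {
                if (size() <= maxOpen) {
                    return false;
                }
                closeUnchecked(eldest.getValue());  // release the handles of the evicted store
                return true;
            }
        };
    }

    /** Returns the open store for the directory, opening it first if necessary. */
    public synchronized T get(String directory) {
        T store = open.get(directory);              // also marks the entry as recently used
        if (store == null) {
            store = opener.apply(directory);
            open.put(directory, store);             // may evict and close the eldest entry
        }
        return store;
    }

    private static void closeUnchecked(AutoCloseable resource) {
        try {
            resource.close();
        } catch (Exception e) {
            throw new IllegalStateException("closing store failed", e);
        }
    }
}
```

With a limit of two, opening a third store closes whichever of the first two was used least recently. Whether closing actually frees every file handle is a separate question, as the rest of this thread shows.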

Thanks in Advance
André


-- 
Dr. André Lanka  *  0178 / 134 44 47  *  http://dr-lanka.de

Re: File handles remain open

Posted by "Dr. André Lanka" <ma...@dr-lanka.de>.
Hi Paolo,

thanks for your prompt answer.

On 19.10.2011 11:43, Paolo Castagna wrote:
 > which operating system are you using?

We can reproduce this on various Linux distributions and on Mac OS X 10.6.8.
Perhaps this helps:
If we additionally call rdfStore.getGraph().close(), some (but not all) 
file handles are closed. Yet in this case we run into other issues (after 
re-opening the database using createModel(...)).

Thanks
André


Re: File handles remain open

Posted by Paolo Castagna <ca...@googlemail.com>.
Dr. André Lanka wrote:
> Hi all, :)
> 
> We're using Jena 2.6.4, ARQ 2.8.8 and TDB 0.8.10. We use multiple TDB 
> database instances in parallel. Each of them holds 40-50 file handles 
> during usage.

Hi André,
which operating system are you using?

Paolo

> 
> To save resources we want to close rarely used databases and re-open them 
> when needed. Currently we do this (basically) by
> 
> GraphTriplesTDB graph=(GraphTriplesTDB) rdfStore.getGraph();
> TDBMaker.releaseDataset(graph.getDataset());
> rdfStore.close();
> 
> where rdfStore is a Model created by a
> 
> rdfStore=TDBFactory.createModel(directory.getAbsolutePath());
> 
> Unfortunately, not only do the file handles remain open: each call to 
> TDBFactory.createModel opens another 40 file handles. This ends 
> in an IOException due to too many open files.
> 
> Perhaps we are using the framework the wrong way... Does anyone have an idea 
> what the problem could be?
> 
> BTW: rdfStore.isClosed() gives true (in our case) even after a call to 
> rdfStore.close(). The same applies to graph.isClosed() and graph.close().
> 
> Thanks in Advance
> André
> 
> 


Re: File handles remain open

Posted by "Dr. André Lanka" <ma...@dr-lanka.de>.
Hi Paolo,

On 26.10.2011 12:15, Paolo Castagna wrote:
> You can explicitly call System.gc(), but the JVM will run the GC anyway
> every once in a while... that will close the file handles (even without
> you explicitly calling System.gc()), won't it?

Yes, the GC runs now and then. Unfortunately I still ended up with more 
than 2000 open file handles within one hour (moderate usage of our 
platform). I can imagine that we'll run into problems with open file 
handles as soon as thousands of users use it.

Anyway, the framework is great and I'm pretty sure that people usually 
do not experience any problems, especially when using one graph with 
named subgraphs.
I just wanted to share the information for the small number of users 
wondering about "Too many open files" exceptions even if they close the 
graph.


> Have you tried not calling System.gc() and monitoring your app with a
> profiler or jconsole, which will show when the GC is executed, and checking
> whether the file handles are closed?

Nope, I only saw that some of the file handles are not closed and remain 
open for a very long time.
I think the reason is that the VM has enough memory, so a GC run 
is rarely required (in our case).


Greetings
André


Re: File handles remain open

Posted by Paolo Castagna <ca...@googlemail.com>.
Dr. André Lanka wrote:
> Hello Jena users,
> 
> perhaps you'll find the following information useful, perhaps not :)
> 
> Problem:
> A few days ago I asked why TDB keeps file handles open even after 
> the model has been closed. This is crucial if you are using 
> many graphs. A call to graph.getDataset().close() closed 
> most of the handles, but some unused file handles still remain. In 
> long-running instances (which re-open graphs often) this can be a problem.
> 
> Cause:
> Today I dug into Jena's source code and found in BlockMgrMapped that 
> the database files are accessed via memory-mapped buffers. As Java seems 
> to lack a way to unmap a buffer directly, file handles 
> remain open.
> 
> Solution:
> The only option I found (without changing anything in the Jena 
> framework) was to explicitly call System.gc() after dataSet.close(). 

Hi André,
thanks for sharing your findings on jena-users (even if they are known, it
is always good to share what we learn from experience with others).

(By the way, with Windows things are even worse. ;-))

You can explicitly call System.gc(), but the JVM will run the GC anyway
every once in a while... that will close the file handles (even without
you explicitly calling System.gc()), won't it?

Have you tried not calling System.gc() and monitoring your app with a
profiler or jconsole, which will show when the GC is executed, and checking
whether the file handles are closed?
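
As a lightweight alternative to attaching a profiler, the same two numbers can be read from inside the application through the standard java.lang.management beans. This is a minimal sketch: HandleMonitor is a made-up name, and the descriptor count relies on com.sun.management.UnixOperatingSystemMXBean, which is only available on Unix-like JVMs, so the method returns -1 elsewhere.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical helper: reports open file descriptors and GC runs of the current JVM. */
final class HandleMonitor {
    /** Open file descriptors of this JVM, or -1 if the platform bean does not report them. */
    public static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }

    /** Sum of the collection counts reported by all garbage collectors so far. */
    public static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 if the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("open fds=" + openFileDescriptors() + ", gc runs=" + totalGcCount());
    }
}
```

Logging both values before and after closing a dataset shows whether handles only drop once a collection has run. Note that a memory mapping whose channel is already closed may not be counted as a descriptor; tools such as lsof list mapped files separately.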

Paolo

> Usually this closes the file handles (because Jena has already nulled all 
> the buffers [bless the Jena group for this :) ]).
> Of course, System.gc() is only a hint to the VM, but it seems to be 
> successful.
> 
> 
> Thanks
> André
> 


Re: File handles remain open

Posted by "Dr. André Lanka" <ma...@dr-lanka.de>.
Hello Jena users,

perhaps you'll find the following information useful, perhaps not :)

Problem:
A few days ago I asked why TDB keeps file handles open even after 
the model has been closed. This is crucial if you are using 
many graphs. A call to graph.getDataset().close() closed 
most of the handles, but some unused file handles still remain. In 
long-running instances (which re-open graphs often) this can be a problem.

Cause:
Today I dug into Jena's source code and found in BlockMgrMapped that 
the database files are accessed via memory-mapped buffers. As Java seems 
to lack a way to unmap a buffer directly, file handles 
remain open.
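
The underlying java.nio behaviour can be reproduced without Jena. The sketch below makes no assumption about BlockMgrMapped beyond its use of memory-mapped buffers; MappedBufferDemo and its method name are invented for illustration. It maps a temporary file, closes the channel, and then still reads through the mapping, which shows that closing the channel does not release the mapping.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo: a file mapping stays valid after its FileChannel is closed. */
final class MappedBufferDemo {
    /** Maps a temp file, closes the channel, then reads the first byte through the mapping. */
    public static byte readAfterChannelClose() {
        try {
            Path file = Files.createTempFile("mapped-demo", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, new byte[] {42, 0, 0, 0});

            MappedByteBuffer buffer;
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, 4);
            }
            // The channel is closed here, but the mapping lives until the buffer is collected.
            return buffer.get(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterChannelClose());
    }
}
```

The FileChannel.map documentation states that a mapping does not depend on the channel used to create it, and the MappedByteBuffer documentation states that it remains valid until the buffer is garbage-collected.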

Solution:
The only option I found (without changing anything in the Jena 
framework) was to explicitly call System.gc() after dataSet.close(). 
Usually this closes the file handles (because Jena has already nulled all 
the buffers [bless the Jena group for this :) ]).
Of course, System.gc() is only a hint to the VM, but it seems to be 
successful.
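
The "drop the reference, then hint the GC" idea can be observed with a WeakReference. This is a rough sketch with invented names (GcHintDemo, bufferCollectedAfterGcHint), using a plain java.nio mapping rather than TDB. Because System.gc() is only a hint, the method reports what happened instead of promising a result.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.WeakReference;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo: checks whether an unreferenced mapped buffer is reclaimed after System.gc(). */
final class GcHintDemo {
    /** Returns true if the buffer was collected within a few GC hints, false otherwise. */
    public static boolean bufferCollectedAfterGcHint() {
        // Only the weak reference points at the buffer, so the GC is free to reclaim it.
        WeakReference<MappedByteBuffer> ref = new WeakReference<MappedByteBuffer>(mapTempFile());
        for (int attempt = 0; attempt < 10 && ref.get() != null; attempt++) {
            System.gc();                            // a request, not a guarantee
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    private static MappedByteBuffer mapTempFile() {
        try {
            Path file = Files.createTempFile("gc-hint-demo", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, new byte[4096]);
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                return channel.map(FileChannel.MapMode.READ_ONLY, 0, 4096);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("collected: " + bufferCollectedAfterGcHint());
    }
}
```

On common JVMs the buffer is usually reclaimed on the first hint, which matches the observation above, but nothing in the Java specification requires it.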


Thanks
André


Re: File handles remain open

Posted by "Dr. André Lanka" <ma...@dr-lanka.de>.
Hi Andy,

thanks for the hints and suggestions.

On 19.10.2011 15:40, Andy Seaborne wrote:
> Is this one model per dataset?

Yes, we use a unique model for each dataset and a unique dataset for 
each graph.


> You might try:
>
> graph.getDataset().close()

Yes, this helps. Thank you :)


> You might also consider putting all the models in one dataset as named
> graphs. The set of file handles is per-dataset.

That's a good point. We'll migrate to this soon.


Thanks again
André


Re: File handles remain open

Posted by Andy Seaborne <an...@apache.org>.
On 19/10/11 09:37, "Dr. André Lanka" wrote:
> Hi all, :)

Hi André,

>
> We're using Jena 2.6.4, ARQ 2.8.8 and TDB 0.8.10. We use multiple TDB
> database instances in parallel. Each of them holds 40-50 file handles
> during usage.

Is this one model per dataset?

> To save resources we want to close rarely used databases and re-open them
> when needed. Currently we do this (basically) by
>
> GraphTriplesTDB graph=(GraphTriplesTDB) rdfStore.getGraph();
> TDBMaker.releaseDataset(graph.getDataset());
> rdfStore.close();
>
> where rdfStore is a Model created by a
>
> rdfStore=TDBFactory.createModel(directory.getAbsolutePath());

createModel() returns a model i.e. a graph view of the dataset.

Datasets are the fundamental unit of persistent storage.

Closing a model does not close the dataset because, in general, there 
can be other models over the same dataset (the system does not track 
this: apps don't always close models, so tracking would not work).

You might try:

   graph.getDataset().close()

which closes the dataset if you know there is only one model per dataset.

DatasetGraphTDB.close already does a TDBMaker.releaseDataset, so you would 
not need to call that yourself.
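
If an application does hand out several models over one dataset, it can do the tracking itself, since the library does not. The following is only a sketch of that idea: SharedHandle is a hypothetical application-side class, not Jena API, and the dataset is modelled as a generic AutoCloseable.

```java
/** Hypothetical app-side wrapper: closes the shared resource when the last view is released. */
final class SharedHandle<T extends AutoCloseable> {
    private final T resource;
    private int views = 0;
    private boolean closed = false;

    public SharedHandle(T resource) {
        this.resource = resource;
    }

    /** Registers one more view over the resource and returns the resource. */
    public synchronized T acquire() {
        if (closed) {
            throw new IllegalStateException("resource already closed");
        }
        views++;
        return resource;
    }

    /** Releases one view; closes the resource when none remain. Returns true if it was closed. */
    public synchronized boolean release() {
        if (views == 0) {
            throw new IllegalStateException("release without matching acquire");
        }
        views--;
        if (views > 0) {
            return false;                           // other views still use the resource
        }
        try {
            resource.close();
        } catch (Exception e) {
            throw new IllegalStateException("closing resource failed", e);
        }
        closed = true;
        return true;
    }
}
```

This only works if every user of a view calls release(), which is exactly the discipline the library cannot assume in general.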


> Unfortunately, not only do the file handles remain open: each call to
> TDBFactory.createModel opens another 40 file handles. This ends
> in an IOException due to too many open files.
>
> Perhaps we are using the framework the wrong way... Does anyone have an idea
> what the problem could be?
>
> BTW: rdfStore.isClosed() gives true (in our case) even after a call to
> rdfStore.close(). The same applies to graph.isClosed() and graph.close().
>
> Thanks in Advance
> André

You might also consider putting all the models in one dataset as named 
graphs.  The set of file handles is per-dataset.

	Andy