Posted to users@jena.apache.org by Al Baker <aj...@gmail.com> on 2011/07/18 08:17:35 UTC

TDB ObjectFileStorage Error

Hi All,

I've noticed that after creating a number of triples, eventually all TDB calls
start to fail.  Some of the objects stored are large strings (e.g. several
kilobytes).

java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
    at com.hp.hpl.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:249)
    at com.hp.hpl.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:60)

This fails on 0.8.10 or the 0.9 TxTDB.

Al

Re: TDB ObjectFileStorage Error

Posted by Andy Seaborne <an...@epimorphics.com>.

On 18/07/11 15:32, Al Baker wrote:
> Hi Andy,
>
> It's 32-bit -- no matter how much memory I feed the JVM, this always comes
> up. At first, I thought I wasn't giving it enough RAM - but it was only 80k
> triples and no amount of memory seemed enough.  That's when I realized this
> particular program was "RDFizing" web content - in this case blog posts.  So
> the literals could be quite large, e.g. several kilobytes.
>
> I'll see if I can create a simple test case to reproduce the issue.
>
> On a related note, I just tried TDB on a 32bit and a 64bit 512 Linode -
> perhaps because of this issue, perhaps the app uses too much memory as it
> is.  Either way, in both instances as that VM approaches complete memory
> usage on the server and it starts to swap - everything slows down to be
> completely unusable.  At first I thought TDB 64-bit would help, as the
> memory-mapped files surely wouldn't wind up grinding the swap, but it only
> appeared to get to maximum memory usage faster than when on a 32bit system.
> It would be highly desirable if there was a "disable caching" for low-memory
> environments.
>
> I started poking around the TDB source tree last night, but nothing jumped
> out as where the caching would be - can you point me in the right direction
> here?

Al,

It's not as easy (indeed, as possible) as it should be to set the cache sizes.

Good news is that as part of TxTDB I've had to revisit how datasets get 
built and making parameter setting sensibly possible is one thing I want 
to do.

Bad news is it isn't done.

Currently, pre-Tx, it's supposed to work but it's not tested.  Dataset 
creation is done in SetupTDB, taking constants from SystemTDB as below. 
There is an undocumented properties file that should set the values.

In TxTDB, DatasetBuilderStd will do all building.  Currently, it only 
builds for transactional usage via StoreConnection - the old way is still 
via SetupTDB, and any datasets built that way are ejected from the 
TDBFactory cache if you use them transactionally.

DatasetBuilderStd uses a Params object.  I have just checked in a 
version that remembers to take the node id caching parameters from 
SystemTDB.

You have to set the

SystemTDB.Node2NodeIdCacheSize
SystemTDB.NodeId2NodeCacheSize

for the node caches and for 32 bit machines:
SystemTDB.BlockWriteCacheSize
SystemTDB.BlockReadCacheSize

If you are on a small machine, then 32 bit mode is probably better 
anyway.  Mapped files are very keen on taking up the whole machine to 
the performance-exclusion of anything else.

As ever in Java, the heap size should be less than the real RAM size or 
swap death will occur, as you have found out.
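
As a rough illustration of the heap-versus-RAM point, the sketch below 
compares the JVM's maximum heap with the machine's physical memory.  The 
class name HeapCheck, the 512 MB figure (taken from the Linode mentioned 
in this thread) and the 128 MB reserve for the OS and non-heap JVM memory 
are assumptions for illustration only; none of them is a TDB setting.

```java
class HeapCheck {
    /**
     * True if the heap plus a reserve for the OS and non-heap JVM memory
     * fits into physical RAM, i.e. the machine should not need to swap
     * just to hold the Java heap.
     */
    static boolean heapFitsInRam(long maxHeapBytes, long physicalRamBytes, long reservedBytes) {
        return maxHeapBytes + reservedBytes <= physicalRamBytes;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long maxHeap = Runtime.getRuntime().maxMemory(); // effective -Xmx value
        long physicalRam = 512L * mib;                   // assumption: 512 MB machine
        long reserved = 128L * mib;                      // assumption: OS + non-heap reserve

        System.out.println("Max heap (MiB): " + maxHeap / mib);
        System.out.println("Heap fits in RAM: " + heapFitsInRam(maxHeap, physicalRam, reserved));
    }
}
```

Running it with different -Xmx values shows at which point the heap alone 
would no longer fit into the assumed RAM.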

	Andy

Re: TDB ObjectFileStorage Error

Posted by Al Baker <aj...@gmail.com>.
Hi Andy,

It's 32-bit -- no matter how much memory I feed the JVM, this always comes
up. At first, I thought I wasn't giving it enough RAM - but it was only 80k
triples and no amount of memory seemed enough.  That's when I realized this
particular program was "RDFizing" web content - in this case blog posts.  So
the literals could be quite large, e.g. several kilobytes.

I'll see if I can create a simple test case to reproduce the issue.

On a related note, I just tried TDB on a 32bit and a 64bit 512 Linode -
perhaps because of this issue, perhaps the app uses too much memory as it
is.  Either way, in both instances as that VM approaches complete memory
usage on the server and it starts to swap - everything slows down to be
completely unusable.  At first I thought TDB 64-bit would help, as the
memory-mapped files surely wouldn't wind up grinding the swap, but it only
appeared to get to maximum memory usage faster than when on a 32bit system.
It would be highly desirable if there was a "disable caching" for low-memory
environments.

I started poking around the TDB source tree last night, but nothing jumped
out as where the caching would be - can you point me in the right direction
here?

Al

On Mon, Jul 18, 2011 at 4:10 AM, Andy Seaborne <
andy.seaborne@epimorphics.com> wrote:

>
>
> On 18/07/11 07:17, Al Baker wrote:
>
>> Hi All,
>>
>> I've noticed that after creating a number of triples, eventually all TDB calls
>> start to fail.  Some of the objects stored are large strings (e.g. several
>> kilobytes).
>>
>> java.lang.OutOfMemoryError: Java heap space
>>     at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>>     at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>>     at com.hp.hpl.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:249)
>>     at com.hp.hpl.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:60)
>>
>> This fails on 0.8.10 or the 0.9 TxTDB.
>>
>> Al
>>
>>
> Hi Al,
>
> How much RAM are you giving the process?  32-bit or 64-bit JVM?
>
> Many very large strings might cause the node caching to use an excessive
> amount of memory (this area has not changed between 0.8 and 0.9).
>
> Space issues aside, it might be better to store large strings in a separate
> file space and point to them.  (Maybe TDB should do that automatically.)
>
>        Andy
>

Re: TDB ObjectFileStorage Error

Posted by Andy Seaborne <an...@epimorphics.com>.

On 18/07/11 07:17, Al Baker wrote:
> Hi All,
>
> I've noticed that after creating a number of triples, eventually all TDB calls
> start to fail.  Some of the objects stored are large strings (e.g. several
> kilobytes).
>
> java.lang.OutOfMemoryError: Java heap space
>      at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>      at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>      at com.hp.hpl.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:249)
>      at com.hp.hpl.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:60)
>
> This fails on 0.8.10 or the 0.9 TxTDB.
>
> Al
>

Hi Al,

How much RAM are you giving the process?  32-bit or 64-bit JVM?

Many very large strings might cause the node caching to use an excessive 
amount of memory (this area has not changed between 0.8 and 0.9).
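
To make the node-caching concern concrete, here is a minimal sketch of a 
cache that is bounded by entry count only, together with a worst-case 
estimate of the character data it can retain.  It is not TDB's cache 
implementation: the class NodeCacheEstimate, the 100,000-entry limit and 
the 4,096-character literal size are hypothetical, and the estimate 
assumes 2 bytes per character (UTF-16 strings, as on JVMs before Java 9).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NodeCacheEstimate {
    /** LRU cache limited by number of entries, not by the size of the values. */
    static <K, V> Map<K, V> lruCache(final int maxEntries) {
        // accessOrder = true makes iteration order least-recently-used first
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Worst-case bytes of character data held by a full cache (2 bytes per char assumed). */
    static long worstCaseCharBytes(long maxEntries, long charsPerLiteral) {
        return maxEntries * charsPerLiteral * 2L;
    }

    public static void main(String[] args) {
        // The cache keeps only the most recent entries, however large each value is.
        Map<Integer, String> cache = lruCache(3);
        for (int i = 0; i < 5; i++) {
            cache.put(i, "literal-" + i);
        }
        System.out.println("Entries retained: " + cache.keySet());

        // Hypothetical numbers: 100,000 cached nodes, 4,096 chars per literal.
        long bytes = worstCaseCharBytes(100000, 4096);
        System.out.println("Worst case: " + bytes + " bytes ("
                + bytes / (1024.0 * 1024.0) + " MiB)");
    }
}
```

With these hypothetical numbers the estimate comes to 819,200,000 bytes, 
about 781 MiB, for the character data alone, which is why a limit that is 
harmless for short URIs can exhaust the heap when the literals are blog 
posts.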

Space issues aside, it might be better to store large strings in a 
separate file space and point to them.  (Maybe TDB should do that 
automatically.)

	Andy
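
The suggestion above of keeping large strings in a separate file space 
and pointing to them could be sketched at the application level roughly 
as follows.  This is only one possible approach and not something TDB 
provides: the class LargeLiteralStore, the "blob:" pointer prefix, the 
SHA-1 file naming and the size threshold are all hypothetical choices.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class LargeLiteralStore {
    private static final String PREFIX = "blob:"; // hypothetical pointer marker
    private final Path dir;
    private final int thresholdChars;

    LargeLiteralStore(Path dir, int thresholdChars) throws IOException {
        this.dir = Files.createDirectories(dir);
        this.thresholdChars = thresholdChars;
    }

    /** Returns the text itself if it is small, otherwise a short pointer to a file. */
    String store(String text) throws IOException {
        if (text.length() <= thresholdChars) {
            return text;
        }
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        String key = sha1Hex(bytes);
        Path file = dir.resolve(key + ".txt");
        if (!Files.exists(file)) {
            Files.write(file, bytes); // identical content is written only once
        }
        return PREFIX + key;
    }

    /**
     * Resolves a value returned by store(). Limitation: a small literal that
     * itself starts with "blob:" would be misread as a pointer; in an RDF
     * store the pointer would more naturally be a URI node, not a literal.
     */
    String load(String stored) throws IOException {
        if (!stored.startsWith(PREFIX)) {
            return stored;
        }
        Path file = dir.resolve(stored.substring(PREFIX.length()) + ".txt");
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    private static String sha1Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) throws IOException {
        LargeLiteralStore store = new LargeLiteralStore(Files.createTempDirectory("literals"), 1000);
        StringBuilder post = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            post.append("blog post text ");
        }
        String ref = store.store(post.toString());
        System.out.println("Stored in the graph: " + ref);
        System.out.println("Round trip ok: " + store.load(ref).equals(post.toString()));
    }
}
```

Only the short pointer would then go into the triple store, so the node 
caches hold a few dozen characters per large literal instead of the whole 
text, at the cost of the text no longer being visible to queries.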