Posted to dev@jena.apache.org by "Scott Patterson (JIRA)" <ji...@apache.org> on 2018/08/01 14:09:00 UTC
[jira] [Commented] (JENA-1581) BufferOverflowException when

    [ https://issues.apache.org/jira/browse/JENA-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565378#comment-16565378 ] 

Scott Patterson commented on JENA-1581:
---------------------------------------

I will try to put together a small example to reproduce this.

In the meantime, I tested a small change to org.apache.jena.tdb.lib.NodeLib.encodeStore() so that it always allocates a new buffer, and the problem went away. Could the problem be that NodeLib reuses a single shared ByteBuffer in a way that is only safe when one thread calls it at a time?
{code:java}
// throws BufferOverflowException
final private static ByteBuffer workspace = ByteBuffer.allocate(SIZE);
public static long encodeStore(Node node, ObjectFile file) {
  int maxSize = nodec.maxSize(node);
  ByteBuffer bb = workspace;
  if ( maxSize >= SIZE )
    // Large object. Special buffer.
    bb = ByteBuffer.allocate(maxSize);
  else
    bb.clear();
  int len = nodec.encode(node, bb, null);
  long x = file.write(bb);
  return x;
}

// no problem
public static long encodeStore(Node node, ObjectFile file) {
  int maxSize = nodec.maxSize(node);
  ByteBuffer bb = ByteBuffer.allocate(maxSize);
  int len = nodec.encode(node, bb, null);
  long x = file.write(bb);
  return x;
}
{code}
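As a rough illustration of the concern above, the sketch below replays on a single thread one interleaving that two unsynchronized callers of a shared buffer could produce. It is not Jena code: the class name, the `SIZE` of 8, and the 6-byte payloads are hypothetical, and whether this is what actually happens inside NodeLib is an assumption.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Hypothetical stand-in for a "shared static workspace" pattern; not Jena code.
class SharedWorkspaceDemo {
    static final int SIZE = 8; // assumed tiny size so the effect is easy to see
    static final ByteBuffer workspace = ByteBuffer.allocate(SIZE);

    // Replays an interleaving where both callers clear the shared buffer
    // before either of them writes into it.
    static boolean interleavedWritesOverflow() {
        workspace.clear();                 // caller A: bb.clear()
        workspace.clear();                 // caller B: bb.clear()
        workspace.put(new byte[6]);        // caller A writes 6 bytes, position is now 6
        try {
            workspace.put(new byte[6]);    // caller B writes 6 bytes, but only 2 remain
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    // Each caller finishes clear + write before the next one starts.
    static boolean sequentialWritesOverflow() {
        try {
            workspace.clear();
            workspace.put(new byte[6]);    // caller A
            workspace.clear();
            workspace.put(new byte[6]);    // caller B
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("interleaved overflows: " + interleavedWritesOverflow());
        System.out.println("sequential overflows:  " + sequentialWritesOverflow());
    }
}
```

Each payload fits in the buffer on its own; only the interleaved ordering overflows, which would be consistent with the exception disappearing when a fresh buffer is allocated per call.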
 

That also made me wonder whether my application is using the API in a thread-safe manner.

Upon startup my application does:
{code:java}
String directory = "/path/to/tdb";
StoreConnection storeConn = StoreConnection.make(directory);
{code}
And during shutdown it does:
{code:java}
storeConn.getBaseDataset().close();
StoreConnection.release(Location.create(directory));
{code}

While the application is running, I create only one writer thread at a time, and it uses TDB like this:
{code:java}
Dataset ds = TDBFactory.createDataset(directory);
try {
  ds.begin(ReadWrite.WRITE);
  try {
    //add graphs
  } finally {
    ds.end();
  }
} finally {
  ds.close();
}
{code}
The application may also have at least two other reader threads using TDB like this:
{code:java}
Dataset ds = TDBFactory.createDataset(directory);
try {
  ds.begin(ReadWrite.READ);
  try {
    //
  } finally {
    ds.end();
  }
} finally {
  ds.close();
}
{code}
 

Is this usage correct? I'm wondering whether NodeLib is tripping up because multiple threads are using its private static ByteBuffer at the same time.
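If sharing does turn out to be the cause, one possible way to keep buffer reuse without cross-thread sharing would be a per-thread workspace. The sketch below is only an illustration under that assumption: `PerThreadWorkspace`, `acquire`, and the `SIZE` value are hypothetical names and numbers, not part of Jena.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical per-thread workspace; not part of Jena.
class PerThreadWorkspace {
    static final int SIZE = 1024; // assumed workspace size

    // Each thread lazily gets its own reusable buffer.
    private static final ThreadLocal<ByteBuffer> WORKSPACE =
            ThreadLocal.withInitial(() -> ByteBuffer.allocate(SIZE));

    // Returns a cleared buffer of at least maxSize bytes for the calling thread.
    static ByteBuffer acquire(int maxSize) {
        if (maxSize >= SIZE) {
            return ByteBuffer.allocate(maxSize); // large object: one-off buffer
        }
        ByteBuffer bb = WORKSPACE.get();
        bb.clear();
        return bb;
    }

    // Checks that a second thread does not see the calling thread's buffer.
    static boolean threadsGetDistinctBuffers() {
        AtomicReference<ByteBuffer> other = new AtomicReference<>();
        Thread t = new Thread(() -> other.set(acquire(16)));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return other.get() != acquire(16);
    }
}
```

The trade-off is one retained buffer per thread instead of one per process, in exchange for no synchronization on the encode path.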

 

> BufferOverflowException when
> ----------------------------
>
>                 Key: JENA-1581
>                 URL: https://issues.apache.org/jira/browse/JENA-1581
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB
>         Environment: Jena 3.8
>            Reporter: Scott Patterson
>            Priority: Major
>
> I cannot add a small number of named graphs (~1000) to TDB without encountering a BufferOverflowException. I have a single thread that adds all the graphs, and I get lots of these exceptions, with different named graphs each time. Any thoughts?
> {noformat}
> Caused by: java.nio.BufferOverflowException
>  at java.nio.Buffer.nextPutIndex(Buffer.java:532)
>  at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:169)
>  at org.apache.jena.atlas.io.BlockUTF8.fromCharsBuffer(BlockUTF8.java:172)
>  at org.apache.jena.atlas.io.BlockUTF8.fromChars(BlockUTF8.java:86)
>  at org.apache.jena.atlas.io.BlockUTF8.fromChars(BlockUTF8.java:230)
>  at org.apache.jena.tdb.store.nodetable.NodecSSE.encode(NodecSSE.java:89)
>  at org.apache.jena.tdb.lib.NodeLib.encodeStore(NodeLib.java:74)
>  at org.apache.jena.tdb.store.nodetable.NodeTableNative.writeNodeToTable(NodeTableNative.java:175)
>  at org.apache.jena.tdb.store.nodetable.NodeTableNative.accessIndex(NodeTableNative.java:152)
>  at org.apache.jena.tdb.store.nodetable.NodeTableNative._idForNode(NodeTableNative.java:125)
>  at org.apache.jena.tdb.store.nodetable.NodeTableNative.getAllocateNodeId(NodeTableNative.java:79)
>  at org.apache.jena.tdb.store.nodetable.NodeTableCache._idForNode(NodeTableCache.java:152)
>  at org.apache.jena.tdb.store.nodetable.NodeTableCache.getAllocateNodeId(NodeTableCache.java:91)
>  at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getAllocateNodeId(NodeTableWrapper.java:40)
>  at org.apache.jena.tdb.store.nodetable.NodeTableInline.getAllocateNodeId(NodeTableInline.java:51)
>  at org.apache.jena.tdb.store.nodetupletable.NodeTupleTableConcrete.addRow(NodeTupleTableConcrete.java:85)
>  at org.apache.jena.tdb.store.QuadTable.add(QuadTable.java:60)
>  at org.apache.jena.tdb.store.DatasetGraphTDB.addToNamedGraph(DatasetGraphTDB.java:97)
>  at org.apache.jena.sparql.core.DatasetGraphTriplesQuads.add(DatasetGraphTriplesQuads.java:44)
>  at org.apache.jena.sparql.core.GraphView.performAdd(GraphView.java:152)
>  at org.apache.jena.graph.GraphUtil$$Lambda$145.0000000020B643F0.accept(Unknown Source)
>  at java.util.ArrayList$Itr.forEachRemaining(ArrayList.java:910)
>  at org.apache.jena.graph.GraphUtil.addIteratorWorkerDirect(GraphUtil.java:151)
>  at org.apache.jena.graph.GraphUtil.addIteratorWorker(GraphUtil.java:145)
>  at org.apache.jena.graph.GraphUtil.addInto(GraphUtil.java:139)
>  at org.apache.jena.sparql.core.DatasetGraphTriplesQuads.addGraph(DatasetGraphTriplesQuads.java:80)
> {noformat}


