Posted to users@jena.apache.org by Adeeb Noor <ad...@colorado.edu> on 2014/02/09 22:42:06 UTC

Large TDB + Sparql Updates

Hi All:

I have a TDB dataset of almost 1 GB. I am trying to modify it using
SPARQL Update, but I get this error:

ERROR TDB                       :: ObjectFileStorage.read[nodes.dat](65419578)[filesize=103476978][file.size()=103476978]: Impossibly large object : 893792824 bytes > filesize-(loc+SizeOfInt)=38057396

14:33:42 ERROR BindingTDB                :: get1(?s)

com.hp.hpl.jena.tdb.base.file.FileException: ObjectFileStorage.read[nodes.dat](65419578)[filesize=103476978][file.size()=103476978]: Impossibly large object : 893792824 bytes > filesize-(loc+SizeOfInt)=38057396
at com.hp.hpl.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:346)
at com.hp.hpl.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:78)
at com.hp.hpl.jena.tdb.nodetable.NodeTableNative.readNodeFromTable(NodeTableNative.java:178)
at com.hp.hpl.jena.tdb.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103)
at com.hp.hpl.jena.tdb.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:74)
at com.hp.hpl.jena.tdb.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:103)
at com.hp.hpl.jena.tdb.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:74)
at com.hp.hpl.jena.tdb.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:55)
at com.hp.hpl.jena.tdb.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
at com.hp.hpl.jena.tdb.solver.BindingTDB.get1(BindingTDB.java:123)
at com.hp.hpl.jena.sparql.engine.binding.BindingBase.get(BindingBase.java:123)
at com.hp.hpl.jena.sparql.core.Var.lookup(Var.java:84)
at com.hp.hpl.jena.sparql.core.Var.lookup(Var.java:79)
at com.hp.hpl.jena.sparql.core.Substitute.substitute(Substitute.java:123)
at com.hp.hpl.jena.sparql.core.Substitute.substitute(Substitute.java:110)
at com.hp.hpl.jena.sparql.modify.TemplateLib.subst(TemplateLib.java:158)
at com.hp.hpl.jena.sparql.modify.TemplateLib$3.convert(TemplateLib.java:115)
at com.hp.hpl.jena.sparql.modify.TemplateLib$3.convert(TemplateLib.java:104)
at org.apache.jena.atlas.iterator.Iter$5.hasNext(Iter.java:337)
at com.hp.hpl.jena.sparql.modify.UpdateEngineWorker.execDelete(UpdateEngineWorker.java:496)
at com.hp.hpl.jena.sparql.modify.UpdateEngineWorker.visit(UpdateEngineWorker.java:418)
at com.hp.hpl.jena.sparql.modify.request.UpdateModify.visit(UpdateModify.java:97)
at com.hp.hpl.jena.sparql.modify.UpdateVisitorSink.send(UpdateVisitorSink.java:46)
at com.hp.hpl.jena.sparql.modify.UpdateVisitorSink.send(UpdateVisitorSink.java:26)
at org.apache.jena.atlas.iterator.Iter.sendToSink(Iter.java:578)
at org.apache.jena.atlas.iterator.Iter.sendToSink(Iter.java:586)
at com.hp.hpl.jena.sparql.modify.UpdateProcessorBase.execute(UpdateProcessorBase.java:57)
at com.hp.hpl.jena.update.UpdateAction.execute$(UpdateAction.java:190)
at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:181)
at com.hp.hpl.jena.update.UpdateAction.parseExecute(UpdateAction.java:135)
at com.hp.hpl.jena.update.UpdateAction.parseExecute(UpdateAction.java:106)

What should I do?
-- 
Adeeb Noor
Ph.D. Candidate
Dept of Computer Science
University of Colorado at Boulder
Cell: 571-484-3303
Email: Adeeb.noor@colorado.edu

Re: Large TDB + Sparql Updates

Posted by Andy Seaborne <an...@apache.org>.
On 11/02/14 01:19, Adeeb Noor wrote:
> Hi Andy:
>
> Thank you for the answer. I can query my database from TDB before I run
> the SPARQL update. I believe the problem is coming from the SPARQL update
> itself.

Sorry - it's not.  The symptom and the cause are in different places.

The update touches and materializes every subject in the database
(because of the WHERE { ?s ?p ?o }), and that happens to include a node
that has not been touched since the crash.  The cause is the crash/JVM
exit.
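
To make the symptom concrete, the numbers in the error message can be checked by hand. The sketch below assumes that nodes.dat stores each object as a 4-byte length followed by that many bytes, which is what the "loc+SizeOfInt" term in the message suggests. The class LengthPrefixCheck and its methods are hypothetical names for illustration only; this is not the TDB source code.

```java
public class LengthPrefixCheck {
    // Assumed size of the length field that precedes each stored object.
    static final int SIZE_OF_INT = 4;

    /** Bytes left in the file after the length field that starts at loc. */
    static long remainingAfterLength(long fileSize, long loc) {
        return fileSize - (loc + SIZE_OF_INT);
    }

    /** True if an object of the claimed length could still fit in the file. */
    static boolean isPlausible(long fileSize, long loc, int claimedLength) {
        return claimedLength >= 0
                && claimedLength <= remainingAfterLength(fileSize, loc);
    }

    public static void main(String[] args) {
        long fileSize = 103476978L; // filesize from the log line
        long loc = 65419578L;       // read location from the log line
        int claimed = 893792824;    // "Impossibly large object" size

        // 103476978 - (65419578 + 4) = 38057396, the limit quoted in the log
        System.out.println(remainingAfterLength(fileSize, loc));
        // The claimed length exceeds that limit, so the read is rejected
        System.out.println(isPlausible(fileSize, loc, claimed));
    }
}
```

Under that assumption, the length read at that location cannot be a real object length: it is larger than everything left in the file. That is consistent with the file having been damaged earlier, and with the update merely being the first operation to read that node.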

> Here is the code for the SPARQL update:
>
> System.out.println(" ... offsides Sparql Updates AFTER ...");
> data.startQuery();
> String query3 = data.loadQuery("src/offisdesTripleSparqlafterUpdates.tql");
> data.endQuery();
> UpdateAction.parseExecute(query3, data.tdb);
>
> and here is the SPARQL update:
>
> DELETE { ?s ?p ?o . }
> INSERT { ?dse ?p ?o . }
> WHERE {
>   ?s ?p ?o .
>   FILTER (contains(str(?s), "http://bio2rdf.org/offsides:"))
>   BIND(uri(CONCAT("https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.rdf#os_",
>                   replace(str(?s), "http://bio2rdf.org/offsides:", ""))) AS ?dse)
> };
>
> What do you think? Also, could you provide me with an example of using
> sync()?

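dataset.begin(ReadWrite.WRITE) ;
try {
  // Perform the update here, inside the write transaction.
  ....
  dataset.commit() ;
} finally {
  // end() closes the transaction; without a prior commit() the changes are discarded.
  dataset.end() ;
}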

http://jena.apache.org/documentation/tdb/tdb_transactions.html

	Andy



Re: Large TDB + Sparql Updates

Posted by Adeeb Noor <ad...@colorado.edu>.
Hi Andy:

Thank you for the answer. I can query my database from TDB before I run
the SPARQL update. I believe the problem is coming from the SPARQL update
itself.

Here is the code for the SPARQL update:

System.out.println(" ... offsides Sparql Updates AFTER ...");
data.startQuery();
String query3 = data.loadQuery("src/offisdesTripleSparqlafterUpdates.tql");
data.endQuery();
UpdateAction.parseExecute(query3, data.tdb);

and here is the SPARQL update:

DELETE { ?s ?p ?o . }
INSERT { ?dse ?p ?o . }
WHERE {
  ?s ?p ?o .
  FILTER (contains(str(?s), "http://bio2rdf.org/offsides:"))
  BIND(uri(CONCAT("https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.rdf#os_",
                  replace(str(?s), "http://bio2rdf.org/offsides:", ""))) AS ?dse)
};

What do you think? Also, could you provide me with an example of using
sync()?

Thanks





Re: Large TDB + Sparql Updates

Posted by Andy Seaborne <an...@apache.org>.
On 09/02/14 21:42, Adeeb Noor wrote:
> Hi All:
>
> I have a tdb dataset with almost 1g. I am trying to modify with using
> Sparql update, however I get this error:
>
> ERROR TDB                       ::
> ObjectFileStorage.read[nodes.dat](65419578)[filesize=103476978][file.size()=103476978]:
> Impossibly large object : 893792824 bytes >
> filesize-(loc+SizeOfInt)=38057396
>
> 14:33:42 ERROR BindingTDB                :: get1(?s)
>
> com.hp.hpl.jena.tdb.base.file.FileException:
> ObjectFileStorage.read[nodes.dat](65419578)[filesize=103476978][file.size()=103476978]:
> Impossibly large object : 893792824 bytes >
> filesize-(loc+SizeOfInt)=38057396

I'm afraid it looks like the dataset is corrupt.  This happened sometime
in the past, probably because the JVM crashed or exited without sync()
being called.

From the stack trace, I don't see any sign that you are using transactions.
One of the things transactions give you is resistance to crashes.  Used
directly, the code assumes you will cleanly exit the JVM - no crashes,
no exit without calling sync().

	Andy
