Posted to users@jena.apache.org by Ján Mojžiš <ja...@gmail.com> on 2011/09/23 11:16:43 UTC

out of heap while performing query

Hi there,
I would appreciate any help with my problem. In Jena TDB I query a (very
large) TDB dataset with:

PREFIX vcard:    <http://www.w3.org/2001/vcard-rdf/3.0#>
PREFIX dc:    <http://purl.org/dc/elements/1.1/>
PREFIX year:    <http://sw.deri.org/~aharth/2004/07/dblp/dblp.owl#year>
PREFIX foaf:    <http://xmlns.com/foaf/0.1/>

SELECT ?x ?titl ?creat  ?creat2 ?y ?date
WHERE
 {
    ?x dc:date "2002-01-03" .
    ?x dc:date ?date.
    ?x dc:identifier ?creat.
    ?x dc:creator ?creat2.
    ?x dc:title ?titl.
    ?x year: ?y
 }
After a while an exception is raised:

Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.<init>(Unknown Source)
    at com.hp.hpl.jena.tdb.solver.BindingNodeId.<init>(BindingNodeId.java:40)
    at com.hp.hpl.jena.tdb.solver.StageMatchTuple$1.convert(StageMatchTuple.java:113)
    at com.hp.hpl.jena.tdb.solver.StageMatchTuple$1.convert(StageMatchTuple.java:109)
    at org.openjena.atlas.iterator.Iter$4.next(Iter.java:267)
    at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:157)
    at org.openjena.atlas.iterator.Iter.hasNext(Iter.java:596)
    at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:46)
    at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:262)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:43)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:86)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:54)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:86)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:30)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:86)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:30)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:86)
    at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:57)
    at com.hp.hpl.jena.sparql.resultset.ResultSetMem.<init>(ResultSetMem.java:82)
    at com.hp.hpl.jena.sparql.resultset.TextOutput.write(TextOutput.java:133)
    at com.hp.hpl.jena.sparql.resultset.TextOutput.write(TextOutput.java:116)
    at com.hp.hpl.jena.sparql.resultset.TextOutput.write(TextOutput.java:104)
    at com.hp.hpl.jena.sparql.resultset.TextOutput.format(TextOutput.java:51)
    at com.hp.hpl.jena.query.ResultSetFormatter.out(ResultSetFormatter.java:96)
    at com.hp.hpl.jena.query.ResultSetFormatter.out(ResultSetFormatter.java:48)
    at com.hp.hpl.jena.query.ResultSetFormatter.asText(ResultSetFormatter.java:140)
    at jenaAtom.getResultsToFile(jenaAtom.java:220)
    at mainWindow$10.actionPerformed(mainWindow.java:417)
    at javax.swing.AbstractButton.fireActionPerformed(Unknown Source)
    at javax.swing.AbstractButton$Handler.actionPerformed(Unknown Source)
    at javax.swing.DefaultButtonModel.fireActionPerformed(Unknown Source)
    at javax.swing.DefaultButtonModel.setPressed(Unknown Source)

But when I run a simpler query:

PREFIX vcard:    <http://www.w3.org/2001/vcard-rdf/3.0#>
PREFIX dc:    <http://purl.org/dc/elements/1.1/>
PREFIX year:    <http://sw.deri.org/~aharth/2004/07/dblp/dblp.owl#year>
PREFIX foaf:    <http://xmlns.com/foaf/0.1/>

SELECT ?x
WHERE
 {
    ?x dc:date "2002-01-03" .
 }

no exception is raised.

What should I do? The JVM does not allow me more than 1500 MB of heap space.
Should I migrate to a 64-bit platform?


Thank you very much.
Regards
Jan Mojzis

Re: out of heap while performing query

Posted by Andy Seaborne <an...@apache.org>.
Hi there,

The internal cache settings for TDB should be (approximately) good for 
one dataset while leaving some space for the application. Things do 
depend on the data characteristics, particularly the size of literals 
(are there any large text literals in the datastore?).

I usually find the DB takes about 1G, leaving the rest of the heap to 
the rest of the app.  You're running Swing, the app, and TDB all in 
one JVM.

The query itself isn't very expensive in memory terms (TDB streams 
internally very heavily) but you are calling the text formatter to get 
text.  To find the column widths, the formatter has to make one pass 
over the results to measure the items in each column, then a second pass 
to format the output.  That means it has to keep a copy of the entire 
result set in memory.

Try an output form (JSON?) that does not require a temporary copy.
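
The difference between the two output styles can be sketched in plain Java.
This is an illustration only, not Jena's implementation: the class
FormatterSketch and its methods are hypothetical names, and the rows are
String arrays standing in for query solutions. It shows why column-aligned
text needs every row buffered while a row-at-a-time format does not.

```java
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch, not Jena code.
class FormatterSketch {

    // Two-pass, column-aligned output: every row stays in memory until the
    // widest value in each column is known.
    static void writeAligned(Iterator<String[]> rows, int columns, PrintWriter out) {
        List<String[]> buffered = new ArrayList<>();
        int[] width = new int[columns];
        while (rows.hasNext()) {                 // pass 1: buffer and measure
            String[] row = rows.next();
            buffered.add(row);
            for (int i = 0; i < columns; i++) {
                width[i] = Math.max(width[i], row[i].length());
            }
        }
        for (String[] row : buffered) {          // pass 2: pad and print
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < columns; i++) {
                line.append("| ").append(row[i]);
                for (int pad = row[i].length(); pad < width[i]; pad++) {
                    line.append(' ');
                }
                line.append(' ');
            }
            out.print(line.append("|\n"));
        }
        out.flush();
    }

    // One-pass output: each row is written and then dropped, so memory use
    // does not grow with the number of rows.
    static void writeStreaming(Iterator<String[]> rows, PrintWriter out) {
        while (rows.hasNext()) {
            out.print(String.join("\t", rows.next()));
            out.print("\n");
        }
        out.flush();
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"a", "bbb"},
                new String[] {"cc", "d"});
        PrintWriter out = new PrintWriter(System.out);
        writeAligned(rows.iterator(), 2, out);
        writeStreaming(rows.iterator(), out);
    }
}
```

In Jena itself, ResultSetFormatter.outputAsJSON(outputStream, resultSet) is
the JSON counterpart to the ResultSetFormatter.asText call that appears in
the stack trace.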

You might consider putting the SPARQL database in a server (Fuseki) and 
making remote calls to it (same machine or different).  This gets around 
the annoying 32-bit Java-ism of only 1.5G heap.
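
A remote call of this kind can be sketched with nothing but the JDK, using
the standard SPARQL protocol (HTTP GET with a "query" parameter). The
endpoint URL http://localhost:3030/ds/query and the class name
FusekiClientSketch are assumptions for illustration; the dataset name and
port depend on how the server is started. The response is copied through in
small chunks, so the client never holds the whole result set.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical sketch of a SPARQL-over-HTTP client.
class FusekiClientSketch {

    // Build the GET URL defined by the SPARQL protocol: endpoint?query=<encoded query>.
    static String buildQueryUrl(String endpoint, String sparql) {
        try {
            return endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Send the query and copy the JSON results to the sink chunk by chunk.
    static void streamQuery(String endpoint, String sparql, OutputStream sink) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(buildQueryUrl(endpoint, sparql)).openConnection();
        conn.setRequestProperty("Accept", "application/sparql-results+json");
        try (InputStream in = conn.getInputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
        } finally {
            conn.disconnect();
        }
        sink.flush();
    }

    public static void main(String[] args) throws IOException {
        String sparql = "SELECT ?x WHERE { ?x ?p ?o }";
        if (args.length == 0) {
            // No endpoint given: only show the request that would be sent.
            System.out.println(buildQueryUrl("http://localhost:3030/ds/query", sparql));
        } else {
            streamQuery(args[0], sparql, System.out);
        }
    }
}
```

Run without arguments it only prints the request URL; pass the endpoint of
a running server as the first argument to execute the query.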

Tuning the cache needs a recompile or using the undocumented properties 
file - it's simply setting constants in SystemTDB. See 
SystemTDB.readPropertiesFile.

	Andy
