Posted to users@jena.apache.org by "Neubert, Joachim" <J....@zbw.eu> on 2015/09/24 21:59:55 UTC

Fuseki2: GC overhead limit exceeded when loading data via HTTP PUT

I got an OutOfMemoryError while loading a Turtle file with fewer than 8000 triples into a named graph that already contained 45 million triples (for details see below; there are a few other graphs in the TDB store, each with about 100K triples or fewer).

After almost half an hour, I got the error below. The (virtual) machine has 32 GB of memory; Fuseki2 (for exact versions see below) was started with JAVA_OPTIONS="-Xmx6G".

Any ideas?

20:50:14 INFO  [37] PUT http://localhost:3030/ebstw/data?graph=http://zbw.eu/beta/ebds/ng
...
21:16:07 WARN  [37] RC = 500 : GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at com.hp.hpl.jena.tdb.base.record.RecordFactory.create(RecordFactory.java:87)
        at com.hp.hpl.jena.tdb.base.record.RecordFactory.buildFrom(RecordFactory.java:122)
        at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer._get(RecordBuffer.java:107)
        at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer.getHigh(RecordBuffer.java:67)
        at com.hp.hpl.jena.tdb.index.bplustree.BPTreeRecords.shiftRight(BPTreeRecords.java:221)
        at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.shiftRight(BPTreeNode.java:1012)
        at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.rebalance(BPTreeNode.java:832)
        at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalDelete(BPTreeNode.java:721)
        at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalDelete(BPTreeNode.java:735)
        at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalDelete(BPTreeNode.java:735)
        at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.delete(BPTreeNode.java:247)
        at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.deleteAndReturnOld(BPlusTree.java:342)
        at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.delete(BPlusTree.java:336)
        at com.hp.hpl.jena.tdb.store.tupletable.TupleIndexRecord.performDelete(TupleIndexRecord.java:70)
        at com.hp.hpl.jena.tdb.store.tupletable.TupleIndexBase.delete(TupleIndexBase.java:76)
        at com.hp.hpl.jena.tdb.store.tupletable.TupleTable.delete(TupleTable.java:149)
        at com.hp.hpl.jena.tdb.store.nodetupletable.NodeTupleTableConcrete.deleteRow(NodeTupleTableConcrete.java:110)
        at com.hp.hpl.jena.tdb.store.QuadTable.delete(QuadTable.java:81)
        at com.hp.hpl.jena.tdb.store.DatasetGraphTDB.deleteFromNamedGraph(DatasetGraphTDB.java:112)
        at com.hp.hpl.jena.sparql.core.DatasetGraphTriplesQuads.delete(DatasetGraphTriplesQuads.java:58)
        at com.hp.hpl.jena.sparql.core.DatasetGraphTrackActive.delete(DatasetGraphTrackActive.java:131)
        at com.hp.hpl.jena.sparql.core.DatasetGraphWrapper.delete(DatasetGraphWrapper.java:100)
        at com.hp.hpl.jena.sparql.core.DatasetGraphMonitor.delete$(DatasetGraphMonitor.java:153)
        at com.hp.hpl.jena.sparql.core.DatasetGraphMonitor.delete(DatasetGraphMonitor.java:142)
        at com.hp.hpl.jena.sparql.core.GraphView.performDelete(GraphView.java:153)
        at com.hp.hpl.jena.graph.impl.GraphBase.delete(GraphBase.java:225)
        at com.hp.hpl.jena.graph.GraphUtil.remove(GraphUtil.java:308)
        at com.hp.hpl.jena.graph.impl.GraphBase.clear(GraphBase.java:244)
        at org.apache.jena.fuseki.servlets.SPARQL_REST_RW.clearGraph(SPARQL_REST_RW.java:226)
        at org.apache.jena.fuseki.servlets.SPARQL_REST_RW.addDataIntoTxn(SPARQL_REST_RW.java:129)
        at org.apache.jena.fuseki.servlets.SPARQL_REST_RW.doPutPost(SPARQL_REST_RW.java:102)
        at org.apache.jena.fuseki.servlets.SPARQL_REST_RW.doPut(SPARQL_REST_RW.java:80)
21:16:07 INFO  [37] 500 GC overhead limit exceeded (1.552,938 s)

# java -cp /opt/fuseki/fuseki-server.jar tdb.tdbstats --version
Jena:       VERSION: 2.12.1
Jena:       BUILD_DATE: 2014-10-02T16:36:17+0100
ARQ:        VERSION: 2.12.1
ARQ:        BUILD_DATE: 2014-10-02T16:36:17+0100
RIOT:       VERSION: 2.12.1
RIOT:       BUILD_DATE: 2014-10-02T16:36:17+0100
TDB:        VERSION: 1.1.1
TDB:        BUILD_DATE: 2014-10-02T16:36:17+0100

# java -cp /opt/fuseki/fuseki-server.jar tdb.tdbstats --loc=. --graph=http://zbw.eu/beta/ebds/ng
(stats
  (meta
    (timestamp "2015-09-24T21:49:07.125+02:00"^^<http://www.w3.org/2001/XMLSchema#dateTime>)
    (run@ "2015/09/24 21:49:07 MESZ")
    (count 45087163))
  (<http://purl.org/dc/elements/1.1/type> 4279019)
  (<http://purl.org/dc/elements/1.1/creator> 2505439)
  (<http://purl.org/dc/terms/isPartOf> 189117)
  (<http://purl.org/dc/terms/source> 4279019)
  (<http://purl.org/dc/terms/description> 127946)
  (<http://purl.org/dc/elements/1.1/relation> 2367362)
  (<http://purl.org/dc/terms/creator> 1907335)
  (<http://purl.org/ontology/bibo/issn> 57108)
  (<http://purl.org/dc/elements/1.1/subject> 4263296)
  (<http://umbel.org/umbel#isLike> 1186896)
  (<http://purl.org/dc/terms/title> 4279019)
  (<http://purl.org/dc/terms/contributor> 905965)
  (<http://purl.org/dc/terms/alternative> 318253)
  (<http://purl.org/dc/elements/1.1/date> 4242995)
  (<http://purl.org/dc/terms/subject> 7948468)
  (<http://purl.org/ontology/bibo/isbn> 617274)
  (<http://purl.org/dc/elements/1.1/publisher> 1939890)
  (<http://purl.org/dc/elements/1.1/language> 2851709)
  (<http://purl.org/dc/elements/1.1/contributor> 821053)
  (other 0))

Re: Fuseki2: GC overhead limit exceeded when loading data via HTTP PUT

Posted by "Neubert, Joachim" <J....@zbw.eu>.
Hi Andy,

So sorry - POST had been intended, not PUT. I wrote it down several times, and looked at it intensely, but missed that simple error.

Anyway, thank you for the explanation regarding large delete transactions, a problem I've noticed in other places. Normally, deleting the files and rebuilding the database from scratch is what has worked best for me (in a scenario with relatively static data).

Cheers, Joachim

-----Original Message-----
From: Andy Seaborne [mailto:andy@apache.org]
Sent: Friday, 25 September 2015 13:31
To: users@jena.apache.org
Subject: Re: Fuseki2: GC overhead limit exceeded when loading data via HTTP PUT


Re: Fuseki2: GC overhead limit exceeded when loading data via HTTP PUT

Posted by Andy Seaborne <an...@apache.org>.
Hi Joachim,

The issue is not the 8000 triples being added but the 45 million triples 
being deleted.  A PUT replaces the target graph: it deletes everything 
in the target and then adds the new data.
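
To make the PUT/POST difference concrete, here is a minimal Python sketch that only builds the two kinds of Graph Store Protocol request, without sending anything. The endpoint and graph name are copied from the log line in the original post; the helper name graph_store_request and the example triple are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def graph_store_request(endpoint: str, graph_uri: str, turtle: bytes,
                        replace: bool = False) -> Request:
    """Build (but do not send) a Graph Store Protocol request for a named graph.

    replace=False -> POST: the payload is added to the existing graph.
    replace=True  -> PUT: the existing graph content is deleted first,
                     then the payload is loaded.
    """
    url = endpoint + "?" + urlencode({"graph": graph_uri})
    return Request(
        url,
        data=turtle,
        method="PUT" if replace else "POST",
        headers={"Content-Type": "text/turtle"},
    )


# Endpoint and graph name as in the log line of the original post;
# the triple itself is a placeholder.
req = graph_store_request(
    "http://localhost:3030/ebstw/data",
    "http://zbw.eu/beta/ebds/ng",
    b"<http://example.org/s> <http://example.org/p> <http://example.org/o> .\n",
)
print(req.get_method(), req.full_url)
# POST http://localhost:3030/ebstw/data?graph=http%3A%2F%2Fzbw.eu%2Fbeta%2Febds%2Fng
# To actually send it: urllib.request.urlopen(req)
```

With replace=False the 45 million existing triples are left alone, so the large delete never happens.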

Deleting a large amount of data in one single transaction is currently 
a struggle for TDB (versions newer than 2.12.1 will not make a 
difference).

1. Stephen Allen suggested enabling "spill to disk":

http://mail-archives.apache.org/mod_mbox/jena-users/201507.mbox/%3CCAPTxtVOZRzyPxN1njh3WVggsJEUNxeXDJhNvx%2BG4WcRtExxPxg%40mail.gmail.com%3E

Other possible workarounds are:

2. Delete sections of the data in separate transactions with SPARQL Update.

3. Dump the database, remove the graph by text processing, and reload.

4. Large heap, maybe temporarily.  This is one of the few occasions when 
a larger heap can help.

Workarounds 2-4 are not very transparent to system operation.
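
Workaround 2 could look roughly like the following Python sketch. It assumes that each HTTP update request runs as its own transaction, that the update service is at /ebstw/update (the original post only shows /ebstw/data), and that a batch of 100000 triples fits comfortably in the heap; all three are assumptions to check against your own setup. The function names are hypothetical.

```python
from typing import Callable
from urllib.request import Request, urlopen


def delete_batch_update(graph_uri: str, batch_size: int) -> str:
    """SPARQL Update that deletes at most batch_size triples from one graph."""
    return (
        f"DELETE {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }}\n"
        f"WHERE {{ SELECT ?s ?p ?o WHERE {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }} "
        f"LIMIT {batch_size} }}"
    )


def delete_graph_in_batches(graph_uri: str, total_triples: int, batch_size: int,
                            send: Callable[[str], None]) -> int:
    """Call send() once per batch; returns the number of batches issued."""
    batches = (total_triples + batch_size - 1) // batch_size  # ceiling division
    update = delete_batch_update(graph_uri, batch_size)
    for _ in range(batches):
        send(update)
    return batches


def send_update(update: str,
                endpoint: str = "http://localhost:3030/ebstw/update") -> None:
    """POST one update request; the endpoint path is an assumption."""
    req = Request(endpoint, data=update.encode("utf-8"), method="POST",
                  headers={"Content-Type": "application/sparql-update"})
    with urlopen(req) as resp:
        resp.read()


# Dry run: collect the updates instead of sending them.
sent = []
n = delete_graph_in_batches("http://zbw.eu/beta/ebds/ng", 45_087_163, 100_000, sent.append)
print(n)  # 451
```

Passing send_update instead of sent.append would issue the requests for real. The triple count is the one reported by tdbstats in the original post.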

(long term, there is a solution as part of rearchitecting TDB but that's 
a way off yet)
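
For workaround 3, the text-processing step might be sketched in Python as below. It assumes the dump is in N-Quads with one statement per line and whitespace between terms (as typical serializers write it); drop_graph is a hypothetical helper, not part of Jena.

```python
from typing import Iterable, Iterator


def drop_graph(lines: Iterable[str], graph_uri: str) -> Iterator[str]:
    """Yield every N-Quads line except the quads in the given named graph."""
    tag = f"<{graph_uri}>"
    for line in lines:
        body = line.strip()
        if body.startswith("#"):
            yield line  # comment lines pass through unchanged
            continue
        if body.endswith("."):
            body = body[:-1].rstrip()
        if body.endswith(tag):
            before = body[: -len(tag)]
            # Subject and predicate never contain whitespace, so a quad has at
            # least three whitespace-separated tokens before its graph label.
            # A default-graph triple whose object is the same IRI has only two.
            if before[-1:].isspace() and len(before.split()) >= 3:
                continue
        yield line


dump = [
    '<http://example.org/a> <http://example.org/p> "x y" <http://zbw.eu/beta/ebds/ng> .\n',
    '<http://example.org/a> <http://example.org/p> <http://example.org/b> <http://example.org/other> .\n',
    '<http://example.org/a> <http://example.org/p> <http://zbw.eu/beta/ebds/ng> .\n',
]
kept = list(drop_graph(dump, "http://zbw.eu/beta/ebds/ng"))
print(len(kept))  # 2
```

The filter streams line by line, so the whole dump never has to fit in memory. The filtered file is then loaded into a fresh database.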

	Sorry there isn't a simple solution,
	Andy
