Posted to dev@jena.apache.org by "Stephen Allen (JIRA)" <ji...@apache.org> on 2012/09/14 22:40:07 UTC

[jira] [Created] (JENA-321) Update notification events are fired on a per Graph basis instead of GraphStore

Stephen Allen created JENA-321:
----------------------------------

             Summary: Update notification events are fired on a per Graph basis instead of GraphStore
                 Key: JENA-321
                 URL: https://issues.apache.org/jira/browse/JENA-321
             Project: Apache Jena
          Issue Type: Improvement
          Components: ARQ
            Reporter: Stephen Allen
            Priority: Minor


Before every update operation starts, UpdateEngineMain attempts to fire notification events to listeners that an update is about to occur.  Unfortunately, it tries to fire an event for each named graph in the system.  Because TDB represents named graphs as quads, the only way to get a list of all the named graphs to fire an event for is to perform an entire table scan, project just the graph part of the quad and then perform a distinct operation.

There are a few problems with this approach:
  1) This is pretty dang inefficient, as the entire database is scanned on every update query
  2) With a large number of named graphs, you have to fire a lot of events, which is also inefficient
  3) If you have a lot of named graphs, the distinct operation has to store every graph name in an in-memory hashset

A user appears to have run into issue 3).  The underlying cause seems to be a mismatch in the design of the graph notification.  This needs to be redesigned to fire a single event for the entire graphstore.
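
To make the cost difference concrete, here is a minimal sketch in plain Java. It does not use Jena; NotificationSketch, Quad, notifyPerGraph and notifyStore are hypothetical names invented for illustration, and the event is reduced to a string appended to a log. The per-graph variant has to scan every quad and hold every distinct graph name in a HashSet, which is the pattern visible in the stack trace below. The store-level variant does constant work.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration; none of these are Jena classes.
final class NotificationSketch {

    /** A quad reduced to strings: a graph name plus one triple. */
    static final class Quad {
        final String graph;
        final String triple;

        Quad(String graph, String triple) {
            this.graph = graph;
            this.triple = triple;
        }
    }

    /** Per-graph notification: full scan, project the graph name, de-duplicate in memory. */
    static int notifyPerGraph(List<Quad> store, List<String> eventLog) {
        Set<String> seen = new HashSet<>();   // grows with the number of distinct graph names
        for (Quad q : store) {                // visits every quad on every update request
            if (seen.add(q.graph)) {          // add() returns true only for a new graph name
                eventLog.add("startRequest " + q.graph);
            }
        }
        return seen.size();
    }

    /** Store-level notification: a single event, independent of the data size. */
    static int notifyStore(List<String> eventLog) {
        eventLog.add("startRequest <graphstore>");
        return 1;
    }

    public static void main(String[] args) {
        List<Quad> store = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            store.add(new Quad("graph" + (i % 250), "s p o" + i));
        }
        List<String> perGraph = new ArrayList<>();
        List<String> perStore = new ArrayList<>();
        notifyPerGraph(store, perGraph);
        notifyStore(perStore);
        System.out.println(perGraph.size() + " per-graph events vs " + perStore.size() + " store event");
    }
}
```

In the sketch the memory held by the set is proportional to the number of distinct graph names, which corresponds to problem 3) above.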

Relevant code in DatasetGraphTDB.java (line 262).



14:08:23 WARN  Fuseki               :: [1] RC = 500 : Java heap space
java.lang.OutOfMemoryError: Java heap space
	at java.util.HashMap.resize(HashMap.java:462)
	at java.util.HashMap.addEntry(HashMap.java:755)
	at java.util.HashMap.put(HashMap.java:385)
	at java.util.HashSet.add(HashSet.java:200)
	at org.openjena.atlas.iterator.FilterUnique.accept(FilterUnique.java:35)
	at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:187)
	at org.openjena.atlas.iterator.Iter.hasNext(Iter.java:825)
	at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
	at com.hp.hpl.jena.sparql.modify.GraphStoreUtils.actionAll(GraphStoreUtils.java:43)
	at com.hp.hpl.jena.sparql.modify.GraphStoreUtils.sendToAll(GraphStoreUtils.java:33)
	at com.hp.hpl.jena.sparql.modify.GraphStoreBasic.startRequest(GraphStoreBasic.java:53)
	at com.hp.hpl.jena.sparql.modify.UpdateEngineMain.execute(UpdateEngineMain.java:37)
	at com.hp.hpl.jena.sparql.modify.UpdateProcessorBase.execute(UpdateProcessorBase.java:56)
	at com.hp.hpl.jena.update.UpdateAction.execute$(UpdateAction.java:330)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:323)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:303)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:255)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:235)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.executeForm(SPARQL_Update.java:226)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.perform(SPARQL_Update.java:122)
	at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:97)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.doPost(SPARQL_Update.java:78)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:547)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:480)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:225)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:941)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:409)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:875)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
14:08:23 INFO  Fuseki               :: [1] 500 Java heap space

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (JENA-321) Update notification events are fired on a per Graph basis instead of GraphStore

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne resolved JENA-321.
--------------------------------

       Resolution: Fixed
    Fix Version/s: ARQ 2.9.4
    
> Update notification events are fired on a per Graph basis instead of GraphStore
> -------------------------------------------------------------------------------
>
>                 Key: JENA-321
>                 URL: https://issues.apache.org/jira/browse/JENA-321
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Stephen Allen
>            Assignee: Andy Seaborne
>            Priority: Minor
>             Fix For: ARQ 2.9.4


[jira] [Commented] (JENA-321) Update notification events are fired on a per Graph basis instead of GraphStore

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456407#comment-13456407 ] 

Andy Seaborne commented on JENA-321:
------------------------------------

Proposal: start request does not send an event to every graph, just to the dataset object. I don't think anything relies on GS-triggered events.

For now, before an event model is sorted out, no event is better than a contract we don't want later.
(Ditto finish request.)

Long term: a formal contract for the events on a graphstore/dataset and their relationship to graph events. There are distinct kinds of dataset: ones that are raw storage and ones that are a collection of graphs.  Getting perfect uniformity may not make sense; it is hard/costly to have triple events on quad actions where the graph is a view of the dataset.

See also a discussion on JENA-189
                
> Update notification events are fired on a per Graph basis instead of GraphStore
> -------------------------------------------------------------------------------
>
>                 Key: JENA-321
>                 URL: https://issues.apache.org/jira/browse/JENA-321
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Stephen Allen
>            Priority: Minor


[jira] [Updated] (JENA-321) Update notification events are fired on a per Graph basis instead of GraphStore

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen updated JENA-321:
-------------------------------

    Description: 
Before every update operation starts, UpdateEngineMain attempts to fire notification events to listeners that an update is about to occur.  Unfortunately, it tries to fire an event for each named graph in the system.  Because TDB represents named graphs as quads, the only way to get a list of all the named
graphs to fire an event for is to perform an entire table scan, project just the graph part of the quad and then perform a distinct operation.

There are a few problems with this approach:
  1) This is pretty dang inefficient, as the entire database is scanned on every update query
  2) With a large number of named graphs, you have to fire a lot of events, which is also inefficient
  3) If you have a lot of named graphs, the distinct operation has to store every graph name in an in-memory hashset

A user appears to have run into issue 3).  The underlying cause seems to be a mismatch in the design of the graph notification.  This needs to be redesigned to fire a single event for the entire graphstore.

Relevant code in DatasetGraphTDB.java (line 262).





    
> Update notification events are fired on a per Graph basis instead of GraphStore
> -------------------------------------------------------------------------------
>
>                 Key: JENA-321
>                 URL: https://issues.apache.org/jira/browse/JENA-321
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Stephen Allen
>            Priority: Minor


[jira] [Closed] (JENA-321) Update notification events are fired on a per Graph basis instead of GraphStore

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne closed JENA-321.
------------------------------

    
> Update notification events are fired on a per Graph basis instead of GraphStore
> -------------------------------------------------------------------------------
>
>                 Key: JENA-321
>                 URL: https://issues.apache.org/jira/browse/JENA-321
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Stephen Allen
>            Assignee: Andy Seaborne
>            Priority: Minor
>             Fix For: ARQ 2.9.4


[jira] [Comment Edited] (JENA-321) Update notification events are fired on a per Graph basis instead of GraphStore

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456407#comment-13456407 ] 

Andy Seaborne edited comment on JENA-321 at 9/16/12 2:44 AM:
-------------------------------------------------------------

The first step is to remove this inefficiency. 

Proposal (immediate): don't send events on start request and finish request.

For now, before an event model for datasets is sorted out, no event is better than a contract we don't want later.
(Ditto finish request.)

Fix applied.

This is indicative of a fundamental design issue: do the graphs of a dataset see dataset events? There are two distinct classes of dataset - ones that are raw storage and ones that are a collection of graphs.  

At the moment, it looks to me like the best approach is that operations on graphs cause graph events, but operations on datasets do not cause graph events directly (they may do so indirectly through operations that call the graph add/delete).  This seems to be the only approach that gives the same semantics for all datasets.
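
A minimal sketch of that contract in plain Java, for the "collection of graphs" kind of dataset only. None of this is Jena code: SketchGraph, SketchDataset, GraphListener and demo are hypothetical names invented for illustration. Graph operations notify the graph's listeners; the dataset's start/finish request calls send nothing and never enumerate graphs; a dataset-level add reaches listeners only because it is routed through the graph's own add.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DatasetEventSketch {

    /** Hypothetical listener: told about triples added to one graph. */
    interface GraphListener {
        void onAdd(String graphName, String triple);
    }

    /** Hypothetical graph: every add fires a graph event. */
    static final class SketchGraph {
        private final String name;
        private final List<String> triples = new ArrayList<>();
        private final List<GraphListener> listeners = new ArrayList<>();

        SketchGraph(String name) { this.name = name; }

        void register(GraphListener listener) { listeners.add(listener); }

        void add(String triple) {
            triples.add(triple);
            for (GraphListener listener : listeners) {
                listener.onAdd(name, triple);
            }
        }
    }

    /** Hypothetical dataset built as a collection of graphs. */
    static final class SketchDataset {
        private final Map<String, SketchGraph> graphs = new HashMap<>();

        SketchGraph graph(String name) {
            return graphs.computeIfAbsent(name, SketchGraph::new);
        }

        // Request brackets: no events are sent and no graph is enumerated.
        void startRequest() { }
        void finishRequest() { }

        // Dataset-level operation: any graph event is indirect, via graph add.
        void addQuad(String graphName, String triple) {
            graph(graphName).add(triple);
        }
    }

    /** Runs a small scenario and returns the events the listener saw. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        SketchDataset dataset = new SketchDataset();
        dataset.graph("g1").register((g, t) -> log.add(g + " " + t));

        dataset.startRequest();             // fires nothing
        dataset.graph("g1").add("t1");      // direct graph operation
        dataset.addQuad("g1", "t2");        // indirect, through graph add
        dataset.addQuad("g2", "t3");        // g2 has no listener
        dataset.finishRequest();            // fires nothing
        return log;
    }

    public static void main(String[] args) {
        for (String event : demo()) {
            System.out.println(event);
        }
    }
}
```

A raw-storage dataset that writes quads without going through a graph object would, under the same contract, fire no graph events at all for addQuad; that case is not modelled here.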

See also the discussion on JENA-189.
                
      was (Author: andy.seaborne):
    Proposal: start request does not send an event to every graph, just to the dataset object. I don't think anything relies on GS-triggered events.

For now, before an event model is sorted out, no event is better than a contract we don't want later.
(Ditto finish request.)

Long term: a formal contract in the events on a graphstore/dataset and the relationship to graph events. There are distinct kinds of dataset - ones that are raw storage and ones that are a collection of graphs.  Getting perfect uniformity may not make sense -- hard/costly to have triple events on quad actions where the graph is a view of the dataset.

See also a discussion on JENA-189
                  


[jira] [Assigned] (JENA-321) Update notification events are fired on a per Graph basis instead of GraphStore

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne reassigned JENA-321:
----------------------------------

    Assignee: Andy Seaborne
    


[jira] [Updated] (JENA-321) Update notification events are fired on a per Graph basis instead of GraphStore

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen updated JENA-321:
-------------------------------

    Description: 
Before every update operation starts, UpdateEngineMain attempts to fire notification events to listeners
that an update is about to occur.  Unfortunately, it tries to fire an event for each named graph in the system.  Because TDB represents named graphs as quads, the only way to get a list of all the named
graphs to fire an event for is to perform an entire table scan, project just the graph part of the quad and then perform a distinct operation.

There are a few problems with this approach:
  1) This is pretty dang inefficient, as the entire database is scanned on every update query
  2) With a large number of named graphs, you have to fire a lot of events, which is also inefficient
  3) If you have a lot of named graphs, the distinct operation has to store every graph name in an in-memory hashset

A user appears to have run into issue 3).  The underlying cause seems to be a mismatch in the design of the graph notification.  This needs to be redesigned to fire a single event for the entire graphstore.

Relevant code in DatasetGraphTDB.java (line 262).
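
To make the three costs listed above concrete, here is a standalone sketch in plain Java. It is not the code from DatasetGraphTDB.java or GraphStoreUtils and uses no Jena or TDB classes: Quad, sampleTable, perGraphEvents and perStoreEvents are hypothetical names, and the quad table is just an in-memory list. It contrasts the scan/project/distinct approach with firing a single event for the whole store.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class GraphNameScanSketch {

    /** Hypothetical quad: a graph name plus a triple, all plain strings. */
    static final class Quad {
        final String graph;
        final String subject;
        final String predicate;
        final String object;

        Quad(String graph, String subject, String predicate, String object) {
            this.graph = graph;
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Builds graphCount named graphs with quadsPerGraph quads each. */
    static List<Quad> sampleTable(int graphCount, int quadsPerGraph) {
        List<Quad> table = new ArrayList<>();
        for (int g = 0; g < graphCount; g++) {
            for (int i = 0; i < quadsPerGraph; i++) {
                table.add(new Quad("http://example/g" + g, "s" + i, "p", "o" + i));
            }
        }
        return table;
    }

    /** Per-graph notification: returns how many events would be fired. */
    static int perGraphEvents(List<Quad> table) {
        Set<String> seen = new HashSet<>();  // problem 3: one entry per graph name
        int events = 0;
        for (Quad quad : table) {            // problem 1: touches every quad
            if (seen.add(quad.graph)) {      // project the graph slot, keep distinct
                events++;                    // problem 2: one event per named graph
            }
        }
        return events;
    }

    /** Per-store notification: one event, no scan, no set of names. */
    static int perStoreEvents() {
        return 1;
    }

    public static void main(String[] args) {
        List<Quad> table = sampleTable(1000, 3);
        System.out.println("per-graph events: " + perGraphEvents(table));
        System.out.println("per-store events: " + perStoreEvents());
    }
}
```

In the sketch the set of seen names grows with the number of named graphs, which matches the HashSet.add frames at the top of the stack trace below; the per-store variant does work that is independent of the data size.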



14:08:23 WARN  Fuseki               :: [1] RC = 500 : Java heap space
java.lang.OutOfMemoryError: Java heap space
	at java.util.HashMap.resize(HashMap.java:462)
	at java.util.HashMap.addEntry(HashMap.java:755)
	at java.util.HashMap.put(HashMap.java:385)
	at java.util.HashSet.add(HashSet.java:200)
	at org.openjena.atlas.iterator.FilterUnique.accept(FilterUnique.java:35)
	at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:187)
	at org.openjena.atlas.iterator.Iter.hasNext(Iter.java:825)
	at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
	at com.hp.hpl.jena.sparql.modify.GraphStoreUtils.actionAll(GraphStoreUtils.java:43)
	at com.hp.hpl.jena.sparql.modify.GraphStoreUtils.sendToAll(GraphStoreUtils.java:33)
	at com.hp.hpl.jena.sparql.modify.GraphStoreBasic.startRequest(GraphStoreBasic.java:53)
	at com.hp.hpl.jena.sparql.modify.UpdateEngineMain.execute(UpdateEngineMain.java:37)
	at com.hp.hpl.jena.sparql.modify.UpdateProcessorBase.execute(UpdateProcessorBase.java:56)
	at com.hp.hpl.jena.update.UpdateAction.execute$(UpdateAction.java:330)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:323)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:303)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:255)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:235)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.executeForm(SPARQL_Update.java:226)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.perform(SPARQL_Update.java:122)
	at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:97)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.doPost(SPARQL_Update.java:78)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:547)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:480)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:225)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:941)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:409)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:875)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
14:08:23 INFO  Fuseki               :: [1] 500 Java heap space

  was:
Before every update operation starts, UpdateEngineMain attempts to fire notification events to listeners
that an update is about to occur.  Unfortunately, it tries to fire an event for each named graph in the system.  Because TDB represents named graphs as quads, the only way to get a list of all the named
graphs to fire an event for is to perform an entire table scan, project just the graph part of the quad and then perform a distinct operation.

There are a few problems with this approach:
# This is pretty dang inefficient, as the entire database is scanned on every update query
# With a large number of named graphs, you have to fire a lot of events, which is also inefficient
# If you have a lot of named graphs, the distinct operation has to store every graph name in an in-memory hashset

A user appears to have run into issue 3).  The underlying cause seems to be a mismatch in the design of the graph notification.  This needs to be redesigned to fire a single event for the entire graphstore.

Relevant code in DatasetGraphTDB.java (line 262).



14:08:23 WARN  Fuseki               :: [1] RC = 500 : Java heap space
java.lang.OutOfMemoryError: Java heap space
	at java.util.HashMap.resize(HashMap.java:462)
	at java.util.HashMap.addEntry(HashMap.java:755)
	at java.util.HashMap.put(HashMap.java:385)
	at java.util.HashSet.add(HashSet.java:200)
	at org.openjena.atlas.iterator.FilterUnique.accept(FilterUnique.java:35)
	at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:187)
	at org.openjena.atlas.iterator.Iter.hasNext(Iter.java:825)
	at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
	at com.hp.hpl.jena.sparql.modify.GraphStoreUtils.actionAll(GraphStoreUtils.java:43)
	at com.hp.hpl.jena.sparql.modify.GraphStoreUtils.sendToAll(GraphStoreUtils.java:33)
	at com.hp.hpl.jena.sparql.modify.GraphStoreBasic.startRequest(GraphStoreBasic.java:53)
	at com.hp.hpl.jena.sparql.modify.UpdateEngineMain.execute(UpdateEngineMain.java:37)
	at com.hp.hpl.jena.sparql.modify.UpdateProcessorBase.execute(UpdateProcessorBase.java:56)
	at com.hp.hpl.jena.update.UpdateAction.execute$(UpdateAction.java:330)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:323)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:303)
	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:255)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:235)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.executeForm(SPARQL_Update.java:226)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.perform(SPARQL_Update.java:122)
	at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:97)
	at org.apache.jena.fuseki.servlets.SPARQL_Update.doPost(SPARQL_Update.java:78)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:547)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:480)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:225)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:941)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:409)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:875)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
14:08:23 INFO  Fuseki               :: [1] 500 Java heap space

    
