Posted to dev@jena.apache.org by "Andy Seaborne (JIRA)" <ji...@apache.org> on 2012/09/15 15:24:07 UTC

[jira] [Commented] (JENA-321) Update notification events are fired on a per Graph basis instead of GraphStore

    [ https://issues.apache.org/jira/browse/JENA-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456407#comment-13456407 ] 

Andy Seaborne commented on JENA-321:
------------------------------------

Proposal: a start request does not send an event to every graph, only to the dataset object. I don't think anything relies on GraphStore-triggered events.

For now, before an event model is sorted out, no event is better than a contract we don't want later.
(Ditto finish request.)
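
To illustrate the difference, here is a minimal sketch in plain Java. It is not Jena code: Quad, RequestListener, RecordingListener, notifyEveryGraph and notifyDataset are all hypothetical names invented for this example. It assumes a quad-based store where the only way to find the graph names is to scan every quad and keep the distinct names in memory, as the issue describes, and compares that with sending a single event to the dataset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DatasetEventSketch {

    /** A quad reduced to strings: the graph name plus one triple. */
    static final class Quad {
        final String graph, subject, predicate, object;

        Quad(String graph, String subject, String predicate, String object) {
            this.graph = graph;
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Hypothetical listener interface; this is not Jena's event API. */
    interface RequestListener {
        void startRequest(String target);
    }

    /** Records every target it was notified about. */
    static final class RecordingListener implements RequestListener {
        final List<String> targets = new ArrayList<>();

        @Override
        public void startRequest(String target) {
            targets.add(target);
        }
    }

    /**
     * Per-graph notification: scans all quads, keeps each distinct graph
     * name in an in-memory set, and fires one event per graph.
     * Returns how many graph names had to be held in memory.
     */
    static int notifyEveryGraph(Iterable<Quad> quads, RequestListener listener) {
        Set<String> seen = new HashSet<>();
        for (Quad q : quads) {
            if (seen.add(q.graph)) {
                listener.startRequest(q.graph);
            }
        }
        return seen.size();
    }

    /** Dataset-level notification: one event, no scan, no per-graph state. */
    static void notifyDataset(RequestListener listener) {
        listener.startRequest("dataset");
    }

    public static void main(String[] args) {
        List<Quad> quads = Arrays.asList(
                new Quad("g1", "s", "p", "o1"),
                new Quad("g1", "s", "p", "o2"),
                new Quad("g2", "s", "p", "o3"));

        RecordingListener perGraph = new RecordingListener();
        int held = notifyEveryGraph(quads, perGraph);

        RecordingListener perDataset = new RecordingListener();
        notifyDataset(perDataset);

        System.out.println("per-graph events: " + perGraph.targets.size()
                + ", graph names held in memory: " + held);
        System.out.println("dataset events: " + perDataset.targets.size());
    }
}
```

In this sketch the per-graph path does work proportional to the number of quads and holds one set entry per graph, while the dataset path does a constant amount of work regardless of how many named graphs exist.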

Long term: a formal contract for the events on a graphstore/dataset and their relationship to graph events. There are distinct kinds of dataset: ones that are raw storage and ones that are a collection of graphs. Perfect uniformity may not make sense; it is hard/costly to have triple events on quad actions where the graph is a view of the dataset.

See also the discussion on JENA-189.
                
> Update notification events are fired on a per Graph basis instead of GraphStore
> -------------------------------------------------------------------------------
>
>                 Key: JENA-321
>                 URL: https://issues.apache.org/jira/browse/JENA-321
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Stephen Allen
>            Priority: Minor
>
> Before every update operation starts, UpdateEngineMain attempts to fire notification events to listeners that an update is about to occur.  Unfortunately, it tries to fire an event for each named graph in the system.  Because TDB represents named graphs as quads, the only way to get a list of all the named graphs to fire an event for is to perform an entire table scan, project just the graph part of the quad, and then perform a distinct operation.
> There are a few problems with this approach:
>   1) This is pretty dang inefficient, as the entire database is scanned on every update query
>   2) With a large number of named graphs, you have to fire a lot of events, which is also inefficient
>   3) If you have a lot of named graphs, the distinct operation has to store every graph name in an in-memory hashset
> A user appears to have run into issue 3).  The underlying cause seems to be a mismatch in the design of the graph notification.  This needs to be redesigned to fire a single event for the entire graphstore.
> Relevant code in DatasetGraphTDB.java (line 262).
> 14:08:23 WARN  Fuseki               :: [1] RC = 500 : Java heap space
> java.lang.OutOfMemoryError: Java heap space
> 	at java.util.HashMap.resize(HashMap.java:462)
> 	at java.util.HashMap.addEntry(HashMap.java:755)
> 	at java.util.HashMap.put(HashMap.java:385)
> 	at java.util.HashSet.add(HashSet.java:200)
> 	at org.openjena.atlas.iterator.FilterUnique.accept(FilterUnique.java:35)
> 	at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:187)
> 	at org.openjena.atlas.iterator.Iter.hasNext(Iter.java:825)
> 	at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
> 	at com.hp.hpl.jena.sparql.modify.GraphStoreUtils.actionAll(GraphStoreUtils.java:43)
> 	at com.hp.hpl.jena.sparql.modify.GraphStoreUtils.sendToAll(GraphStoreUtils.java:33)
> 	at com.hp.hpl.jena.sparql.modify.GraphStoreBasic.startRequest(GraphStoreBasic.java:53)
> 	at com.hp.hpl.jena.sparql.modify.UpdateEngineMain.execute(UpdateEngineMain.java:37)
> 	at com.hp.hpl.jena.sparql.modify.UpdateProcessorBase.execute(UpdateProcessorBase.java:56)
> 	at com.hp.hpl.jena.update.UpdateAction.execute$(UpdateAction.java:330)
> 	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:323)
> 	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:303)
> 	at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:255)
> 	at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:235)
> 	at org.apache.jena.fuseki.servlets.SPARQL_Update.executeForm(SPARQL_Update.java:226)
> 	at org.apache.jena.fuseki.servlets.SPARQL_Update.perform(SPARQL_Update.java:122)
> 	at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:97)
> 	at org.apache.jena.fuseki.servlets.SPARQL_Update.doPost(SPARQL_Update.java:78)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> 	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:547)
> 	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:480)
> 	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:225)
> 	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:941)
> 	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:409)
> 	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
> 	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:875)
> 	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
> 14:08:23 INFO  Fuseki               :: [1] 500 Java heap space
