You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jena.apache.org by "Stephen Owens (JIRA)" <ji...@apache.org> on 2012/05/09 15:31:49 UTC

[jira] [Commented] (JENA-244) Deadlock during SPARQL execution on an inference model

    [ https://issues.apache.org/jira/browse/JENA-244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271403#comment-13271403 ] 

Stephen Owens commented on JENA-244:
------------------------------------

Versions involved:
==============
jena-arq-2.9.0-incubating.jar
jena-core-2.7.0-incubating.jar


We discovered a Jena deadlock during some long running stability tests. Jena is being used in a service accessed by a web application so there were multiple web threads accessing the same underlying model. The model was protected by the standard enterCriticalSection calls, set for read because we were just querying. This worked perfectly until we switched to an RDFS model. The RDFS model exhibited occasional deadlocks. 

The deadlock appears to be an issue with out of order lock acquisition. The first thread acquired a lock on LPTopGoalIterator and then tried to get a lock LPBRuleEngine, the second thread had a lock on LPBRuleEngine and tried to get a lock on LPTopGoalIterator with the expected result that nobody is going anywhere from that point forward. I traced through the relevant Jena code and the issue seems to be that one of the two queries is the first query against that model and so is triggering the inferencing model to add rules to the model. In that first query the following sequence happens:

- Because it is the first query against an inference model the FBRuleInfGraph.prepare method triggers the addition of inferences to the model which turns the read only execution of the query into an update of the model. 
- The update uses the LPBRuleEngine.addRule method which is synchronized thus acquiring a lock on LPBRuleEngine. 
- Before trying to update it calls LPBRuleEngine.checkSafeToUpdate to see if there are any outstanding queries.
- Surprisingly, at least to me in reviewing the code, checkSafeToUpdate tries to close any statement iterators it finds open. I don't see how this could be a safe thing to do in a multi-threaded environment but I don't understand the code well enough to be sure. 
- That close operation calls LPTopGoalIterator.close which is a synchronized method and waits at that point until it can acquire a lock on LPTopGoalIterator.


Meanwhile a separate thread executing a query is doing this:

- Using LPTopGoalIterator.moveForward to move through its goals.
- Since this is a synchronized method it acquires a lock on LPTopGoalIterator
- That method synchronizes on its LPBRuleEngine and waits until it can acquire the lock. 


And at this point we're in a deadlock. 

In terms of a work around I can probably call prepare before querying the model the first time. That will likely work as long as I'm not planning to write to the model after the initial prepare since any write would invalidate the preparation and leave it prone to deadlock again. In my particular case the information is likely static so that might work. 

The longer term fix will require a change to the lock acquisition strategy. the LPTopGoalIterator.hasNext method could synchronize on the engine before calling moveForward, moveForward could become not synchronized and internally synchronize on LPBRuleEngine first and only after it acquires that synchronize on itself. Alternately the checkSafeToUpdate could stop trying to close external iterators and either throw an exception or wait and retry if it finds any open iterators. In the case I'm seeing I doubt the retry strategy would be viable since neither thread yet have a valid model. 

I suspect that the right solution may be much further up the stack. Maybe the model preparation should be done higher before it gets so deep into the SPARQL execution phase. That might allow for a cleaner locking strategy that doesn't allow multiple SPARQL evaluations to start until the model is stable. Maybe all the way up at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.execConstruct()? At this point I'm just guessing because I don't know that section of the code well enough. I'm not submitting a patch because I'd like feedback from someone that knows this code well on their suggested approach. 


Here's the trace of a deadlock:

        at com.hp.hpl.jena.reasoner.rulesys.impl.LPTopGoalIterator.moveForward(LPTopGoalIterator.java:83)
        - waiting to lock <0x0000000779070940> (a com.hp.hpl.jena.reasoner.rulesys.impl.LPBRuleEngine)
        - locked <0x0000000776153ff0> (a com.hp.hpl.jena.reasoner.rulesys.impl.LPTopGoalIterator)
        at com.hp.hpl.jena.reasoner.rulesys.impl.LPTopGoalIterator.hasNext(LPTopGoalIterator.java:196)
        at com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
        at com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
        at com.hp.hpl.jena.util.iterator.UniqueExtendedIterator.hasNext(UniqueExtendedIterator.java:78)
        at com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
        at com.hp.hpl.jena.util.iterator.FilterIterator.hasNext(FilterIterator.java:55)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:151)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:79)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterBlockTriples.hasNextBinding(QueryIterBlockTriples.java:64)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterProcessBinding.hasNextBinding(QueryIterProcessBinding.java:60)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.execConstruct(QueryExecutionBase.java:254)
        at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.execConstruct(QueryExecutionBase.java:223)


"http-8080-9":
        at com.hp.hpl.jena.reasoner.rulesys.impl.LPTopGoalIterator.close(LPTopGoalIterator.java:169)
        - waiting to lock <0x0000000776153ff0> (a com.hp.hpl.jena.reasoner.rulesys.impl.LPTopGoalIterator)
        at com.hp.hpl.jena.reasoner.rulesys.impl.LPBRuleEngine.checkSafeToUpdate(LPBRuleEngine.java:235)
        at com.hp.hpl.jena.reasoner.rulesys.impl.LPBRuleEngine.addRule(LPBRuleEngine.java:129)
        - locked <0x0000000779070940> (a com.hp.hpl.jena.reasoner.rulesys.impl.LPBRuleEngine)
        at com.hp.hpl.jena.reasoner.rulesys.FBRuleInfGraph.addBRules(FBRuleInfGraph.java:290)
        at com.hp.hpl.jena.reasoner.rulesys.FBRuleInfGraph.preloadDeductions(FBRuleInfGraph.java:903)
        at com.hp.hpl.jena.reasoner.rulesys.FBRuleInfGraph.prepare(FBRuleInfGraph.java:483)
        - locked <0x0000000778e81270> (a com.hp.hpl.jena.reasoner.rulesys.RDFSRuleInfGraph)
        at com.hp.hpl.jena.reasoner.rulesys.FBRuleInfGraph.findWithContinuation(FBRuleInfGraph.java:574)
        at com.hp.hpl.jena.reasoner.rulesys.FBRuleInfGraph.graphBaseFind(FBRuleInfGraph.java:606)
        at com.hp.hpl.jena.graph.impl.GraphBase.find(GraphBase.java:285)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.<init>(QueryIterTriplePattern.java:80)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterTriplePattern.nextStage(QueryIterTriplePattern.java:53)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:113)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:106)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterBlockTriples.hasNextBinding(QueryIterBlockTriples.java:64)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:106)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:106)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
        at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.execConstruct(QueryExecutionBase.java:254)
        at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.execConstruct(QueryExecutionBase.java:223)

                
> Deadlock during SPARQL execution on an inference model
> ------------------------------------------------------
>
>                 Key: JENA-244
>                 URL: https://issues.apache.org/jira/browse/JENA-244
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Jena
>            Reporter: Stephen Owens
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira