You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jena.apache.org by "Mark Buquor (JIRA)" <ji...@apache.org> on 2012/08/03 23:03:02 UTC

[jira] [Created] (JENA-289) Respect query timeouts in TDB implementation

Mark Buquor created JENA-289:
--------------------------------

             Summary: Respect query timeouts in TDB implementation
                 Key: JENA-289
                 URL: https://issues.apache.org/jira/browse/JENA-289
             Project: Apache Jena
          Issue Type: Improvement
          Components: TDB
    Affects Versions: TDB 0.9.1
            Reporter: Mark Buquor


In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.


Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.

Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.HashSet.<init>(HashSet.java:86)
    at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
    at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
    at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
    at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
    at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
    at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
    at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
    at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
    at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
    at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
    at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
    at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
    at TestTimout.main(TestTimout.java:84)
Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Simon Helsen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454932#comment-13454932 ] 

Simon Helsen commented on JENA-289:
-----------------------------------

Discussed it offline with Mark. The real problem is that we may not be able to catch all dangerous queries during testing and when we hit this issue, it brings the entire system down. I agree that the severity should be raised
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>            Priority: Critical
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Comment Edited] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480798#comment-13480798 ] 

Andy Seaborne edited comment on JENA-289 at 10/20/12 7:29 PM:
--------------------------------------------------------------

Would it be possible to attach a profiler to an execution of this query and see where the time is going? I don't think it is the same same situation as your original example of { ?a ?b ?c . ?c ?d ?e }.

It does not seem to get much, if any optimization, so it may be that the code doing naive joins is not checking the timeout situation when it should. However, the reason isn't immediately obvious just by looking at the code so it might be elsewhere.

                
      was (Author: andy.seaborne):
    Would it be possible to attach a profiler to an execution of this query and see where the time is going? I don't hink it is the same same situation as your original example of { ?a ?b ?c . ?c ?d ?e }.

It does not seem to get much, if any optimization, so it may be that the code doing naive joins is not checking the timeout situation when it should. However, the reason isn't immediately obvious just by looking at the code so it might be elsewhere.

                  
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480798#comment-13480798 ] 

Andy Seaborne commented on JENA-289:
------------------------------------

Would it be possible to attach a profiler to an execution of this query and see where the time is going? I don't hink it is the same same situation as your original example of { ?a ?b ?c . ?c ?d ?e }.

It does not seem to get much, if any optimization, so it may be that the code doing naive joins is not checking the timeout situation when it should. However, the reason isn't immediately obvious just by looking at the code so it might be elsewhere.

                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Simon Helsen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13469678#comment-13469678 ] 

Simon Helsen commented on JENA-289:
-----------------------------------

Andy, I am not sure I understand what you mean by "crude workaround". 
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Mark Buquor (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Buquor updated JENA-289:
-----------------------------

    Priority: Critical  (was: Major)
    
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>            Priority: Critical
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Simon Helsen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464029#comment-13464029 ] 

Simon Helsen commented on JENA-289:
-----------------------------------

The main problem we face is that may cause a production system to go on its knees. We recently had such an instance because of a mistake in how a query was written. Without a reliable abort we put ourselves at great risk. 
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>            Priority: Critical
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429395#comment-13429395 ] 

Andy Seaborne commented on JENA-289:
------------------------------------

Thanks - test placed in TDB test collection (T_TimeoutUnionGraph)

What seems to be happening is that the query does not generate any results for a very long time (the whole DB - ?a ?b ?c), which means the testing for timeouts done in the QueryIterator hierarchy is not triggered.

The OOME is a separate issue - it's because the union default graph needs to suppress duplicates to retain RDF semantics.  It does not exploit index characteristics when it could do - indexes will return items in a known order so that duplicate suppression need only occur on sections of the index, not all of the access.
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479165#comment-13479165 ] 

Andy Seaborne commented on JENA-289:
------------------------------------

What is the query?

The one at the start of this JIRA is not respecting timeouts for a different reason.  It is compounded by the use of the defaultUnion graph.

While inside the code to solve a BGP over a TDB dataset, TDB does not currently check for timeout during BGP - this needs fixing but carefully otherwise the cost of checking may become significant.  It is a criticial inner loop of the program.

TDB does not work over Graphs and Nodes at this point - it works on NodeIds - so the code is not directly from QueryIterator.  It is converted by encapsulating the iterator of BindingNodeId.

See SolverLib and StageMatchTuple

This report is captured in T_TimeoutUnionGraph.

queryIterator is set in QueryExecutionBase before any pattern evaluation starts - the call to .iterator in startQueryIterator is of small cost in TDB.  In SDB, because of typical JDBC settings, it is not.

                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Mark Buquor (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454928#comment-13454928 ] 

Mark Buquor commented on JENA-289:
----------------------------------

After further internal discussion, we've determined that this is a critical issue for us. While it's possible to fix pathological queries after the fact, the frequency of cases and their impact on user experience is greater than we anticipated.
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>            Priority: Critical
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne updated JENA-289:
-------------------------------

    Priority: Major  (was: Critical)
    
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Simon Helsen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13469701#comment-13469701 ] 

Simon Helsen commented on JENA-289:
-----------------------------------

ok, I understand now given the comments on the jena dev mailing list. The crude workaround is that in 2.7.4, killing the process will not (likely) cause data integrity problems
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428637#comment-13428637 ] 

Andy Seaborne commented on JENA-289:
------------------------------------

The file TestTimeout does not have the software grant to Apache.  Therefore, while the project can look at it, we can not incorporate it into our test suite. Would it possible to grant the file?  (Ideally, with a copyright statement in the file as well.)  

See for example, PerformanceRegressionTest.java in JENA-256.

                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Simon Helsen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479126#comment-13479126 ] 

Simon Helsen commented on JENA-289:
-----------------------------------

actually, we dug deeper into the trouble query and it appears that in QueryExecutionBase, the queryIterator remains null for a very long time, which is essentially what this bug is addressing. The query is nasty and is returning too many results, so I told them this needs to be fixed. Now, I am trying to understand the code better. Obviously, plan creation is fast, by does it take so long to get the iterator object first? looking at the code path, I can't see where the time goes
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Simon Helsen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452022#comment-13452022 ] 

Simon Helsen commented on JENA-289:
-----------------------------------

We also observed this issue in a query whose execution plan led to a pathological explosion of the search space (on a very large repository). The query was not aborting after 10 hours or something. In practice however, these are non-issues since our clients, when writing such queries can be instructed to change the query to avoid pathological execution plans.

@Jena committers: although it would be great to get a fix for this issue, it is not critical to have it in a new service release (2.7.4). I am making this statement on behalf of the reporter of this defect as well (we clarified this in private communication). [~mark.buquor@us.ibm.com] FYI
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (JENA-289) Respect query timeouts in TDB implementation

Posted by "David Liman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Liman updated JENA-289:
-----------------------------

    Attachment: TestTimeout.java

granted the license to Apache.
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464012#comment-13464012 ] 

Andy Seaborne commented on JENA-289:
------------------------------------

This does not corrupt data so it is hard to see it as "critical", above many other incoming JIRA.
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>            Priority: Critical
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464755#comment-13464755 ] 

Andy Seaborne commented on JENA-289:
------------------------------------

So 2.7.4 will provide a crude workaround for now and this feature would be in 2.7.4 at the earliest.  Re-rated as "major".
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>            Priority: Critical
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Simon Helsen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452355#comment-13452355 ] 

Simon Helsen commented on JENA-289:
-----------------------------------

Incidentally, we hit this issue again with another client. They just observed the OOME (They had 3,5Gb of data sitting in BindingTDB) and after deeper analysis found that a query was doing something really bad which caused the pathological explosion of the search space. Unlike the other client, they ran into the OME before realizing the query was never aborted.

but here as well, once can fix the query to bypass the problem
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Mark Buquor (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Buquor updated JENA-289:
-----------------------------

    Attachment: TestTimeout.java
    
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Simon Helsen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478239#comment-13478239 ] 

Simon Helsen commented on JENA-289:
-----------------------------------

I have a question about the abort and the problem reported here. Right now, we are looking into a workaround that if a bad query occurs and abort is requested, we just keep requesting the abort (like every so many seconds) until the abort works. The reason is that from the explanation above, it seems that the abort only doesn't catch until the first result is returned (i.e. the iterator starts). Now, we have a instance of an expensive query which we try to abort after 30 seconds. That doesn't seem to work, so I keep calling QueryExecution.abort() every 5 seconds. Yet, the query keeps going and after 16 minutes, we hit the OME because the journal is exploding.  

I can't seem to find in the code why ARQ would not keep trying to abort if it is requested more than once. We will try to examine where the query is hanging during these 16 minutes, but my question was mostly to try to understand if our strategy to keep requesting abort() on the QueryExecution (in a separate thread of course), would eventually have to work
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Simon Helsen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479184#comment-13479184 ] 

Simon Helsen commented on JENA-289:
-----------------------------------

We execute all our queries with defaultUnionGraph=true. The query in trouble is quite ugly and has the following form, but I had to obfuscate it:

PREFIX pr1: <http://host/context1>  
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX pr2: <http://host/context2> 
PREFIX pr3: <http://host/context3> 
PREFIX pr4: <http://host/context4> 
SELECT DISTINCT ?R1 ?R1_resourceContext ?R1_v1 ?R1_v2 ?R1_v3 ?R1_v4 ?R1_v5 ?R1_v6 ?R1_v7 ?R1_v8 ?R1_v9 ?R1_v10 ?R1_v11
WHERE
{ { ?R1 pr4:pred1 <https://host/context5/_SmApctudEeGqa4VgFDq7yw#Text>  }
OPTIONAL
{ ?R1 pr2:pred2 ?R1_v7 }
OPTIONAL
{ ?R1 pr1:pred3 ?R1_v10 }
OPTIONAL
{ ?R1 pr1:pred4 ?R1_v6 }
OPTIONAL
{ ?R1 pr1:pred5 ?R1_v11 }
OPTIONAL
{ ?R1 pr1:pred6 ?R1_v8 }
OPTIONAL
{ ?R1 pr3:pred7 ?R1_v12 }
FILTER ( ! bound(?R1_v12) )
OPTIONAL
{ ?R1 pr3:pred8 ?R1_v1 .
?R1_v1 pr1:pred6 ?R1_uv2
}
OPTIONAL
{ ?R1 pr3:pred9 ?R1_v2 }
OPTIONAL
{ { SELECT ?R1 (count(?b) AS ?R1_v9)
WHERE
{ ?R1 pr4:pred1 <https://host/context5/_SmApctudEeGqa4VgFDq7yw#Text> .
?b pr3:pred10 ?R1
}
GROUP BY ?R1
OFFSET 35640
LIMIT 20
}
}
OPTIONAL
{ ?R1 pr3:pred11 ?R1_v5 }
OPTIONAL
{ ?R1 pr3:pred12 ?R1_v4 }
OPTIONAL
{ ?R1 pr4:pred1 ?R1_v3 .
?R1_v3 rdf:value ?R1_uv1
}
?R1 rdf:type pr3:pred13 .
?R1 pr2:pred14 ?R1_resourceContext
}
OFFSET 35640
LIMIT 20

Note: this comes from our client and they acknowledge it is bad. But as we noted elsewhere in the issue, one bad query from one client can bring the server on its knees and one of the main use-cases for abort is to prevent a rogue query from continuing. Note that we also run into OMEs with a very large heap. This is either caused by the same reason you mention in an earlier comment (the union default graph needs to suppress duplicates) or, because this is executed in a transaction, all subsequent write activity is kept in memory eventually exploding it
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Comment Edited] (JENA-289) Respect query timeouts in TDB implementation

Posted by "Simon Helsen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464029#comment-13464029 ] 

Simon Helsen edited comment on JENA-289 at 9/27/12 5:48 AM:
------------------------------------------------------------

The main problem we face is that it may cause a production system to go on its knees. We recently had such an instance because of a mistake in how a query was written. Without a reliable abort we put ourselves at great risk. And I should add that during our testing efforst, we are encountering countless queries which run into this situation and refuse to abort. These queries can usually be fixed or rewritten, but it seems remarkably easy for our clients to produce such queries which require the server to be killed. The latter action then often causes corruptions because Jena 2.7.1 is not good at handling abrupt process kills. And since there is no 2.7.4 release with all these fixes...
                
      was (Author: shelsen):
    The main problem we face is that may cause a production system to go on its knees. We recently had such an instance because of a mistake in how a query was written. Without a reliable abort we put ourselves at great risk. 
                  
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>            Priority: Critical
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException several seconds/minutes after the expected timeout. This is acceptable to a degree, but it appears that there are cases where a rogue query could execute unmitigated. The attached testcase is an example of a query that will execute and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira