You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by Stephen Mallette <sp...@gmail.com> on 2016/04/18 15:11:16 UTC

[DISCUSS] Interrupting Traversals

A while back, I brought up the issue of being able to interrupt traversals:

https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
https://issues.apache.org/jira/browse/TINKERPOP-946

As a quick refresher, making Traversal respect Thread.interrupted() is
important as you otherwise can quite easily lock up applications like
Gremlin Server with a few poorly conceived or errant queries. We'd left
that last thread with liking the idea, but there were concerns about the
complexity of the changes and performance hits.

Given that we now have gremlin-benchmark, I decided to see what the
performance hit would be for making this change. I took a rough stab at it
introducing Thread.interrupted() in all steps where it seemed to make sense
to do so and then ran the benchmark before and after the change.

https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b

I'm not seeing a whole lot of difference between supporting this feature
and not supporting this feature.  Here's the branch I implemented this in
in case you want to look around:

https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946

I'm not sure that my changes are completely bulletproof at this point, but
I'm reasonably sure that these changes would handle a good majority of
calls for thread interruption. I expect to re-target my branch at tp31
(currently from master so that i could use the benchmark suite) if this
becomes a pull request.

Any thoughts on the benchmark, the implementation, etc?

Re: [DISCUSS] Interrupting Traversals

Posted by Stephen Mallette <sp...@gmail.com>.
I have been tinkering on this here and there and now have thread
interruption working for TinkerGraphComputer and SparkGraphComputer.

https://github.com/apache/incubator-tinkerpop/compare/TINKERPOP-946

The tests for interruption unfortunately needed to be different - the same
testing mechanism didn't hold between OLAP and OLTP, but I got roughly the
same coverage.  I did have to OptOut the tests for spark though. I don't
have a reliable way to block for interruption at the right time so the
tests don't really test what they are supposed to (and typically just
fail). I could probably do it with a Thread.sleep() but i sense that will
just make the test randomly fail in different environments (like Travis or
whatever).  The tests do work perfectly for TinkerGraphComputer though.

Even though I don't have unit tests, I was able to manually test
SparkGraphComputer. I started a spark cluster, submitted my job in the
console in a thread and then killed the thread once i saw the job show up
in the spark web ui - the output is a little ugly but check this out:

gremlin> t = new Thread({
gremlin> try {
gremlin>   println g.V().out().out().toList()
gremlin> } catch (Exception ex) { ex.printStackTrace() }})
==>Thread[Thread-91,5,main]
gremlin> t.start()
==>null
gremlin> t.interrupt()
==>null
gremlin>
org.apache.tinkerpop.gremlin.process.traversal.util.TraversalInterruptedException
at
org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:80)
at
org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at
org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
at
org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.ComputerResultStep.processNextStart(ComputerResultStep.java:70)
at
org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128)
at
org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38)
at
org.apache.tinkerpop.gremlin.process.traversal.Traversal.fill(Traversal.java:146)
at
org.apache.tinkerpop.gremlin.process.traversal.Traversal.toList(Traversal.java:103)
at
org.apache.tinkerpop.gremlin.process.traversal.Traversal$toList.call(Unknown
Source)
at
org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at
org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at
org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:117)
at groovysh_evaluate$_run_closure1.doCall(groovysh_evaluate:5)
at groovysh_evaluate$_run_closure1.doCall(groovysh_evaluate)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at
org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1021)
at groovy.lang.Closure.call(Closure.java:426)
at groovy.lang.Closure.call(Closure.java:420)
at groovy.lang.Closure.run(Closure.java:507)
at java.lang.Thread.run(Thread.java:745)

We get the TraversalInterruptedException as expected and best of all I
watched the running job status in Spark web ui go to "KILLED" - so the
local interruption in the console cleaned up the job on the spark cluster!
That little bit of magic happens here:

https://github.com/apache/incubator-tinkerpop/compare/TINKERPOP-946#diff-91e2c8dbe3d9312fea7aec8291c953a8R251

Marko, perhaps you can suggest some other places in that spark code that
might have long run iterations where we might want to check for thread
interruption. I only added that one (but maybe that's the only place we
want interrupt to occur - i don't know the code well enough to say).

Another thing I'd point out, is that the GraphComputer instances now use a
single thread executor to supply the Future<ComputerResult> from submit()
(instead of CompletableFuture.supplyAsync() which used ForkJoinPool):

https://github.com/apache/incubator-tinkerpop/compare/TINKERPOP-946#diff-7598ae12d354f03e698cb5319371fff1R75

That was important as use of ForkJoinPool doesn't seem to create futures
that respect cancellation.

What else needs to happen on this ticket? I guess GiraphGraphComputer needs
similar capability somehow? can giraph jobs be cancelled? I do think that
for OLAP, interruption can be a best effort kind of thing. It may not
always be easy to support this capability in all cases.



On Wed, Apr 20, 2016 at 8:03 AM, Marko Rodriguez <ok...@gmail.com>
wrote:

> Dope.
>
> Wanna see if TinkerGraphComputer and SparkGraphComputer "just work" (i.e.
> "just fail" :) ?
>
> Marko.
>
> http://markorodriguez.com
>
> On Apr 20, 2016, at 5:59 AM, Stephen Mallette <sp...@gmail.com>
> wrote:
>
> > I refactored a bit and built parameterized tests, so we have better
> > coverage over the different variations in the steps and can easily add
> new
> > ones:
> >
> >
> https://github.com/apache/incubator-tinkerpop/blob/7a9f7de6ebed5d4293708524c772db0f0b2c3bac/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/TraversalInterruptionTest.java
> >
> > I've run the tests for TinkerGraph and Neo4j - both pass nicely.
> >
> > On Tue, Apr 19, 2016 at 7:13 PM, Stephen Mallette <sp...@gmail.com>
> > wrote:
> >
> >>> It can be done, but would require the provider implementations to
> >> change as they would need to message a "job kill."
> >>> Again, one thread is interrupted, but what about all the other threads…
> >> Its not that its "not possible," just that we will have design around
> this
> >> like for OLAP above.
> >>
> >> Agreed - should be possible - just something to think about as we move
> >> forward.
> >>
> >>> We should also add test cases to VertexStep, PropertyStep, GraphStep,
> >> etc. that use interrupt
> >>
> >> I added some more test cases to cover proper interrupt semantics for
> >> implementations. It took some thinking to come up with a way to test
> steps
> >> other than GraphStep though. To test, you have to start iteration of a
> >> Traversal in one thread and interrupt it in another. It's a bit
> imperfect
> >> because it's not immediately obvious how to ensure that my call to
> >> interrupt the thread will trigger a TraversalInterruptedException in a
> >> specific step. In other words, i can test:
> >>
> >> g.V()
> >>
> >> because that will use some variation of GraphStep and that's all there
> is,
> >> but when you have:
> >>
> >> g.V().out()
> >>
> >> I won't know for sure if if the test passes because the thread may have
> >> interrupted in GraphStep or VertexStep. Anyway, I came up with a way to
> do
> >> it with what I think is a clever use of sideEffect() to block at the
> right
> >> point and hold the traversal to force it to fail on the right step.  You
> >> can see that work here:
> >>
> >>
> >>
> https://github.com/apache/incubator-tinkerpop/commit/fd16fabd7595470bd33f30b882b1f0297d05b55a
> >>
> >> I covered VertexStep, PropertyStep, GraphStep as you suggested. I didn't
> >> do a lot of variations on them (e.g. didn't do g.E()). Any thoughts on
> how
> >> much coverage is "right" for this?
> >>
> >> as a side note, I was going to retarget the branch at tp31 but I'm
> >> starting to feel like this change is sufficiently big a feature that it
> >> should probably exist on the 3.2.x line where it can live with the
> >> benchmarks.
> >>
> >> On Mon, Apr 18, 2016 at 12:06 PM, Marko Rodriguez <okrammarko@gmail.com
> >
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>>> Well - I don't think this code is highly specialized. It's good
> general
> >>>> practice to respect Thread.interrupted(). I think you'd find that
> >>> sentiment
> >>>> in just about any java concurrency programming book.  …
> >>>
> >>> Huh, I didn't know this was a "standard pattern." If so, cool. Then
> that
> >>> solves that.
> >>>
> >>>>> 1. In OLAP, where there can be multiple threads how does this work?
> >>>>> 2. In Giraph/Spark, how does this effect job execution and failure
> >>>> responses?
> >>>>
> >>>> I ended my initial post in this thread by deleting the last paragraph
> i
> >>>> wrote about OLAP.  :)  I guess there's still some question there as to
> >>> how
> >>>> that will work. If I interrupt the thread that was executing the OLAP
> >>>> traversal, it's only going to kill it waiting for the result from
> Spark
> >>> or
> >>>> wherever.  The traversal will still be executing in the context of
> >>> spark.
> >>>> I assume the way to deal with this is on an implementation specific
> >>> basis
> >>>> where I assume there is a way to cancel a running spark job (or
> running
> >>>> giraph job or whatever). If the Traversal that waits for interrupt
> could
> >>>> signal that cancellation somehow, i guess that would be the way to
> >>>> implement that. I don't know enough about the specifics of spark for
> how
> >>>> that would work but it sounds plausible, no?
> >>>
> >>> Yea, I don't know how this would work either as it would be
> master/slave
> >>> traversals needing to coordinate. It can be done, but would require the
> >>> provider implementations to change as they would need to message a "job
> >>> kill." We could add test cases to GraphComputerTest that ensure that
> all
> >>> OLAP engines handle such interrupts correctly. We should also add test
> >>> cases to VertexStep, PropertyStep, GraphStep, etc. that use interrupt
> as
> >>> these are the steps that most providers will implement/extend and we
> need
> >>> to ensure they are doing the interruptions correctly.
> >>>
> >>>>> 3. When we move into threaded OLTP, how will this be
> >>> triggered/effected?
> >>>>
> >>>> I'm not sure how that feature will be implemented - so i'm not sure
> how
> >>> to
> >>>> comment on that.
> >>>
> >>> It would be similar to OLAP, where you have a master traversal and
> >>> slave/parallel traversals. Again, one thread is interrupted, but what
> about
> >>> all the other threads… Its not that its "not possible," just that we
> will
> >>> have design around this like for OLAP above.
> >>>
> >>> Marko.
> >>>
> >>>
> >>>> On Mon, Apr 18, 2016 at 11:28 AM, Marko Rodriguez <
> okrammarko@gmail.com
> >>>>
> >>>> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I think a problem with this is that it requires every step
> >>> implementation
> >>>>> to have this construct in it -- though many steps simply extend the
> >>> base
> >>>>> FlatMapStep, MapStep, FilterStep, etc. However, not all and thus,
> this
> >>>>> requires all providers to know what this about and write their code
> >>>>> accordingly.
> >>>>>
> >>>>> A few questions:
> >>>>>
> >>>>>       1. In OLAP, where there can be multiple threads how does this
> >>> work?
> >>>>>       2. In Giraph/Spark, how does this effect job execution and
> >>> failure
> >>>>> responses?
> >>>>>       3. When we move into threaded OLTP, how will this be
> >>>>> triggered/effected?
> >>>>>       4. This doesn't work for "infinite loop" lambdas or "hung
> >>>>> databases."
> >>>>>
> >>>>> I know this is the oldest ticket in the books and a million solutions
> >>> have
> >>>>> been proposed, but it would be nice if this didn't require
> specialized
> >>> code
> >>>>> in all the steps. We are bound to "forget."
> >>>>>
> >>>>> Thanks,
> >>>>> Marko.
> >>>>>
> >>>>> http://markorodriguez.com
> >>>>>
> >>>>> On Apr 18, 2016, at 8:56 AM, Stephen Mallette <sp...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Do you mean:
> >>>>>>
> >>>>>> if (Thread.currentThread().isInterrupted()) throw new
> >>>>>> TraversalInterruptedException();
> >>>>>>
> >>>>>> If so, Thread.interrupted() basically does that under the covers
> >>>>>>
> >>>>>> On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <tw...@gmail.com>
> >>> wrote:
> >>>>>>
> >>>>>>> Yeah, looks like benchmark-wise it's a wash, which is good.  I
> wasn't
> >>>>> aware
> >>>>>>> of the difference between the static interrupted() and non-static
> >>>>>>> isInterrupted().  I was wondering if in this case it should be
> >>>>>>> isInterrupted(), but I think how you did it is good because it'll
> be
> >>>>>>> evaluated within the traversal thread regardless.
> >>>>>>>
> >>>>>>> --Ted
> >>>>>>>
> >>>>>>> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <
> >>> spmallette@gmail.com
> >>>>>>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> A while back, I brought up the issue of being able to interrupt
> >>>>>>> traversals:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>
> >>>
> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
> >>>>>>>> https://issues.apache.org/jira/browse/TINKERPOP-946
> >>>>>>>>
> >>>>>>>> As a quick refresher, making Traversal respect
> Thread.interrupted()
> >>> is
> >>>>>>>> important as you otherwise can quite easily lock up applications
> >>> like
> >>>>>>>> Gremlin Server with a few poorly conceived or errant queries. We'd
> >>> left
> >>>>>>>> that last thread with liking the idea, but there were concerns
> about
> >>>>> the
> >>>>>>>> complexity of the changes and performance hits.
> >>>>>>>>
> >>>>>>>> Given that we now have gremlin-benchmark, I decided to see what
> the
> >>>>>>>> performance hit would be for making this change. I took a rough
> >>> stab at
> >>>>>>> it
> >>>>>>>> introducing Thread.interrupted() in all steps where it seemed to
> >>> make
> >>>>>>> sense
> >>>>>>>> to do so and then ran the benchmark before and after the change.
> >>>>>>>>
> >>>>>>>>
> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
> >>>>>>>>
> >>>>>>>> I'm not seeing a whole lot of difference between supporting this
> >>>>> feature
> >>>>>>>> and not supporting this feature.  Here's the branch I implemented
> >>> this
> >>>>> in
> >>>>>>>> in case you want to look around:
> >>>>>>>>
> >>>>>>>> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
> >>>>>>>>
> >>>>>>>> I'm not sure that my changes are completely bulletproof at this
> >>> point,
> >>>>>>> but
> >>>>>>>> I'm reasonably sure that these changes would handle a good
> majority
> >>> of
> >>>>>>>> calls for thread interruption. I expect to re-target my branch at
> >>> tp31
> >>>>>>>> (currently from master so that i could use the benchmark suite) if
> >>> this
> >>>>>>>> becomes a pull request.
> >>>>>>>>
> >>>>>>>> Any thoughts on the benchmark, the implementation, etc?
> >>>>>>>>
> >>>>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >>
>
>

Re: [DISCUSS] Interrupting Traversals

Posted by Marko Rodriguez <ok...@gmail.com>.
Dope.

Wanna see if TinkerGraphComputer and SparkGraphComputer "just work" (i.e. "just fail" :) ?

Marko.

http://markorodriguez.com

On Apr 20, 2016, at 5:59 AM, Stephen Mallette <sp...@gmail.com> wrote:

> I refactored a bit and built parameterized tests, so we have better
> coverage over the different variations in the steps and can easily add new
> ones:
> 
> https://github.com/apache/incubator-tinkerpop/blob/7a9f7de6ebed5d4293708524c772db0f0b2c3bac/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/TraversalInterruptionTest.java
> 
> I've run the tests for TinkerGraph and Neo4j - both pass nicely.
> 
> On Tue, Apr 19, 2016 at 7:13 PM, Stephen Mallette <sp...@gmail.com>
> wrote:
> 
>>> It can be done, but would require the provider implementations to
>> change as they would need to message a "job kill."
>>> Again, one thread is interrupted, but what about all the other threads…
>> Its not that its "not possible," just that we will have design around this
>> like for OLAP above.
>> 
>> Agreed - should be possible - just something to think about as we move
>> forward.
>> 
>>> We should also add test cases to VertexStep, PropertyStep, GraphStep,
>> etc. that use interrupt
>> 
>> I added some more test cases to cover proper interrupt semantics for
>> implementations. It took some thinking to come up with a way to test steps
>> other than GraphStep though. To test, you have to start iteration of a
>> Traversal in one thread and interrupt it in another. It's a bit imperfect
>> because it's not immediately obvious how to ensure that my call to
>> interrupt the thread will trigger a TraversalInterruptedException in a
>> specific step. In other words, i can test:
>> 
>> g.V()
>> 
>> because that will use some variation of GraphStep and that's all there is,
>> but when you have:
>> 
>> g.V().out()
>> 
>> I won't know for sure if if the test passes because the thread may have
>> interrupted in GraphStep or VertexStep. Anyway, I came up with a way to do
>> it with what I think is a clever use of sideEffect() to block at the right
>> point and hold the traversal to force it to fail on the right step.  You
>> can see that work here:
>> 
>> 
>> https://github.com/apache/incubator-tinkerpop/commit/fd16fabd7595470bd33f30b882b1f0297d05b55a
>> 
>> I covered VertexStep, PropertyStep, GraphStep as you suggested. I didn't
>> do a lot of variations on them (e.g. didn't do g.E()). Any thoughts on how
>> much coverage is "right" for this?
>> 
>> as a side note, I was going to retarget the branch at tp31 but I'm
>> starting to feel like this change is sufficiently big a feature that it
>> should probably exist on the 3.2.x line where it can live with the
>> benchmarks.
>> 
>> On Mon, Apr 18, 2016 at 12:06 PM, Marko Rodriguez <ok...@gmail.com>
>> wrote:
>> 
>>> Hi,
>>> 
>>>> Well - I don't think this code is highly specialized. It's good general
>>>> practice to respect Thread.interrupted(). I think you'd find that
>>> sentiment
>>>> in just about any java concurrency programming book.  …
>>> 
>>> Huh, I didn't know this was a "standard pattern." If so, cool. Then that
>>> solves that.
>>> 
>>>>> 1. In OLAP, where there can be multiple threads how does this work?
>>>>> 2. In Giraph/Spark, how does this effect job execution and failure
>>>> responses?
>>>> 
>>>> I ended my initial post in this thread by deleting the last paragraph i
>>>> wrote about OLAP.  :)  I guess there's still some question there as to
>>> how
>>>> that will work. If I interrupt the thread that was executing the OLAP
>>>> traversal, it's only going to kill it waiting for the result from Spark
>>> or
>>>> wherever.  The traversal will still be executing in the context of
>>> spark.
>>>> I assume the way to deal with this is on an implementation specific
>>> basis
>>>> where I assume there is a way to cancel a running spark job (or running
>>>> giraph job or whatever). If the Traversal that waits for interrupt could
>>>> signal that cancellation somehow, i guess that would be the way to
>>>> implement that. I don't know enough about the specifics of spark for how
>>>> that would work but it sounds plausible, no?
>>> 
>>> Yea, I don't know how this would work either as it would be master/slave
>>> traversals needing to coordinate. It can be done, but would require the
>>> provider implementations to change as they would need to message a "job
>>> kill." We could add test cases to GraphComputerTest that ensure that all
>>> OLAP engines handle such interrupts correctly. We should also add test
>>> cases to VertexStep, PropertyStep, GraphStep, etc. that use interrupt as
>>> these are the steps that most providers will implement/extend and we need
>>> to ensure they are doing the interruptions correctly.
>>> 
>>>>> 3. When we move into threaded OLTP, how will this be
>>> triggered/effected?
>>>> 
>>>> I'm not sure how that feature will be implemented - so i'm not sure how
>>> to
>>>> comment on that.
>>> 
>>> It would be similar to OLAP, where you have a master traversal and
>>> slave/parallel traversals. Again, one thread is interrupted, but what about
>>> all the other threads… Its not that its "not possible," just that we will
>>> have design around this like for OLAP above.
>>> 
>>> Marko.
>>> 
>>> 
>>>> On Mon, Apr 18, 2016 at 11:28 AM, Marko Rodriguez <okrammarko@gmail.com
>>>> 
>>>> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I think a problem with this is that it requires every step
>>> implementation
>>>>> to have this construct in it -- though many steps simply extend the
>>> base
>>>>> FlatMapStep, MapStep, FilterStep, etc. However, not all and thus, this
>>>>> requires all providers to know what this about and write their code
>>>>> accordingly.
>>>>> 
>>>>> A few questions:
>>>>> 
>>>>>       1. In OLAP, where there can be multiple threads how does this
>>> work?
>>>>>       2. In Giraph/Spark, how does this effect job execution and
>>> failure
>>>>> responses?
>>>>>       3. When we move into threaded OLTP, how will this be
>>>>> triggered/effected?
>>>>>       4. This doesn't work for "infinite loop" lambdas or "hung
>>>>> databases."
>>>>> 
>>>>> I know this is the oldest ticket in the books and a million solutions
>>> have
>>>>> been proposed, but it would be nice if this didn't require specialized
>>> code
>>>>> in all the steps. We are bound to "forget."
>>>>> 
>>>>> Thanks,
>>>>> Marko.
>>>>> 
>>>>> http://markorodriguez.com
>>>>> 
>>>>> On Apr 18, 2016, at 8:56 AM, Stephen Mallette <sp...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Do you mean:
>>>>>> 
>>>>>> if (Thread.currentThread().isInterrupted()) throw new
>>>>>> TraversalInterruptedException();
>>>>>> 
>>>>>> If so, Thread.interrupted() basically does that under the covers
>>>>>> 
>>>>>> On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <tw...@gmail.com>
>>> wrote:
>>>>>> 
>>>>>>> Yeah, looks like benchmark-wise it's a wash, which is good.  I wasn't
>>>>> aware
>>>>>>> of the difference between the static interrupted() and non-static
>>>>>>> isInterrupted().  I was wondering if in this case it should be
>>>>>>> isInterrupted(), but I think how you did it is good because it'll be
>>>>>>> evaluated within the traversal thread regardless.
>>>>>>> 
>>>>>>> --Ted
>>>>>>> 
>>>>>>> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <
>>> spmallette@gmail.com
>>>>>> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> A while back, I brought up the issue of being able to interrupt
>>>>>>> traversals:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>> 
>>> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
>>>>>>>> https://issues.apache.org/jira/browse/TINKERPOP-946
>>>>>>>> 
>>>>>>>> As a quick refresher, making Traversal respect Thread.interrupted()
>>> is
>>>>>>>> important as you otherwise can quite easily lock up applications
>>> like
>>>>>>>> Gremlin Server with a few poorly conceived or errant queries. We'd
>>> left
>>>>>>>> that last thread with liking the idea, but there were concerns about
>>>>> the
>>>>>>>> complexity of the changes and performance hits.
>>>>>>>> 
>>>>>>>> Given that we now have gremlin-benchmark, I decided to see what the
>>>>>>>> performance hit would be for making this change. I took a rough
>>> stab at
>>>>>>> it
>>>>>>>> introducing Thread.interrupted() in all steps where it seemed to
>>> make
>>>>>>> sense
>>>>>>>> to do so and then ran the benchmark before and after the change.
>>>>>>>> 
>>>>>>>> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
>>>>>>>> 
>>>>>>>> I'm not seeing a whole lot of difference between supporting this
>>>>> feature
>>>>>>>> and not supporting this feature.  Here's the branch I implemented
>>> this
>>>>> in
>>>>>>>> in case you want to look around:
>>>>>>>> 
>>>>>>>> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
>>>>>>>> 
>>>>>>>> I'm not sure that my changes are completely bulletproof at this
>>> point,
>>>>>>> but
>>>>>>>> I'm reasonably sure that these changes would handle a good majority
>>> of
>>>>>>>> calls for thread interruption. I expect to re-target my branch at
>>> tp31
>>>>>>>> (currently from master so that i could use the benchmark suite) if
>>> this
>>>>>>>> becomes a pull request.
>>>>>>>> 
>>>>>>>> Any thoughts on the benchmark, the implementation, etc?
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
>> 


Re: [DISCUSS] Interrupting Traversals

Posted by Stephen Mallette <sp...@gmail.com>.
I refactored a bit and built parameterized tests, so we have better
coverage over the different variations in the steps and can easily add new
ones:

https://github.com/apache/incubator-tinkerpop/blob/7a9f7de6ebed5d4293708524c772db0f0b2c3bac/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/TraversalInterruptionTest.java

I've run the tests for TinkerGraph and Neo4j - both pass nicely.

On Tue, Apr 19, 2016 at 7:13 PM, Stephen Mallette <sp...@gmail.com>
wrote:

> > It can be done, but would require the provider implementations to
> change as they would need to message a "job kill."
> > Again, one thread is interrupted, but what about all the other threads…
> Its not that its "not possible," just that we will have design around this
> like for OLAP above.
>
> Agreed - should be possible - just something to think about as we move
> forward.
>
> > We should also add test cases to VertexStep, PropertyStep, GraphStep,
> etc. that use interrupt
>
> I added some more test cases to cover proper interrupt semantics for
> implementations. It took some thinking to come up with a way to test steps
> other than GraphStep though. To test, you have to start iteration of a
> Traversal in one thread and interrupt it in another. It's a bit imperfect
> because it's not immediately obvious how to ensure that my call to
> interrupt the thread will trigger a TraversalInterruptedException in a
> specific step. In other words, i can test:
>
> g.V()
>
> because that will use some variation of GraphStep and that's all there is,
> but when you have:
>
> g.V().out()
>
> I won't know for sure if if the test passes because the thread may have
> interrupted in GraphStep or VertexStep. Anyway, I came up with a way to do
> it with what I think is a clever use of sideEffect() to block at the right
> point and hold the traversal to force it to fail on the right step.  You
> can see that work here:
>
>
> https://github.com/apache/incubator-tinkerpop/commit/fd16fabd7595470bd33f30b882b1f0297d05b55a
>
> I covered VertexStep, PropertyStep, GraphStep as you suggested. I didn't
> do a lot of variations on them (e.g. didn't do g.E()). Any thoughts on how
> much coverage is "right" for this?
>
> as a side note, I was going to retarget the branch at tp31 but I'm
> starting to feel like this change is sufficiently big a feature that it
> should probably exist on the 3.2.x line where it can live with the
> benchmarks.
>
> On Mon, Apr 18, 2016 at 12:06 PM, Marko Rodriguez <ok...@gmail.com>
> wrote:
>
>> Hi,
>>
>> > Well - I don't think this code is highly specialized. It's good general
>> > practice to respect Thread.interrupted(). I think you'd find that
>> sentiment
>> > in just about any java concurrency programming book.  …
>>
>> Huh, I didn't know this was a "standard pattern." If so, cool. Then that
>> solves that.
>>
>> >> 1. In OLAP, where there can be multiple threads how does this work?
>> >> 2. In Giraph/Spark, how does this effect job execution and failure
>> > responses?
>> >
>> > I ended my initial post in this thread by deleting the last paragraph i
>> > wrote about OLAP.  :)  I guess there's still some question there as to
>> how
>> > that will work. If I interrupt the thread that was executing the OLAP
>> > traversal, it's only going to kill it waiting for the result from Spark
>> or
>> > wherever.  The traversal will still be executing in the context of
>> spark.
>> > I assume the way to deal with this is on an implementation specific
>> basis
>> > where I assume there is a way to cancel a running spark job (or running
>> > giraph job or whatever). If the Traversal that waits for interrupt could
>> > signal that cancellation somehow, i guess that would be the way to
>> > implement that. I don't know enough about the specifics of spark for how
>> > that would work but it sounds plausible, no?
>>
>> Yea, I don't know how this would work either as it would be master/slave
>> traversals needing to coordinate. It can be done, but would require the
>> provider implementations to change as they would need to message a "job
>> kill." We could add test cases to GraphComputerTest that ensure that all
>> OLAP engines handle such interrupts correctly. We should also add test
>> cases to VertexStep, PropertyStep, GraphStep, etc. that use interrupt as
>> these are the steps that most providers will implement/extend and we need
>> to ensure they are doing the interruptions correctly.
>>
>> >> 3. When we move into threaded OLTP, how will this be
>> triggered/effected?
>> >
>> > I'm not sure how that feature will be implemented - so i'm not sure how
>> to
>> > comment on that.
>>
>> It would be similar to OLAP, where you have a master traversal and
>> slave/parallel traversals. Again, one thread is interrupted, but what about
>> all the other threads… Its not that its "not possible," just that we will
>> have design around this like for OLAP above.
>>
>> Marko.
>>
>>
>> > On Mon, Apr 18, 2016 at 11:28 AM, Marko Rodriguez <okrammarko@gmail.com
>> >
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> I think a problem with this is that it requires every step
>> implementation
>> >> to have this construct in it -- though many steps simply extend the
>> base
>> >> FlatMapStep, MapStep, FilterStep, etc. However, not all and thus, this
>> >> requires all providers to know what this about and write their code
>> >> accordingly.
>> >>
>> >> A few questions:
>> >>
>> >>        1. In OLAP, where there can be multiple threads how does this
>> work?
>> >>        2. In Giraph/Spark, how does this effect job execution and
>> failure
>> >> responses?
>> >>        3. When we move into threaded OLTP, how will this be
>> >> triggered/effected?
>> >>        4. This doesn't work for "infinite loop" lambdas or "hung
>> >> databases."
>> >>
>> >> I know this is the oldest ticket in the books and a million solutions
>> have
>> >> been proposed, but it would be nice if this didn't require specialized
>> code
>> >> in all the steps. We are bound to "forget."
>> >>
>> >> Thanks,
>> >> Marko.
>> >>
>> >> http://markorodriguez.com
>> >>
>> >> On Apr 18, 2016, at 8:56 AM, Stephen Mallette <sp...@gmail.com>
>> >> wrote:
>> >>
>> >>> Do you mean:
>> >>>
>> >>> if (Thread.currentThread().isInterrupted()) throw new
>> >>> TraversalInterruptedException();
>> >>>
>> >>> If so, Thread.interrupted() basically does that under the covers
>> >>>
>> >>> On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <tw...@gmail.com>
>> wrote:
>> >>>
>> >>>> Yeah, looks like benchmark-wise it's a wash, which is good.  I wasn't
>> >> aware
>> >>>> of the difference between the static interrupted() and non-static
>> >>>> isInterrupted().  I was wondering if in this case it should be
>> >>>> isInterrupted(), but I think how you did it is good because it'll be
>> >>>> evaluated within the traversal thread regardless.
>> >>>>
>> >>>> --Ted
>> >>>>
>> >>>> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <
>> spmallette@gmail.com
>> >>>
>> >>>> wrote:
>> >>>>
>> >>>>> A while back, I brought up the issue of being able to interrupt
>> >>>> traversals:
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>
>> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
>> >>>>> https://issues.apache.org/jira/browse/TINKERPOP-946
>> >>>>>
>> >>>>> As a quick refresher, making Traversal respect Thread.interrupted()
>> is
>> >>>>> important as you otherwise can quite easily lock up applications
>> like
>> >>>>> Gremlin Server with a few poorly conceived or errant queries. We'd
>> left
>> >>>>> that last thread with liking the idea, but there were concerns about
>> >> the
>> >>>>> complexity of the changes and performance hits.
>> >>>>>
>> >>>>> Given that we now have gremlin-benchmark, I decided to see what the
>> >>>>> performance hit would be for making this change. I took a rough
>> stab at
>> >>>> it
>> >>>>> introducing Thread.interrupted() in all steps where it seemed to
>> make
>> >>>> sense
>> >>>>> to do so and then ran the benchmark before and after the change.
>> >>>>>
>> >>>>> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
>> >>>>>
>> >>>>> I'm not seeing a whole lot of difference between supporting this
>> >> feature
>> >>>>> and not supporting this feature.  Here's the branch I implemented
>> this
>> >> in
>> >>>>> in case you want to look around:
>> >>>>>
>> >>>>> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
>> >>>>>
>> >>>>> I'm not sure that my changes are completely bulletproof at this
>> point,
>> >>>> but
>> >>>>> I'm reasonably sure that these changes would handle a good majority
>> of
>> >>>>> calls for thread interruption. I expect to re-target my branch at
>> tp31
>> >>>>> (currently from master so that i could use the benchmark suite) if
>> this
>> >>>>> becomes a pull request.
>> >>>>>
>> >>>>> Any thoughts on the benchmark, the implementation, etc?
>> >>>>>
>> >>>>
>> >>
>> >>
>>
>>
>

Re: [DISCUSS] Interrupting Traversals

Posted by Stephen Mallette <sp...@gmail.com>.
> It can be done, but would require the provider implementations to change
as they would need to message a "job kill."
> Again, one thread is interrupted, but what about all the other threads…
Its not that its "not possible," just that we will have design around this
like for OLAP above.

Agreed - should be possible - just something to think about as we move
forward.

> We should also add test cases to VertexStep, PropertyStep, GraphStep,
etc. that use interrupt

I added some more test cases to cover proper interrupt semantics for
implementations. It took some thinking to come up with a way to test steps
other than GraphStep though. To test, you have to start iteration of a
Traversal in one thread and interrupt it in another. It's a bit imperfect
because it's not immediately obvious how to ensure that my call to
interrupt the thread will trigger a TraversalInterruptedException in a
specific step. In other words, i can test:

g.V()

because that will use some variation of GraphStep and that's all there is,
but when you have:

g.V().out()

I won't know for sure if if the test passes because the thread may have
interrupted in GraphStep or VertexStep. Anyway, I came up with a way to do
it with what I think is a clever use of sideEffect() to block at the right
point and hold the traversal to force it to fail on the right step.  You
can see that work here:

https://github.com/apache/incubator-tinkerpop/commit/fd16fabd7595470bd33f30b882b1f0297d05b55a

I covered VertexStep, PropertyStep, GraphStep as you suggested. I didn't do
a lot of variations on them (e.g. didn't do g.E()). Any thoughts on how
much coverage is "right" for this?

as a side note, I was going to retarget the branch at tp31 but I'm starting
to feel like this change is sufficiently big a feature that it should
probably exist on the 3.2.x line where it can live with the benchmarks.

On Mon, Apr 18, 2016 at 12:06 PM, Marko Rodriguez <ok...@gmail.com>
wrote:

> Hi,
>
> > Well - I don't think this code is highly specialized. It's good general
> > practice to respect Thread.interrupted(). I think you'd find that
> sentiment
> > in just about any java concurrency programming book.  …
>
> Huh, I didn't know this was a "standard pattern." If so, cool. Then that
> solves that.
>
> >> 1. In OLAP, where there can be multiple threads how does this work?
> >> 2. In Giraph/Spark, how does this effect job execution and failure
> > responses?
> >
> > I ended my initial post in this thread by deleting the last paragraph i
> > wrote about OLAP.  :)  I guess there's still some question there as to
> how
> > that will work. If I interrupt the thread that was executing the OLAP
> > traversal, it's only going to kill it waiting for the result from Spark
> or
> > wherever.  The traversal will still be executing in the context of spark.
> > I assume the way to deal with this is on an implementation specific basis
> > where I assume there is a way to cancel a running spark job (or running
> > giraph job or whatever). If the Traversal that waits for interrupt could
> > signal that cancellation somehow, i guess that would be the way to
> > implement that. I don't know enough about the specifics of spark for how
> > that would work but it sounds plausible, no?
>
> Yea, I don't know how this would work either as it would be master/slave
> traversals needing to coordinate. It can be done, but would require the
> provider implementations to change as they would need to message a "job
> kill." We could add test cases to GraphComputerTest that ensure that all
> OLAP engines handle such interrupts correctly. We should also add test
> cases to VertexStep, PropertyStep, GraphStep, etc. that use interrupt as
> these are the steps that most providers will implement/extend and we need
> to ensure they are doing the interruptions correctly.
>
> >> 3. When we move into threaded OLTP, how will this be triggered/effected?
> >
> > I'm not sure how that feature will be implemented - so i'm not sure how
> to
> > comment on that.
>
> It would be similar to OLAP, where you have a master traversal and
> slave/parallel traversals. Again, one thread is interrupted, but what about
> all the other threads… Its not that its "not possible," just that we will
> have design around this like for OLAP above.
>
> Marko.
>
>
> > On Mon, Apr 18, 2016 at 11:28 AM, Marko Rodriguez <ok...@gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> I think a problem with this is that it requires every step
> implementation
> >> to have this construct in it -- though many steps simply extend the base
> >> FlatMapStep, MapStep, FilterStep, etc. However, not all and thus, this
> >> requires all providers to know what this about and write their code
> >> accordingly.
> >>
> >> A few questions:
> >>
> >>        1. In OLAP, where there can be multiple threads how does this
> work?
> >>        2. In Giraph/Spark, how does this effect job execution and
> failure
> >> responses?
> >>        3. When we move into threaded OLTP, how will this be
> >> triggered/effected?
> >>        4. This doesn't work for "infinite loop" lambdas or "hung
> >> databases."
> >>
> >> I know this is the oldest ticket in the books and a million solutions
> have
> >> been proposed, but it would be nice if this didn't require specialized
> code
> >> in all the steps. We are bound to "forget."
> >>
> >> Thanks,
> >> Marko.
> >>
> >> http://markorodriguez.com
> >>
> >> On Apr 18, 2016, at 8:56 AM, Stephen Mallette <sp...@gmail.com>
> >> wrote:
> >>
> >>> Do you mean:
> >>>
> >>> if (Thread.currentThread().isInterrupted()) throw new
> >>> TraversalInterruptedException();
> >>>
> >>> If so, Thread.interrupted() basically does that under the covers
> >>>
> >>> On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <tw...@gmail.com>
> wrote:
> >>>
> >>>> Yeah, looks like benchmark-wise it's a wash, which is good.  I wasn't
> >> aware
> >>>> of the difference between the static interrupted() and non-static
> >>>> isInterrupted().  I was wondering if in this case it should be
> >>>> isInterrupted(), but I think how you did it is good because it'll be
> >>>> evaluated within the traversal thread regardless.
> >>>>
> >>>> --Ted
> >>>>
> >>>> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <
> spmallette@gmail.com
> >>>
> >>>> wrote:
> >>>>
> >>>>> A while back, I brought up the issue of being able to interrupt
> >>>> traversals:
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>
> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
> >>>>> https://issues.apache.org/jira/browse/TINKERPOP-946
> >>>>>
> >>>>> As a quick refresher, making Traversal respect Thread.interrupted()
> is
> >>>>> important as you otherwise can quite easily lock up applications like
> >>>>> Gremlin Server with a few poorly conceived or errant queries. We'd
> left
> >>>>> that last thread with liking the idea, but there were concerns about
> >> the
> >>>>> complexity of the changes and performance hits.
> >>>>>
> >>>>> Given that we now have gremlin-benchmark, I decided to see what the
> >>>>> performance hit would be for making this change. I took a rough stab
> at
> >>>> it
> >>>>> introducing Thread.interrupted() in all steps where it seemed to make
> >>>> sense
> >>>>> to do so and then ran the benchmark before and after the change.
> >>>>>
> >>>>> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
> >>>>>
> >>>>> I'm not seeing a whole lot of difference between supporting this
> >> feature
> >>>>> and not supporting this feature.  Here's the branch I implemented
> this
> >> in
> >>>>> in case you want to look around:
> >>>>>
> >>>>> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
> >>>>>
> >>>>> I'm not sure that my changes are completely bulletproof at this
> point,
> >>>> but
> >>>>> I'm reasonably sure that these changes would handle a good majority
> of
> >>>>> calls for thread interruption. I expect to re-target my branch at
> tp31
> >>>>> (currently from master so that i could use the benchmark suite) if
> this
> >>>>> becomes a pull request.
> >>>>>
> >>>>> Any thoughts on the benchmark, the implementation, etc?
> >>>>>
> >>>>
> >>
> >>
>
>

Re: [DISCUSS] Interrupting Traversals

Posted by Marko Rodriguez <ok...@gmail.com>.
Hi,

> Well - I don't think this code is highly specialized. It's good general
> practice to respect Thread.interrupted(). I think you'd find that sentiment
> in just about any java concurrency programming book.  …

Huh, I didn't know this was a "standard pattern." If so, cool. Then that solves that.

>> 1. In OLAP, where there can be multiple threads how does this work?
>> 2. In Giraph/Spark, how does this effect job execution and failure
> responses?
> 
> I ended my initial post in this thread by deleting the last paragraph i
> wrote about OLAP.  :)  I guess there's still some question there as to how
> that will work. If I interrupt the thread that was executing the OLAP
> traversal, it's only going to kill it waiting for the result from Spark or
> wherever.  The traversal will still be executing in the context of spark.
> I assume the way to deal with this is on an implementation specific basis
> where I assume there is a way to cancel a running spark job (or running
> giraph job or whatever). If the Traversal that waits for interrupt could
> signal that cancellation somehow, i guess that would be the way to
> implement that. I don't know enough about the specifics of spark for how
> that would work but it sounds plausible, no?

Yea, I don't know how this would work either as it would be master/slave traversals needing to coordinate. It can be done, but would require the provider implementations to change as they would need to message a "job kill." We could add test cases to GraphComputerTest that ensure that all OLAP engines handle such interrupts correctly. We should also add test cases to VertexStep, PropertyStep, GraphStep, etc. that use interrupt as these are the steps that most providers will implement/extend and we need to ensure they are doing the interruptions correctly.

>> 3. When we move into threaded OLTP, how will this be triggered/effected?
> 
> I'm not sure how that feature will be implemented - so i'm not sure how to
> comment on that.

It would be similar to OLAP, where you have a master traversal and slave/parallel traversals. Again, one thread is interrupted, but what about all the other threads… Its not that its "not possible," just that we will have design around this like for OLAP above.

Marko.


> On Mon, Apr 18, 2016 at 11:28 AM, Marko Rodriguez <ok...@gmail.com>
> wrote:
> 
>> Hi,
>> 
>> I think a problem with this is that it requires every step implementation
>> to have this construct in it -- though many steps simply extend the base
>> FlatMapStep, MapStep, FilterStep, etc. However, not all and thus, this
>> requires all providers to know what this about and write their code
>> accordingly.
>> 
>> A few questions:
>> 
>>        1. In OLAP, where there can be multiple threads how does this work?
>>        2. In Giraph/Spark, how does this effect job execution and failure
>> responses?
>>        3. When we move into threaded OLTP, how will this be
>> triggered/effected?
>>        4. This doesn't work for "infinite loop" lambdas or "hung
>> databases."
>> 
>> I know this is the oldest ticket in the books and a million solutions have
>> been proposed, but it would be nice if this didn't require specialized code
>> in all the steps. We are bound to "forget."
>> 
>> Thanks,
>> Marko.
>> 
>> http://markorodriguez.com
>> 
>> On Apr 18, 2016, at 8:56 AM, Stephen Mallette <sp...@gmail.com>
>> wrote:
>> 
>>> Do you mean:
>>> 
>>> if (Thread.currentThread().isInterrupted()) throw new
>>> TraversalInterruptedException();
>>> 
>>> If so, Thread.interrupted() basically does that under the covers
>>> 
>>> On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <tw...@gmail.com> wrote:
>>> 
>>>> Yeah, looks like benchmark-wise it's a wash, which is good.  I wasn't
>> aware
>>>> of the difference between the static interrupted() and non-static
>>>> isInterrupted().  I was wondering if in this case it should be
>>>> isInterrupted(), but I think how you did it is good because it'll be
>>>> evaluated within the traversal thread regardless.
>>>> 
>>>> --Ted
>>>> 
>>>> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <spmallette@gmail.com
>>> 
>>>> wrote:
>>>> 
>>>>> A while back, I brought up the issue of being able to interrupt
>>>> traversals:
>>>>> 
>>>>> 
>>>>> 
>>>> 
>> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
>>>>> https://issues.apache.org/jira/browse/TINKERPOP-946
>>>>> 
>>>>> As a quick refresher, making Traversal respect Thread.interrupted() is
>>>>> important as you otherwise can quite easily lock up applications like
>>>>> Gremlin Server with a few poorly conceived or errant queries. We'd left
>>>>> that last thread with liking the idea, but there were concerns about
>> the
>>>>> complexity of the changes and performance hits.
>>>>> 
>>>>> Given that we now have gremlin-benchmark, I decided to see what the
>>>>> performance hit would be for making this change. I took a rough stab at
>>>> it
>>>>> introducing Thread.interrupted() in all steps where it seemed to make
>>>> sense
>>>>> to do so and then ran the benchmark before and after the change.
>>>>> 
>>>>> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
>>>>> 
>>>>> I'm not seeing a whole lot of difference between supporting this
>> feature
>>>>> and not supporting this feature.  Here's the branch I implemented this
>> in
>>>>> in case you want to look around:
>>>>> 
>>>>> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
>>>>> 
>>>>> I'm not sure that my changes are completely bulletproof at this point,
>>>> but
>>>>> I'm reasonably sure that these changes would handle a good majority of
>>>>> calls for thread interruption. I expect to re-target my branch at tp31
>>>>> (currently from master so that i could use the benchmark suite) if this
>>>>> becomes a pull request.
>>>>> 
>>>>> Any thoughts on the benchmark, the implementation, etc?
>>>>> 
>>>> 
>> 
>> 


Re: [DISCUSS] Interrupting Traversals

Posted by Stephen Mallette <sp...@gmail.com>.
Well - I don't think this code is highly specialized. It's good general
practice to respect Thread.interrupted(). I think you'd find that sentiment
in just about any java concurrency programming book.  So, it's not as
though we'd be asking folks to learn something that was new or a difficult
prospect to understand. I can't think of a way to make it a generalized
feature of a Traversal. If you have a while(true) or other long run loop,
checking for thread interruption in there is just a line of code that needs
to be present. Perhaps it is something where "forgetting" can only be
mitigated through code review.

> 1. In OLAP, where there can be multiple threads how does this work?
> 2. In Giraph/Spark, how does this effect job execution and failure
responses?

I ended my initial post in this thread by deleting the last paragraph i
wrote about OLAP.  :)  I guess there's still some question there as to how
that will work. If I interrupt the thread that was executing the OLAP
traversal, it's only going to kill it waiting for the result from Spark or
wherever.  The traversal will still be executing in the context of spark.
I assume the way to deal with this is on an implementation specific basis
where I assume there is a way to cancel a running spark job (or running
giraph job or whatever). If the Traversal that waits for interrupt could
signal that cancellation somehow, i guess that would be the way to
implement that. I don't know enough about the specifics of spark for how
that would work but it sounds plausible, no?

> 3. When we move into threaded OLTP, how will this be triggered/effected?

I'm not sure how that feature will be implemented - so i'm not sure how to
comment on that.

> 4. This doesn't work for "infinite loop" lambdas or "hung databases."

If someone did this:

g.V().sideEffect{while(true){}}

we can't break out of that, but I think that goes back to my initial point
that respecting Thread.interrupted() is a general practice for writing good
code, so if someone wrote a traversal like that, well, they need to accept
the consequences of it.  I guess in the case of a "hung database", it will
be the up to the nature of the "driver" connected to that database to
determine how it responds to interruption.




On Mon, Apr 18, 2016 at 11:28 AM, Marko Rodriguez <ok...@gmail.com>
wrote:

> Hi,
>
> I think a problem with this is that it requires every step implementation
> to have this construct in it -- though many steps simply extend the base
> FlatMapStep, MapStep, FilterStep, etc. However, not all and thus, this
> requires all providers to know what this about and write their code
> accordingly.
>
> A few questions:
>
>         1. In OLAP, where there can be multiple threads how does this work?
>         2. In Giraph/Spark, how does this effect job execution and failure
> responses?
>         3. When we move into threaded OLTP, how will this be
> triggered/effected?
>         4. This doesn't work for "infinite loop" lambdas or "hung
> databases."
>
> I know this is the oldest ticket in the books and a million solutions have
> been proposed, but it would be nice if this didn't require specialized code
> in all the steps. We are bound to "forget."
>
> Thanks,
> Marko.
>
> http://markorodriguez.com
>
> On Apr 18, 2016, at 8:56 AM, Stephen Mallette <sp...@gmail.com>
> wrote:
>
> > Do you mean:
> >
> > if (Thread.currentThread().isInterrupted()) throw new
> > TraversalInterruptedException();
> >
> > If so, Thread.interrupted() basically does that under the covers
> >
> > On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <tw...@gmail.com> wrote:
> >
> >> Yeah, looks like benchmark-wise it's a wash, which is good.  I wasn't
> aware
> >> of the difference between the static interrupted() and non-static
> >> isInterrupted().  I was wondering if in this case it should be
> >> isInterrupted(), but I think how you did it is good because it'll be
> >> evaluated within the traversal thread regardless.
> >>
> >> --Ted
> >>
> >> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <spmallette@gmail.com
> >
> >> wrote:
> >>
> >>> A while back, I brought up the issue of being able to interrupt
> >> traversals:
> >>>
> >>>
> >>>
> >>
> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
> >>> https://issues.apache.org/jira/browse/TINKERPOP-946
> >>>
> >>> As a quick refresher, making Traversal respect Thread.interrupted() is
> >>> important as you otherwise can quite easily lock up applications like
> >>> Gremlin Server with a few poorly conceived or errant queries. We'd left
> >>> that last thread with liking the idea, but there were concerns about
> the
> >>> complexity of the changes and performance hits.
> >>>
> >>> Given that we now have gremlin-benchmark, I decided to see what the
> >>> performance hit would be for making this change. I took a rough stab at
> >> it
> >>> introducing Thread.interrupted() in all steps where it seemed to make
> >> sense
> >>> to do so and then ran the benchmark before and after the change.
> >>>
> >>> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
> >>>
> >>> I'm not seeing a whole lot of difference between supporting this
> feature
> >>> and not supporting this feature.  Here's the branch I implemented this
> in
> >>> in case you want to look around:
> >>>
> >>> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
> >>>
> >>> I'm not sure that my changes are completely bulletproof at this point,
> >> but
> >>> I'm reasonably sure that these changes would handle a good majority of
> >>> calls for thread interruption. I expect to re-target my branch at tp31
> >>> (currently from master so that i could use the benchmark suite) if this
> >>> becomes a pull request.
> >>>
> >>> Any thoughts on the benchmark, the implementation, etc?
> >>>
> >>
>
>

Re: [DISCUSS] Interrupting Traversals

Posted by Marko Rodriguez <ok...@gmail.com>.
Hi,

I think a problem with this is that it requires every step implementation to have this construct in it -- though many steps simply extend the base FlatMapStep, MapStep, FilterStep, etc. However, not all and thus, this requires all providers to know what this about and write their code accordingly.

A few questions:

	1. In OLAP, where there can be multiple threads how does this work?
	2. In Giraph/Spark, how does this effect job execution and failure responses?
	3. When we move into threaded OLTP, how will this be triggered/effected?
	4. This doesn't work for "infinite loop" lambdas or "hung databases."

I know this is the oldest ticket in the books and a million solutions have been proposed, but it would be nice if this didn't require specialized code in all the steps. We are bound to "forget."

Thanks,
Marko.

http://markorodriguez.com

On Apr 18, 2016, at 8:56 AM, Stephen Mallette <sp...@gmail.com> wrote:

> Do you mean:
> 
> if (Thread.currentThread().isInterrupted()) throw new
> TraversalInterruptedException();
> 
> If so, Thread.interrupted() basically does that under the covers
> 
> On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <tw...@gmail.com> wrote:
> 
>> Yeah, looks like benchmark-wise it's a wash, which is good.  I wasn't aware
>> of the difference between the static interrupted() and non-static
>> isInterrupted().  I was wondering if in this case it should be
>> isInterrupted(), but I think how you did it is good because it'll be
>> evaluated within the traversal thread regardless.
>> 
>> --Ted
>> 
>> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <sp...@gmail.com>
>> wrote:
>> 
>>> A while back, I brought up the issue of being able to interrupt
>> traversals:
>>> 
>>> 
>>> 
>> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
>>> https://issues.apache.org/jira/browse/TINKERPOP-946
>>> 
>>> As a quick refresher, making Traversal respect Thread.interrupted() is
>>> important as you otherwise can quite easily lock up applications like
>>> Gremlin Server with a few poorly conceived or errant queries. We'd left
>>> that last thread with liking the idea, but there were concerns about the
>>> complexity of the changes and performance hits.
>>> 
>>> Given that we now have gremlin-benchmark, I decided to see what the
>>> performance hit would be for making this change. I took a rough stab at
>> it
>>> introducing Thread.interrupted() in all steps where it seemed to make
>> sense
>>> to do so and then ran the benchmark before and after the change.
>>> 
>>> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
>>> 
>>> I'm not seeing a whole lot of difference between supporting this feature
>>> and not supporting this feature.  Here's the branch I implemented this in
>>> in case you want to look around:
>>> 
>>> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
>>> 
>>> I'm not sure that my changes are completely bulletproof at this point,
>> but
>>> I'm reasonably sure that these changes would handle a good majority of
>>> calls for thread interruption. I expect to re-target my branch at tp31
>>> (currently from master so that i could use the benchmark suite) if this
>>> becomes a pull request.
>>> 
>>> Any thoughts on the benchmark, the implementation, etc?
>>> 
>> 


Re: [DISCUSS] Interrupting Traversals

Posted by Stephen Mallette <sp...@gmail.com>.
Do you mean:

if (Thread.currentThread().isInterrupted()) throw new
TraversalInterruptedException();

If so, Thread.interrupted() basically does that under the covers

On Mon, Apr 18, 2016 at 10:51 AM, Ted Wilmes <tw...@gmail.com> wrote:

> Yeah, looks like benchmark-wise it's a wash, which is good.  I wasn't aware
> of the difference between the static interrupted() and non-static
> isInterrupted().  I was wondering if in this case it should be
> isInterrupted(), but I think how you did it is good because it'll be
> evaluated within the traversal thread regardless.
>
> --Ted
>
> On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <sp...@gmail.com>
> wrote:
>
> > A while back, I brought up the issue of being able to interrupt
> traversals:
> >
> >
> >
> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
> > https://issues.apache.org/jira/browse/TINKERPOP-946
> >
> > As a quick refresher, making Traversal respect Thread.interrupted() is
> > important as you otherwise can quite easily lock up applications like
> > Gremlin Server with a few poorly conceived or errant queries. We'd left
> > that last thread with liking the idea, but there were concerns about the
> > complexity of the changes and performance hits.
> >
> > Given that we now have gremlin-benchmark, I decided to see what the
> > performance hit would be for making this change. I took a rough stab at
> it
> > introducing Thread.interrupted() in all steps where it seemed to make
> sense
> > to do so and then ran the benchmark before and after the change.
> >
> > https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
> >
> > I'm not seeing a whole lot of difference between supporting this feature
> > and not supporting this feature.  Here's the branch I implemented this in
> > in case you want to look around:
> >
> > https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
> >
> > I'm not sure that my changes are completely bulletproof at this point,
> but
> > I'm reasonably sure that these changes would handle a good majority of
> > calls for thread interruption. I expect to re-target my branch at tp31
> > (currently from master so that i could use the benchmark suite) if this
> > becomes a pull request.
> >
> > Any thoughts on the benchmark, the implementation, etc?
> >
>

Re: [DISCUSS] Interrupting Traversals

Posted by Ted Wilmes <tw...@gmail.com>.
Yeah, looks like benchmark-wise it's a wash, which is good.  I wasn't aware
of the difference between the static interrupted() and non-static
isInterrupted().  I was wondering if in this case it should be
isInterrupted(), but I think how you did it is good because it'll be
evaluated within the traversal thread regardless.

--Ted

On Mon, Apr 18, 2016 at 6:11 AM, Stephen Mallette <sp...@gmail.com>
wrote:

> A while back, I brought up the issue of being able to interrupt traversals:
>
>
> https://pony-poc.apache.org/thread.html/e6477fc9c58d37a5bdcb5938a0eaa285456ad15aa39e16446290e2ff@1444993523@%3Cdev.tinkerpop.apache.org%3E
> https://issues.apache.org/jira/browse/TINKERPOP-946
>
> As a quick refresher, making Traversal respect Thread.interrupted() is
> important as you otherwise can quite easily lock up applications like
> Gremlin Server with a few poorly conceived or errant queries. We'd left
> that last thread with liking the idea, but there were concerns about the
> complexity of the changes and performance hits.
>
> Given that we now have gremlin-benchmark, I decided to see what the
> performance hit would be for making this change. I took a rough stab at it
> introducing Thread.interrupted() in all steps where it seemed to make sense
> to do so and then ran the benchmark before and after the change.
>
> https://gist.github.com/spmallette/ed21267f2e7e17bb3fbd5a8d1a568d2b
>
> I'm not seeing a whole lot of difference between supporting this feature
> and not supporting this feature.  Here's the branch I implemented this in
> in case you want to look around:
>
> https://github.com/apache/incubator-tinkerpop/tree/TINKERPOP-946
>
> I'm not sure that my changes are completely bulletproof at this point, but
> I'm reasonably sure that these changes would handle a good majority of
> calls for thread interruption. I expect to re-target my branch at tp31
> (currently from master so that i could use the benchmark suite) if this
> becomes a pull request.
>
> Any thoughts on the benchmark, the implementation, etc?
>