Posted to dev@calcite.apache.org by Stamatis Zampetakis <za...@gmail.com> on 2019/10/08 10:04:17 UTC

[DISCUSS] Make Enumerable operators responsive to interrupts

Hello,

There are many use-cases which require stopping/cancelling the execution of
a query for various reasons. Currently, this can be done by launching the
query in a separate thread and then setting
DataContext.Variable.CANCEL_FLAG [1] accordingly.

However, if the thread executing the query gets interrupted through the usual
Thread.interrupt() mechanism the query execution will not stop since the
operators are not responsive to interruption.
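
A minimal sketch of the behaviour described above, using only JDK classes rather than Calcite itself. The class `CancelFlagSketch` and its `scan` method are hypothetical stand-ins for an operator loop; the one assumption carried over from Calcite is that the cancel flag is an AtomicBoolean polled by the running query. The loop reacts to the flag but never looks at the thread's interrupt status, so an interrupt has no effect on it.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class CancelFlagSketch {
  /** Hypothetical operator loop: produces up to {@code limit} rows, polling only the flag. */
  static long scan(AtomicBoolean cancelFlag, long limit) {
    long rows = 0;
    while (rows < limit) {
      if (cancelFlag.get()) {
        break; // cooperative cancellation through the flag
      }
      rows++;
    }
    return rows;
  }

  public static void main(String[] args) {
    // Interrupt the current thread before scanning: the loop does not notice.
    Thread.currentThread().interrupt();
    long rows = scan(new AtomicBoolean(false), 1_000_000L);
    // Thread.interrupted() reports the pending interrupt and clears it.
    boolean wasInterrupted = Thread.interrupted();
    System.out.println(rows + " rows, interrupt still pending: " + wasInterrupted);

    // Setting the flag, by contrast, stops the loop immediately.
    System.out.println(scan(new AtomicBoolean(true), 1_000_000L) + " rows");
  }
}
```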

How do you feel about making Enumerable operators responsive to interrupts?

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/3f54108b7dcd4d2b89fc42faab145e2f82883791/core/src/main/java/org/apache/calcite/DataContext.java#L87

Re: [DISCUSS] Make Enumerable operators responsive to interrupts

Posted by Danny Chan <yu...@gmail.com>.
I share this concern. Full interrupt support is great for suspending or cancelling a job, but it is really hard to get completely right, and it can cause a lot of confusion in code that should not need to care about interrupt signals.

Best,
Danny Chan
On Oct 17, 2019, 5:47 PM +0800, Vladimir Sitnikov <si...@gmail.com> wrote:
> Roman Elizarov raises valid points re 'interrupts are too hard (or even
> impossible) to get right':
> https://twitter.com/relizarov/status/1184460504238100480
>
> Vladimir

Re: [DISCUSS] Make Enumerable operators responsive to interrupts

Posted by Vladimir Sitnikov <si...@gmail.com>.
Roman Elizarov raises valid points re 'interrupts are too hard (or even
impossible) to get right':
https://twitter.com/relizarov/status/1184460504238100480

Vladimir

Re: [DISCUSS] Make Enumerable operators responsive to interrupts

Posted by Stamatis Zampetakis <za...@gmail.com>.
I agree with both points.

There are projects which do not handle interrupts in the best possible way.
My most recent experience was with H2 [1] where the database breaks
completely if a single thread is interrupted.

Best,
Stamatis

[1] https://github.com/h2database/h2database/issues/227

On Wed, Oct 16, 2019 at 10:10 AM Vladimir Sitnikov <
sitnikov.vladimir@gmail.com> wrote:

> Stamatis,
> "cooperative to interrupt" sounds like a nice idea; however, I have been bitten
> multiple times by improper interrupt handling (not really with Calcite, but
> with other projects).
>
> In other words, it is good when everybody supports that.
> However, the other libraries might receive unexpected
> "InterruptedException", and they might go off the rails.
>
> For example, suppose you are implementing logger.info(...). What do you do
> if you get an exception while logging?
> Do you attempt to log it again? Do you attempt to log to System.err?
> That is puzzling, and I have seen a case when the software performed 3
> attempts, then it stopped logging completely
> because it thought "the logfile is broken".
>
> So:
> 1) It might be worth adding "interrupted" checks in the executors
> 2) If Calcite ever uses .interrupt(), then it should be configurable (e.g.
> to avoid cases where .interrupt() kills code that is not well prepared for it)
>
> Vladimir
>

Re: [DISCUSS] Make Enumerable operators responsive to interrupts

Posted by Vladimir Sitnikov <si...@gmail.com>.
Stamatis,
"cooperative to interrupt" sounds like a nice idea; however, I have been bitten
multiple times by improper interrupt handling (not really with Calcite, but
with other projects).

In other words, it is good when everybody supports that.
However, the other libraries might receive unexpected
"InterruptedException", and they might go off the rails.

For example, suppose you are implementing logger.info(...). What do you do
if you get an exception while logging?
Do you attempt to log it again? Do you attempt to log to System.err?
That is puzzling, and I have seen a case when the software performed 3
attempts, then it stopped logging completely
because it thought "the logfile is broken".

So:
1) It might be worth adding "interrupted" checks in the executors
2) If Calcite ever uses .interrupt(), then it should be configurable (e.g.
to avoid cases where .interrupt() kills code that is not well prepared for it)
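
One way the two suggestions could fit together is sketched below, with JDK classes only. `InterruptPolicySketch`, its `checkInterrupts` switch, and the `checkpoint` method are hypothetical names invented for this illustration and do not correspond to any Calcite API. The check uses `isInterrupted()`, which leaves the interrupt status in place, and raises the unchecked `java.util.concurrent.CancellationException` only when the switch is on.

```java
import java.util.concurrent.CancellationException;

final class InterruptPolicySketch {
  private final boolean checkInterrupts; // hypothetical configuration switch

  InterruptPolicySketch(boolean checkInterrupts) {
    this.checkInterrupts = checkInterrupts;
  }

  /** Called by an executor between units of work; a no-op when checks are disabled. */
  void checkpoint() {
    // isInterrupted() only reads the status, it does not clear it.
    if (checkInterrupts && Thread.currentThread().isInterrupted()) {
      throw new CancellationException("execution interrupted");
    }
  }

  public static void main(String[] args) {
    Thread.currentThread().interrupt();
    new InterruptPolicySketch(false).checkpoint(); // checks disabled: interrupt is ignored
    try {
      new InterruptPolicySketch(true).checkpoint(); // checks enabled: execution is cancelled
    } catch (CancellationException e) {
      System.out.println("cancelled: " + e.getMessage());
    } finally {
      Thread.interrupted(); // clear the status so the demo exits cleanly
    }
  }
}
```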

Vladimir

Re: [DISCUSS] Make Enumerable operators responsive to interrupts

Posted by Julian Hyde <jh...@apache.org>.
I didn't realize that it was cooperative. I now see that we need to call Thread.interrupted(), and that no exceptions will be thrown (unless we call a method that declares 'throws InterruptedException').
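
A small sketch of what such a cooperative check might look like, using a plain java.util.Iterator as a stand-in for an Enumerator. The `InterruptibleIterator` wrapper is hypothetical, and throwing IllegalStateException is just one possible choice of unchecked exception. Thread.interrupted() reads and clears the status, so the sketch sets it again before throwing to keep it visible to callers.

```java
import java.util.Arrays;
import java.util.Iterator;

final class InterruptibleIterator<T> implements Iterator<T> {
  private final Iterator<T> delegate;

  InterruptibleIterator(Iterator<T> delegate) {
    this.delegate = delegate;
  }

  @Override public boolean hasNext() {
    // Nothing is thrown by the JVM here; the operator has to poll the status itself.
    if (Thread.interrupted()) {
      Thread.currentThread().interrupt(); // restore the status cleared by interrupted()
      throw new IllegalStateException("query interrupted");
    }
    return delegate.hasNext();
  }

  @Override public T next() {
    return delegate.next();
  }

  public static void main(String[] args) {
    Iterator<Integer> it = new InterruptibleIterator<>(Arrays.asList(1, 2, 3).iterator());
    System.out.println(it.next()); // first row is produced normally
    Thread.currentThread().interrupt();
    try {
      it.hasNext();
    } catch (IllegalStateException e) {
      System.out.println("stopped: " + e.getMessage());
    } finally {
      Thread.interrupted(); // clear the status so the demo exits cleanly
    }
  }
}
```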

It seems doable.

Julian


> On Oct 9, 2019, at 3:55 PM, Stamatis Zampetakis <za...@gmail.com> wrote:
> 
> By using interruption, and not forcefully stopping threads, I think we can
> avoid corrupted data-structures etc.
> 
> "Thread interruption is a cooperative mechanism. The cooperative approach
> is required because we rarely want a task, thread, or service to stop
> immediately, since that could leave shared data structures in an
> inconsistent state. Instead, tasks and services can be coded so that, when
> requested, they clean up any work currently in progress and then
> terminate." [1]
> 
> I don't think it will be easy to implement a pause-resume mechanism. I
> tried a couple of times for other use-cases but without much success.
> 
> "There is nothing in the API or language specification that ties
> interruption to any specific cancellation semantics, but in
> practice, using interruption for anything but cancellation is fragile and
> difficult to sustain in larger applications." [1]
> 
> Based on the contract of the Enumerator, calling close should release any
> resources, so if it is implemented correctly we shouldn't end up with resource
> leaks.
> 
> In my mind we should check for interrupts only at the slowest part(s) of
> the operation. I am hoping that this is not the initialization or tear down
> phase, but that remains to be verified.
> 
> Regarding tests, I think it depends on how we will implement the cancellation.
> For instance, if we decide to throw specialized exceptions then we can
> enrich them with any additional information that could tell us exactly
> where and when the interruption started. We can also inspect the stack
> trace of running threads and verify that there is no Enumerable code
> running after interruption. What is certain is that we cannot rely on the
> correctness of the results returned so far (if there are any).
> 
> Best,
> Stamatis
> 
> 
> [1] Java Concurrency in Practice by Brian Goetz
> 


Re: [DISCUSS] Make Enumerable operators responsive to interrupts

Posted by Stamatis Zampetakis <za...@gmail.com>.
By using interruption, and not forcefully stopping threads, I think we can
avoid corrupted data-structures etc.

"Thread interruption is a cooperative mechanism. The cooperative approach
is required because we rarely want a task, thread, or service to stop
immediately, since that could leave shared data structures in an
inconsistent state. Instead, tasks and services can be coded so that, when
requested, they clean up any work currently in progress and then
terminate." [1]

I don't think it will be easy to implement a pause-resume mechanism. I
tried a couple of times for other use-cases but without much success.

"There is nothing in the API or language specification that ties
interruption to any specific cancellation semantics, but in
practice, using interruption for anything but cancellation is fragile and
difficult to sustain in larger applications." [1]

Based on the contract of the Enumerator, calling close should release any
resources, so if it is implemented correctly we shouldn't end up with resource
leaks.
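
To illustrate the point about close, here is a sketch built on AutoCloseable and try-with-resources from the JDK. The `Cursor` class is a hypothetical stand-in for an Enumerator (it is not Calcite's interface), and the `closed` flag exists only so the demo can observe the cleanup. When the interrupt check throws, try-with-resources still runs close before the exception reaches the catch block.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class CloseOnInterruptSketch {
  /** Hypothetical stand-in for an Enumerator over the values 0..limit-1. */
  static final class Cursor implements AutoCloseable {
    private final AtomicBoolean closed;
    private final int limit;
    private int current = -1;

    Cursor(AtomicBoolean closed, int limit) {
      this.closed = closed;
      this.limit = limit;
    }

    boolean moveNext() {
      if (Thread.currentThread().isInterrupted()) {
        throw new IllegalStateException("interrupted");
      }
      return ++current < limit;
    }

    @Override public void close() {
      closed.set(true); // a real implementation would release its resources here
    }
  }

  /** Interrupts the current thread mid-iteration and reports whether close ran. */
  static boolean closedAfterInterrupt() {
    AtomicBoolean closed = new AtomicBoolean(false);
    try (Cursor cursor = new Cursor(closed, 10)) {
      cursor.moveNext();
      Thread.currentThread().interrupt();
      cursor.moveNext(); // throws because the interrupt is now pending
    } catch (IllegalStateException e) {
      // By the time we get here, close has already been called.
    } finally {
      Thread.interrupted(); // clear the status so callers are not affected
    }
    return closed.get();
  }

  public static void main(String[] args) {
    System.out.println("closed after interrupt: " + closedAfterInterrupt());
  }
}
```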

In my mind we should check for interrupts only at the slowest part(s) of
the operation. I am hoping that this is not the initialization or tear down
phase, but that remains to be verified.

Regarding tests, I think it depends on how we will implement the cancellation.
For instance, if we decide to throw specialized exceptions then we can
enrich them with any additional information that could tell us exactly
where and when the interruption started. We can also inspect the stack
trace of running threads and verify that there is no Enumerable code
running after interruption. What is certain is that we cannot rely on the
correctness of the results returned so far (if there are any).
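
As a rough illustration of the specialized-exception idea, the sketch below attaches the operator name and the number of rows processed to the exception, so a test can assert where the interruption was observed. Everything in it is hypothetical: `QueryInterruptedException`, its fields, and the `sum` loop are invented for this example and are not part of Calcite.

```java
final class InterruptReportSketch {
  /** Hypothetical exception carrying details about where the interrupt was observed. */
  static final class QueryInterruptedException extends RuntimeException {
    final String operator;
    final long rowsProcessed;

    QueryInterruptedException(String operator, long rowsProcessed) {
      super(operator + " interrupted after " + rowsProcessed + " rows");
      this.operator = operator;
      this.rowsProcessed = rowsProcessed;
    }
  }

  /** Sums the values, checking for an interrupt before each row. */
  static long sum(int[] values) {
    long total = 0;
    for (int i = 0; i < values.length; i++) {
      if (Thread.currentThread().isInterrupted()) {
        throw new QueryInterruptedException("sum", i);
      }
      total += values[i];
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(sum(new int[] {1, 2, 3})); // no interrupt pending: normal result
    Thread.currentThread().interrupt();
    try {
      sum(new int[] {1, 2, 3});
    } catch (QueryInterruptedException e) {
      // A test could assert on e.operator and e.rowsProcessed here.
      System.out.println(e.getMessage());
    } finally {
      Thread.interrupted(); // clear the status so the demo exits cleanly
    }
  }
}
```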

Best,
Stamatis


[1] Java Concurrency in Practice by Brian Goetz

On Tue, Oct 8, 2019 at 8:56 PM Julian Hyde <jh...@apache.org> wrote:

> Is there a possibility that data structures will be corrupted, if a thread
> is interrupted in the middle of an operation?
>
> Supposing that we allow resume, is it possible to safely resume after an
> interrupt?
>
> Supposing that we do not allow resume, and instead call close on the root
> Enumerable, is it possible to guarantee each Enumerator cleans up after
> itself?
>
> Is there a period during the lifecycle of a tree of Enumerable objects
> (e.g. initialization, tear down) where we do not allow interrupts?
>
> How would we test this?
>
> Julian
>

Re: [DISCUSS] Make Enumerable operators responsive to interrupts

Posted by Julian Hyde <jh...@apache.org>.
Is there a possibility that data structures will be corrupted, if a thread is interrupted in the middle of an operation?

Supposing that we allow resume, is it possible to safely resume after an interrupt?

Supposing that we do not allow resume, and instead call close on the root Enumerable, is it possible to guarantee each Enumerator cleans up after itself?

Is there a period during the lifecycle of a tree of Enumerable objects (e.g. initialization, tear down) where we do not allow interrupts?

How would we test this?

Julian
 

> On Oct 8, 2019, at 10:48 AM, Haisheng Yuan <h....@alibaba-inc.com> wrote:
> 
> Makes sense and is quite reasonable.
> 
> - Haisheng
> 
> ------------------------------------------------------------------
> From: Stamatis Zampetakis <za...@gmail.com>
> Date: 2019-10-08 18:04:17
> To: <de...@calcite.apache.org>
> Subject: [DISCUSS] Make Enumerable operators responsive to interrupts
> 
> Hello,
> 
> There are many use-cases which require stopping/cancelling the execution of
> a query for various reasons. Currently, this can be done by launching the
> query in a separate thread and then setting
> DataContext.Variable.CANCEL_FLAG [1] accordingly.
> 
> However, if the thread executing the query gets interrupted through the usual
> Thread.interrupt() mechanism the query execution will not stop since the
> operators are not responsive to interruption.
> 
> How do you feel about making Enumerable operators responsive to interrupts?
> 
> Best,
> Stamatis
> 
> [1]
> https://github.com/apache/calcite/blob/3f54108b7dcd4d2b89fc42faab145e2f82883791/core/src/main/java/org/apache/calcite/DataContext.java#L87
> 


Re: [DISCUSS] Make Enumerable operators responsive to interrupts

Posted by Haisheng Yuan <h....@alibaba-inc.com>.
Makes sense and is quite reasonable.

- Haisheng

------------------------------------------------------------------
From: Stamatis Zampetakis <za...@gmail.com>
Date: 2019-10-08 18:04:17
To: <de...@calcite.apache.org>
Subject: [DISCUSS] Make Enumerable operators responsive to interrupts

Hello,

There are many use-cases which require stopping/cancelling the execution of
a query for various reasons. Currently, this can be done by launching the
query in a separate thread and then setting
DataContext.Variable.CANCEL_FLAG [1] accordingly.

However, if the thread executing the query gets interrupted through the usual
Thread.interrupt() mechanism the query execution will not stop since the
operators are not responsive to interruption.

How do you feel about making Enumerable operators responsive to interrupts?

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/3f54108b7dcd4d2b89fc42faab145e2f82883791/core/src/main/java/org/apache/calcite/DataContext.java#L87