You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by yunfan123 <yu...@foxmail.com> on 2017/09/29 05:30:32 UTC

How flink monitor source stream task(Time Trigger) is running?

In my understanding, flink just use task heartbeat to monitor taskManager is
running.
If source stream (Time Trigger for XXX)thread is crash, it seems flink can't
recovery from this state?



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: How flink monitor source stream task(Time Trigger) is running?

Posted by Aljoscha Krettek <al...@apache.org>.
I think this might not actually be resolved. What YunFan was referring to in the initial mail is the Thread factory that is used for the processing-time service: https://github.com/apache/flink/blob/5af463a9c0ff62603bc342a78dfd5483d834e8a7/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L223 <https://github.com/apache/flink/blob/5af463a9c0ff62603bc342a78dfd5483d834e8a7/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L223>

How likely is it that a ScheduledThreadPoolExecutor simply fails? I don't think we currently have a mechanism that checks whether this service is still alive and would actually start scheduled tasks.

Best,
Aljoscha


> On 4. Oct 2017, at 09:27, Piotr Nowojski <pi...@data-artisans.com> wrote:
> 
> You are welcome :)
> 
> Piotrek
> 
>> On Oct 2, 2017, at 1:19 PM, yunfan123 <yu...@foxmail.com> wrote:
>> 
>> Thank you. 
>> "If SourceFunction.run methods returns without an exception Flink assumes
>> that it has cleanly shutdown and that there were simply no more elements to
>> collect/create by this task. "
>> This sentence solve my confusion.
>> 
>> 
>> 
>> 
>> --
>> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
> 


Re: How flink monitor source stream task(Time Trigger) is running?

Posted by Piotr Nowojski <pi...@data-artisans.com>.
You are welcome :)

Piotrek

> On Oct 2, 2017, at 1:19 PM, yunfan123 <yu...@foxmail.com> wrote:
> 
> Thank you. 
> "If SourceFunction.run methods returns without an exception Flink assumes
> that it has cleanly shutdown and that there were simply no more elements to
> collect/create by this task. "
> This sentence solve my confusion.
> 
> 
> 
> 
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: How flink monitor source stream task(Time Trigger) is running?

Posted by yunfan123 <yu...@foxmail.com>.
Thank you. 
"If SourceFunction.run methods returns without an exception Flink assumes
that it has cleanly shutdown and that there were simply no more elements to
collect/create by this task. "
This sentence solve my confusion.




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: How flink monitor source stream task(Time Trigger) is running?

Posted by Piotr Nowojski <pi...@data-artisans.com>.
I am still not sure what do you mean by “thread crash without throw”.

If SourceFunction.run methods returns without an exception Flink assumes that it has cleanly shutdown and that there were simply no more elements to collect/create by this task. 
If it continue working, without throwing an exception, but it is in some corrupted state, then there is no way for Flink to know that anything has broken. 
If it crash with some segfault, whole TaskManager will crash and that should be detected by Akka.

Piotrek

> On Sep 29, 2017, at 3:05 PM, yunfan123 <yu...@foxmail.com> wrote:
> 
> So my question is if this thread crash without throw any Exception.
> It seems flink can't handle this state.
> 
> 
> 
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: How flink monitor source stream task(Time Trigger) is running?

Posted by yunfan123 <yu...@foxmail.com>.
So my question is if this thread crash without throw any Exception.
It seems flink can't handle this state.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: How flink monitor source stream task(Time Trigger) is running?

Posted by Piotr Nowojski <pi...@data-artisans.com>.
Any exception thrown by your SourceFunction will be caught by Flink and that will mark a task (that was executing this SourceFuntion) as failed.

If you started some custom threads in your SourceFunction, you have to manually propagate their exceptions to the SourceFunction.

Piotrek

> On Sep 29, 2017, at 2:09 PM, yunfan123 <yu...@foxmail.com> wrote:
> 
> My source stream means the funciton implement the
> org.apache.flink.streaming.api.functions.source.SourceFunction.
> My question is how flink know all working thread is alive?
> If one working thread that execute the SourceFunction crash, how flink know
> this happenned? 
> 
> 
> 
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: How flink monitor source stream task(Time Trigger) is running?

Posted by yunfan123 <yu...@foxmail.com>.
My source stream means the funciton implement the
org.apache.flink.streaming.api.functions.source.SourceFunction.
My question is how flink know all working thread is alive?
If one working thread that execute the SourceFunction crash, how flink know
this happenned? 



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: How flink monitor source stream task(Time Trigger) is running?

Posted by Piotr Nowojski <pi...@data-artisans.com>.
We use Akka's DeathWatch mechanism to detect dead components.

TaskManager failure shouldn’t prevent recovering from state (as long as there are enough task slots).

I’m not sure if I understand what you mean by "source stream thread" crash. If is was some error during performing a checkpoint so that it didn’t complete, Flink will not be able to recover from such incomplete checkpoint.

Could you share us the logs with your issue?

Thanks, Piotrek

> On Sep 29, 2017, at 7:30 AM, yunfan123 <yu...@foxmail.com> wrote:
> 
> In my understanding, flink just use task heartbeat to monitor taskManager is
> running.
> If source stream (Time Trigger for XXX)thread is crash, it seems flink can't
> recovery from this state?
> 
> 
> 
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/