Posted to user@spark.apache.org by map reduced <k3...@gmail.com> on 2017/05/15 22:01:40 UTC

Application dies, Driver keeps on running

Hi,

Setup: Standalone cluster with 32 workers, 1 master
I am running a long-running streaming Spark job (read from Kafka -> process
-> send to HTTP endpoint) which should ideally never stop.

I have 2 questions:
1) I have sometimes seen the driver still running while the application is
marked as *Finished*. *Any idea why this happens, or any way to debug it?*
The issue shows up after the job has been running for 2-3 days (or 4-5 days;
the timeframe is random), and I am not sure what is causing it. Nothing in
the logs suggests failures or exceptions.

2) Is there a way for the driver to kill itself instead of continuing to run
without any application to drive?

Thanks,
KP
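
For context, a minimal sketch of the kind of job described above: a Kafka
direct stream whose records are processed and sent to an HTTP endpoint. The
topic, brokers, group id, batch interval, and the postToEndpoint helper are
placeholders rather than details from this thread, and it assumes the
spark-streaming-kafka-0-10 integration.

  import org.apache.kafka.common.serialization.StringDeserializer
  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}
  import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
  import org.apache.spark.streaming.kafka010.KafkaUtils
  import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

  object KafkaToHttpJob {
    // hypothetical helper; the real job would use its own HTTP client
    def postToEndpoint(payload: String): Unit = { /* POST payload */ }

    def main(args: Array[String]): Unit = {
      val conf = new SparkConf().setAppName("kafka-to-http")
      val ssc  = new StreamingContext(conf, Seconds(10)) // placeholder batch interval

      val kafkaParams = Map[String, Object](
        "bootstrap.servers"  -> "kafka:9092",            // placeholder brokers
        "key.deserializer"   -> classOf[StringDeserializer],
        "value.deserializer" -> classOf[StringDeserializer],
        "group.id"           -> "kafka-to-http",
        "auto.offset.reset"  -> "latest")

      val stream = KafkaUtils.createDirectStream[String, String](
        ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

      // process each record and send it to the HTTP endpoint
      stream.foreachRDD { rdd =>
        rdd.foreachPartition { records =>
          records.foreach(r => postToEndpoint(r.value()))
        }
      }

      ssc.start()
      ssc.awaitTermination()
    }
  }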

Re: Application dies, Driver keeps on running

Posted by map reduced <k3...@gmail.com>.
Ah, interesting. I stopped the SparkContext and called System.exit() from the
driver, and with supervise ON that seems to restart the app if it gets killed.
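
For anyone who lands here later, a minimal sketch of the stop-and-exit part,
assuming a StreamingContext named ssc as in the job sketch above. The thread
does not say how the dead application was detected, so the try/finally around
awaitTermination() is just one convenient place to hang the shutdown; only the
stop-plus-System.exit() pattern and the supervise flag come from this thread.

  // Make sure the driver JVM actually exits when the streaming application
  // stops, instead of lingering with nothing to drive.
  try {
    ssc.start()
    ssc.awaitTermination()
  } finally {
    // stopSparkContext = true tears down the SparkContext as well
    ssc.stop(stopSparkContext = true, stopGracefully = false)
    // exit non-zero so a supervised driver counts as failed and is restarted
    System.exit(1)
  }

Note that --supervise only applies when the cluster manages the driver, i.e.
cluster deploy mode; the matching spark-submit invocation is sketched under
Shixiong's reply further down.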


Re: Application dies, Driver keeps on running

Posted by map reduced <k3...@gmail.com>.
Hi,
I was looking in the wrong place for the logs; yes, I do see some errors:

"Remote RPC client disassociated. Likely due to containers exceeding
thresholds, or network issues. Check driver logs for WARN messages."

logger="org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend",message="Disconnected
from Spark cluster! Waiting for reconnection..."

So what is the best way to deal with this situation? I would rather have the
driver killed along with the application; is there a way to achieve that?



Re: Application dies, Driver keeps on running

Posted by "Shixiong(Ryan) Zhu" <sh...@databricks.com>.
So you are using `client` mode, right? If so, the Spark cluster doesn't manage
the driver for you. Did you see any error logs in the driver?
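
For reference, the difference is in how the driver is launched; the master
URL, class, and jar below are placeholders. In client mode the driver runs
inside the process that invoked spark-submit, so the standalone master cannot
restart it. In cluster mode the driver runs on a worker, and --supervise asks
the master to restart it on failure:

  # client mode (the default): driver runs in the submitting process
  spark-submit --master spark://master:7077 --deploy-mode client \
    --class com.example.KafkaToHttpJob app.jar

  # cluster mode with supervision: the master launches the driver on a worker
  # and restarts it if it exits abnormally
  spark-submit --master spark://master:7077 --deploy-mode cluster --supervise \
    --class com.example.KafkaToHttpJob app.jar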
