Posted to user@spark.apache.org by Ashic Mahtab <as...@live.com> on 2014/11/05 00:00:42 UTC
Workers not registering after master restart
Hi,
I've set up a standalone Spark master (no failover or file recovery specified), and brought up a few worker nodes. All of them registered and were shown in the master web UI. I then stopped and started the master service (the workers were still running). After the master started up, I checked the web UI and none of the workers were registered. I then stopped and started each worker and they registered with the master again.

My question is: is this expected? Is there a timeout after which the workers would have rejoined the master? Or is the only way to ensure workers rejoin to run master failover or file-based recovery for the master?

Thanks,
Ashic.
RE: Workers not registering after master restart
Posted by Ashic Mahtab <as...@live.com>.
Hi Nan,
Cool. Thanks.
Regards,
Ashic.
Date: Tue, 4 Nov 2014 18:26:48 -0500
From: zhunanmcgill@gmail.com
To: ashic@live.com
CC: user@spark.apache.org
Subject: Re: Workers not registering after master restart
Hi, Ashic,
This is expected in the latest released version.
However, workers should be able to re-register starting with 1.2, now that this patch https://github.com/apache/spark/pull/2828 has been merged.
Best,
-- Nan Zhu
Re: Workers not registering after master restart
Posted by Nan Zhu <zh...@gmail.com>.
Hi, Ashic,
This is expected in the latest released version.
However, workers should be able to re-register starting with 1.2, now that this patch https://github.com/apache/spark/pull/2828 has been merged.
Best,
--
Nan Zhu
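
For releases before 1.2, the file-based recovery mentioned in the original question is one possible workaround. The fragment below is only a rough sketch of what the master's conf/spark-env.sh might contain, assuming the standalone recovery properties spark.deploy.recoveryMode and spark.deploy.recoveryDirectory; the directory path is hypothetical and not taken from this thread, and the behavior should be checked against the documentation for the Spark version in use.

```shell
# conf/spark-env.sh on the master host (sketch only).
# RECOVERY_DIR is a hypothetical path; pick a directory the master can write to.
RECOVERY_DIR="/var/lib/spark/recovery"

# Ask the standalone master to persist worker/application registration state
# to the local filesystem so it can be recovered after a master restart.
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=FILESYSTEM"
SPARK_DAEMON_JAVA_OPTS="$SPARK_DAEMON_JAVA_OPTS -Dspark.deploy.recoveryDirectory=$RECOVERY_DIR"
export SPARK_DAEMON_JAVA_OPTS
```

The master would need to be restarted after this change, and workers that registered before recovery was enabled may still need one manual restart, since their state was never written to the recovery directory.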