Posted to user@spark.apache.org by Ashic Mahtab <as...@live.com> on 2014/11/05 00:00:42 UTC

Workers not registering after master restart

Hi,

I've set up a standalone Spark master (no failover or file recovery specified) and brought up a few worker nodes. All of them registered and were shown in the master web UI. I then stopped and started the master service while the workers were still running. After the master started up, I checked the web UI and none of the workers were registered. I then stopped and started each worker, and they registered with the master again.

My question is: is this expected? Is there a timeout after which the workers would have rejoined the master? Or is the only way to ensure that workers rejoin to run master failover or file-based recovery for the master?

Thanks,
Ashic.

RE: Workers not registering after master restart

Posted by Ashic Mahtab <as...@live.com>.
Hi Nan,

Cool. Thanks.

Regards,
Ashic.

Date: Tue, 4 Nov 2014 18:26:48 -0500
From: zhunanmcgill@gmail.com
To: ashic@live.com
CC: user@spark.apache.org
Subject: Re: Workers not registering after master restart

Hi, Ashic,

this is expected for the latest released version

However, workers should be able to re-register since 1.2, since this patch https://github.com/apache/spark/pull/2828 was merged

Best,

-- Nan Zhu

On Tuesday, November 4, 2014 at 6:00 PM, Ashic Mahtab wrote:

> Hi,
> I've set up a standalone Spark master (no failover or file recovery specified), and brought up a few worker nodes. All of them registered and were shown in the master web UI. I then stopped and started the master service (the workers were still running). After the master started up, I checked the web UI and none of the workers were registered. I then stopped and started each worker and they registered with the master again.
>
> My question is - is this expected? Is there a timeout after which the worker would have rejoined the master? Or is the only way to ensure workers rejoin is to run master failover or file based recovery for the master?
>
> Thanks,
> Ashic.

Re: Workers not registering after master restart

Posted by Nan Zhu <zh...@gmail.com>.
Hi, Ashic, 

This is expected in the latest released version.

However, workers should be able to re-register as of 1.2, now that this patch has been merged: https://github.com/apache/spark/pull/2828

Best, 

-- 
Nan Zhu
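
For anyone who wants to check worker registration after a master restart without opening the web UI, here is a minimal Python sketch. It assumes the standalone master's web UI also serves a JSON status page at <master-ui>/json containing a "workers" list whose entries carry "id" and "state" fields. Those names and the URL are assumptions to verify against your Spark version, and the function names are purely illustrative.

```python
import json
from urllib.request import urlopen


def alive_workers(master_state):
    """Return the ids of workers the master reports as ALIVE.

    `master_state` is the parsed JSON status document. The "workers",
    "id" and "state" keys are assumptions about its layout.
    """
    return [
        worker.get("id", "<unknown>")
        for worker in master_state.get("workers", [])
        if worker.get("state") == "ALIVE"
    ]


def fetch_master_state(base_url, timeout=5.0):
    """Fetch and parse the master's JSON status page.

    Assumes the page lives at <base_url>/json, e.g. a hypothetical
    base_url of "http://localhost:8080".
    """
    with urlopen(base_url.rstrip("/") + "/json", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling alive_workers(fetch_master_state("http://localhost:8080")) shortly after restarting the master would show whether any workers have re-registered; an empty list matches the behaviour described in the original question.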


On Tuesday, November 4, 2014 at 6:00 PM, Ashic Mahtab wrote:

> Hi,
> I've set up a standalone Spark master (no failover or file recovery specified), and brought up a few worker nodes. All of them registered and were shown in the master web UI. I then stopped and started the master service (the workers were still running). After the master started up, I checked the web UI and none of the workers were registered. I then stopped and started each worker and they registered with the master again.
> 
> My question is - is this expected? Is there a timeout after which the worker would have rejoined the master? Or is the only way to ensure workers rejoin is to run master failover  or file based recovery for the master?
> 
> Thanks,
> Ashic.