Posted to dev@accumulo.apache.org by Aaron Cordova <aa...@interllective.com> on 2012/01/30 09:33:36 UTC

decommission tablet servers

Is there a way for a user to request that the master decommission one or more tabletservers, so that the master migrates all of their tablets away without triggering a recovery and then shuts down the process, or at least refuses to assign future tablets to it? Of course, in that case there would also need to be a way to un-decommission a server.

HDFS does this for DataNodes through a list of machines in a file and a refreshNodes command. I think it'd be better to be able to decommission servers through the monitor page. Perhaps the behavior is: once a server is requested to be decommissioned, the master moves its tablets away, and then the tabletserver process kills itself or the master kills it. If a tabletserver process is started on that machine thereafter, it joins the cluster like any new server; that is, we don't refuse to let that machine rejoin the cluster.
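
For reference, the HDFS procedure mentioned above can be sketched roughly as follows. This is a minimal sketch, assuming the NameNode's dfs.hosts.exclude property names the exclude file. The file path, the DRY_RUN switch, and the decommission_datanode helper are illustrative assumptions, not anything from this thread. By default the helper only prints the refresh command instead of running it.

```shell
#!/bin/sh
# Sketch: decommission an HDFS DataNode through the exclude file.
# Assumption: EXCLUDE_FILE is the file named by dfs.hosts.exclude in the
# NameNode configuration. The default path below is hypothetical.
EXCLUDE_FILE="${EXCLUDE_FILE:-/tmp/dfs.exclude}"
# Assumption: DRY_RUN=1 (the default here) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

decommission_datanode() {
    host="$1"
    # Add the host only if it is not already listed (exact whole-line match).
    if ! grep -qxF -- "$host" "$EXCLUDE_FILE" 2>/dev/null; then
        echo "$host" >> "$EXCLUDE_FILE"
    fi
    # Ask the NameNode to re-read its include/exclude lists.
    if [ "$DRY_RUN" = "1" ]; then
        echo "hadoop dfsadmin -refreshNodes"
    else
        hadoop dfsadmin -refreshNodes
    fi
}
```

Calling decommission_datanode with a host name appends it to the exclude file and requests a refresh. To bring a node back, remove its line from the file and run refreshNodes again. Newer Hadoop releases spell the command hdfs dfsadmin -refreshNodes.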

Thoughts?

Aaron

Re: decommission tablet servers

Posted by Aaron Cordova <aa...@interllective.com>.
Or is it the case that we want, for some reason, the official 'decommission method' to be 'kill'? It seems to me that it'd be smoother to have the master migrate tablets off and avoid recovery.



Re: decommission tablet servers

Posted by Aaron Cordova <aa...@interllective.com>.
Beautiful .. I'll try that. 1.4 is awesome. 

Thanks!

On Jan 30, 2012, at 9:27 AM, John W Vines wrote:

> I believe that ./accumulo admin stop <node> safely brings down all services on a node, forcing flushes for all tablets that use that logger. I believe this behavior exists in 1.4+.
> 
> John
> 


Re: decommission tablet servers

Posted by John W Vines <jo...@ugov.gov>.
I believe that ./accumulo admin stop <node> safely brings down all services on a node, forcing flushes for all tablets that use that logger. I believe this behavior exists in 1.4+.
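
A minimal sketch of how that command might be applied to several tablet servers in turn. Only the accumulo admin stop <node> form comes from the message above. The ACCUMULO_HOME location, the DRY_RUN switch, and the stop_tservers helper are assumptions made for illustration. By default the helper only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: stop a list of nodes one at a time with "accumulo admin stop".
# Assumption: ACCUMULO_HOME points at the Accumulo install. The default
# path below is hypothetical.
ACCUMULO_HOME="${ACCUMULO_HOME:-/opt/accumulo}"
# Assumption: DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

stop_tservers() {
    for node in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "$ACCUMULO_HOME/bin/accumulo admin stop $node"
        else
            # Give up at the first failure so a problem is noticed before
            # more servers are taken down.
            "$ACCUMULO_HOME/bin/accumulo" admin stop "$node" || return 1
        fi
    done
}
```

For example, stop_tservers ts1 ts2 prints one admin stop command per host. Setting DRY_RUN=0 runs them for real.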

John


Re: decommission tablet servers

Posted by Aaron Cordova <aa...@interllective.com>.
Also - it would be good to be able to do this programmatically, and through the shell.

I would assign this ticket to myself, should it need to become a ticket, and probably target it at the 1.5 release. I saw a ticket a while back to make the monitor page more of a controller, and I think there was another to add security to the monitor page so that not just anyone could restart the cluster, etc. How is that going?

This 'decommission tablet server' feature I'm talking about could be implemented and added to the API and shell before the monitor is ready.

Aaron
