Posted to user@accumulo.apache.org by Shailesh Ligade <SL...@FBI.GOV> on 2021/08/17 12:52:10 UTC

how to decommission tablet server

Hello,

I am using Accumulo 1.10 and want to remove a few tablet servers.

I saw in the documentation that I need to run

accumulo admin stop <tserver>:9997

That command comes back quickly; I am not sure how long, if at all, I have to wait before I stop the tserver service. When is the right time to stop the datanode service (running on the same tablet server)? And when should I update the slaves files (for Accumulo and HDFS)?

Any guidelines on this?

Thanks

-S




RE: [EXTERNAL EMAIL] - Re: [External] RE: how to decommission tablet server

Posted by de...@etcoleman.com.
If the admin stop fails to stop the service you will need to either kill the process or stop it via the Linux service.

 

The hosts file can be modified either before or after. If you modify it before, remember that commands like admin start-all and stop-all will not know about those nodes. Likewise, if you modify it after, those commands may take actions, such as starting the tserver again, that you’d rather not happen.
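
In Accumulo 1.x the tserver host list lives in conf/slaves. A minimal sketch of the edit, assuming a 1.x layout; the hostname and file paths are placeholders, and the HDFS file name depends on your own dfs.hosts / slaves configuration:

```shell
# Remove a decommissioned host from Accumulo's slaves file
# (hostname and $ACCUMULO_HOME are assumptions)
sed -i '/^tserver1\.example\.com$/d' "$ACCUMULO_HOME/conf/slaves"

# HDFS keeps its own worker list; drop the host there as well
# ($HADOOP_CONF_DIR and the file name are assumptions)
sed -i '/^tserver1\.example\.com$/d' "$HADOOP_CONF_DIR/slaves"
```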

 

From: Shailesh Ligade <SL...@FBI.GOV> 
Sent: Wednesday, August 18, 2021 7:53 AM
To: user@accumulo.apache.org
Subject: RE: [EXTERNAL EMAIL] - Re: [External] RE: how to decommission tablet server

 

Thanks

 

So in reality, if I issue admin stop on the tserver (with -f if needed), I don’t need to stop the Linux service, right?

 

Also, when is it safe to update the slaves file? Can I wait until I have decommissioned all my nodes, or do I need to do that after each node is decommissioned?

 

Appreciated

 

-S

 

 

From: Mike Miller <mmiller@apache.org>
Sent: Wednesday, August 18, 2021 7:47 AM
To: user@accumulo.apache.org
Subject: [EXTERNAL EMAIL] - Re: [External] RE: how to decommission tablet server

 

The admin stop command issues a graceful shutdown to Accumulo for that tserver. There is a force option you could try ({"-f", "--force"}) that will also remove the lock. But both are more graceful than a Linux kill -9, which you may have to use if the admin command doesn't kill the process entirely.
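
The escalation described above can be sketched as follows; the hostname is a placeholder, and the pgrep pattern is a guess at the 1.x launcher's command line:

```shell
# 1. Graceful: unload tablets, then shut the tserver down
accumulo admin stop tserver1.example.com:9997

# 2. Force: also removes the tserver's ZooKeeper lock
accumulo admin stop -f tserver1.example.com:9997

# 3. Last resort: kill the JVM directly on the host
#    (pattern is an assumption; verify with 'pgrep -af' first)
ssh tserver1.example.com "kill -9 \$(pgrep -f 'accumulo.*tserver')"
```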

 

On Wed, Aug 18, 2021 at 7:31 AM Ligade, Shailesh [USA] <Ligade_Shailesh@bah.com> wrote:

Thank you for the good explanation! I really appreciate it.

 

Yes, I need to remove the hardware, meaning I need to stop everything on the server (tserver and datanode).

 

One quick question:

 

What is the difference between accumulo admin stop <tserver>:9997 and stopping the tserver Linux service?

 

When I issue admin stop, I can see from the monitor that the hosted tablets count for the tserver in question goes down to 0; however, it doesn't stop the tserver process or service.

 

In your steps, you are stopping the datanode service first (adding it to the exclude file, running refreshNodes, and then stopping the service). I was thinking of stopping the Accumulo tserver and letting it hand off its hosted tablets first, before touching the datanode. Will there be any difference? I am just trying to understand what the relationship between Accumulo and Hadoop is.

 

Thank you!

 

-S

  _____  

From: dev1@etcoleman.com <dev1@etcoleman.com>
Sent: Tuesday, August 17, 2021 2:39 PM
To: user@accumulo.apache.org
Subject: [External] RE: how to decommission tablet server 

 

Maybe you could clarify. Decommissioning tablet servers and HDFS replication are separate and distinct issues. Accumulo is generally unaware of HDFS replication, and tablet assignment does not change the HDFS replication. You can set the replication factor for a table, but that is only used on writes to HDFS; Accumulo assumes that once a write succeeds, HDFS is managing the details.

 

When a tablet is assigned / migrated, the underlying files in HDFS are not changed: the file references are reassigned in a metadata operation, but the files themselves are not modified. They will keep whatever replication factor was assigned and whatever the namenode decides.

 

If you are removing servers that have both data nodes and tserver processes running: 

 

If you stop / kill the tserver, the tablets assigned to that server will be reassigned rather quickly. It is only a metadata update. The exact timing will depend on your ZooKeeper time-out setting, but the “dead” tserver should be detected and its tablets reassigned in short order. The reassignment may cause some churn of assignments if the cluster becomes unbalanced. The manager (master) will select tablets from tservers that are over-subscribed and assign them to tservers that have fewer tablets; you can monitor the manager (master) debug log to see the migration progress. If you want to be gentle, stop a tserver, wait for the number of unassigned tablets to hit zero and migration to settle, and then repeat.

 

If you want to stop the data nodes, you can do that independently of Accumulo; just follow the Hadoop data node decommission process. Hadoop will move the data blocks assigned to the data node so that it is “safe” to then stop the data node process. This is independent of Accumulo, and Accumulo will not be aware that the blocks are moving. If you are running compactions, Accumulo may try to write blocks locally, but if the data node is rejecting new block assignments (which I rather assume it would when in decommission mode), Accumulo still would not care. If somehow new blocks were written, it may just delay the Hadoop data node decommissioning.
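
The standard Hadoop data node decommission flow referenced above looks roughly like this; the exclude-file path is an assumption and must match your dfs.hosts.exclude setting:

```shell
# Add the host to the exclude file the namenode reads
echo 'tserver1.example.com' >> /etc/hadoop/conf/dfs.exclude

# Ask the namenode to re-read its include/exclude lists
hdfs dfsadmin -refreshNodes

# Watch until the node shows "Decommission Status : Decommissioned"
hdfs dfsadmin -report | grep -A 2 'tserver1.example.com'
```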

 

If you are running ingest while killing tservers, things should mostly work. There may be ingest failures, but normally things get retried and the subsequent attempt should succeed. The issue is that if, by bad luck, the work keeps getting assigned to tservers that are then killed, you could exceed the number of retries and the ingest would fail outright. If you can pause ingest, that limits the chance. If you can monitor your ingest and know when an ingest failed, you could just reschedule it (for bulk import). If you are doing continuous ingest, it may be harder to determine whether a specific ingest failed, so you may need to select an appropriate range for replay. Overall it should mostly work; it will depend on your processes and your tolerance for any particular data loss on an ingest.

 

The modest approach (if you can accept transient errors):

 

1 Start the data node decommission process.

2 Pause ingest and cancel any running user compactions.

3 Stop a tserver and wait for unassigned tablets to go back to 0.  Wait for the tablet migration (if any) to quiet down. 

4 Repeat 3 until all tserver processes have been stopped on the nodes you are removing.

5 Restart ingest – rerun any user compactions if you stopped any.

6 Wait for the hdfs decommission process to finish moving / replicating blocks.

7 Stop the data node process.

8 Do what you want with the node.
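
Steps 3-4 above can be sketched as a loop, one tserver at a time. The host list and settle time are assumptions; the monitor's unassigned-tablet count is the real signal to watch before moving to the next node:

```shell
for host in tserver1.example.com tserver2.example.com; do
  # Graceful stop; tablets migrate off before the process exits
  accumulo admin stop "$host:9997"
  # Crude settle delay; confirm unassigned tablets are back to 0 on the monitor
  sleep 120
done
```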

 

You do not need to schedule downtime if you can accept transient errors. Say a client scan is running and that tserver is stopped: the client may receive an error for the scan. If the scan is resubmitted and the tablet has been reassigned, it should work; it may pause for the reassignment and / or time out if the assignment takes some time. You are basically playing a numbers game here: the number of tablets, the number of unassigned tablets, the odds that a scan would be using a particular tablet for the duration that it is unavailable. It’s not guaranteed that a scan will fail, it’s just that there is a greater than 0 chance that it could. If that is unacceptable, then:

 

1 Stop ingest. Wait for all of it to finish, or mark which jobs will need to be rescheduled.

2 Stop Accumulo

3 Remove the tservers from the servers list

4 Start Accumulo without starting the decommissioned tserver nodes.
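
Under a 1.x layout, the full-outage variant above is roughly as follows; the paths and hostname are assumptions:

```shell
"$ACCUMULO_HOME/bin/stop-all.sh"       # 2. stop Accumulo cluster-wide

# 3. remove the decommissioned tserver from the server list
sed -i '/^tserver1\.example\.com$/d' "$ACCUMULO_HOME/conf/slaves"

"$ACCUMULO_HOME/bin/start-all.sh"      # 4. start without the removed node
```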

 

Do what you want with the data node decommissioning.

 

The latter approach removes possible transient issues. It is up to you to weigh your tolerance for possible transient issues for the duration that tservers are being stopped against a complete outage for the duration that Accumulo is down. If it is a large cluster and just a few tservers, the odds of a specific tablet being offline for a short duration may be very low. If it is a small cluster, or the percentage of tservers you are stopping is large, the odds increase, but the issues will still be transient. You need to decide which is acceptable for you and your circumstances.

 

From: Shailesh Ligade <SLIGADE@FBI.GOV>
Sent: Tuesday, August 17, 2021 11:26 AM
To: user@accumulo.apache.org
Subject: RE: how to decommission tablet server

 

It would be helpful to know: when you are decommissioning tablet servers (one at a time, so the underlying HDFS can replicate), do we need Accumulo downtime? Can Accumulo keep ingesting while we are decommissioning?

 

Thanks

 

-S

 


RE: [External] RE: how to decommission tablet server

Posted by de...@etcoleman.com.
The accumulo admin stop should attempt to unload the tablets a little more gracefully than just stopping the service. I don't know why it's not stopping the service; I do know that some people had issues when the services were controlled with systemd. Stopping the service without the admin command (either kill [pid] or a service stop) is not catastrophic: it's basically just like the node / service failed, which Accumulo should handle. One note: if you can identify the servers that may be hosting the metadata and root tablets, stop them last. That reduces the chance that they would migrate to a server that is about to be decommissioned.
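
One way to see which tservers currently host the metadata tablets is to scan the loc column family of the accumulo.root table (the credentials below are placeholders; note the root tablet's own location is kept in ZooKeeper, not in a table):

```shell
# In 1.x, locations of accumulo.metadata tablets are recorded in accumulo.root
accumulo shell -u root -p secret -e 'scan -t accumulo.root -c loc'
```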

 

Sorry if I indicated that you should stop the data node first. You can start the Hadoop decommission first and then, when the block migration has completed, stop the data node process. You can stop Accumulo tservers in parallel with that decommissioning. An Accumulo process running on a node and a data node process running on that same node are independent: Accumulo uses Hadoop / the namenode to persist files, and the namenode coordinates with the data nodes where those blocks are stored.

 

You likely can edit the hosts files either before or after. If you elect to do it after, make sure that you disable anything that could restart the tserver once it is stopped on a decommissioning node, say by running accumulo start-all or through an auto-service restart; that would end up restarting the services.

 

From: Ligade, Shailesh [USA] <Li...@bah.com> 
Sent: Wednesday, August 18, 2021 7:31 AM
To: user@accumulo.apache.org
Subject: Re: [External] RE: how to decommission tablet server

 

Thank you for good explanation! I really appreciate that.

 

Yes I need to remove the hardware, meaning I need to stop everything on the
server (tserver and datanode)

 

One quick question:

 

What is the difference between accumulo admin stop <tserver>:9997 and
stopping tserver linux service?

 

When I issue admin stop, I can see, from the monitor, hosted tablets count
from the tserver in the question  goes down to 0, however it doesn't stop
the tserver process or service.

 

In your steps, you are stopping datanode service first (adding into exclude
file and then running refreshNodes and then stop the service), I was
thinking to stop accumulo tserver and let it handle hosted tablets first,
before touching datanode, will there be any difference? Just trying to
understand how the relationship between accumulo and hadoop is.

 

Thank you!

 

-S

  _____  

From: dev1@etcoleman.com <ma...@etcoleman.com>  <dev1@etcoleman.com
<ma...@etcoleman.com> >
Sent: Tuesday, August 17, 2021 2:39 PM
To: user@accumulo.apache.org <ma...@accumulo.apache.org>
<user@accumulo.apache.org <ma...@accumulo.apache.org> >
Subject: [External] RE: how to decommission tablet server 

 

Maybe you could clarify.  Decommissioning tablet servers and hdfs
replication are separate and distinct issues.  Accumulo will generally be
unaware of hdfs replication and table assignment does not change the hdfs
replication.  You can set the replication factor for a tablet - but that is
used on writes to hdfs - Accumulo will assume that on any successful write,
on return hdfs  is managing the details.

 

When a tablet is assigned / migrated, the underlying files in hdfs are not
changed - the file references are reassigned in a metadata operation, but
the files themselves are not modified.  They will maintain whatever
replication factor that was assigned and whatever the namenode decides.

 

If you are removing servers that have both data nodes and tserver processes
running: 

 

If you stop / kill the tserver, the tablets assigned to that server will be
reassigned rather quickly.  It is only an metadata update.  The exact timing
will depend on your ZooKeeper time-out setting, but the "dead" tserver
should be detected and reassigned in short order. The reassignment may cause
some churn of assignments if the cluster becomes un-balanced.   The manager
(master) will select tablets from tservers that are over-subscribed and then
assign them to tservers that have fewer tablets - you can monitor the
manager (master) debug log to see the migration progress.  If you want to be
gentile, stop a tserver, wait for the number of unassigned tables to hit
zero and migration to settle and then repeat.

 

If you want to stop the data nodes, you can do that independently of
Accumulo - just follow the Hadoop data node decommission process.  Hadoop
will move the data blocks assigned to the data node so that it is "safe" to
then stop the data node process.  This is independent of Accumulo and
Accumulo will not be aware that the blocks are moving.  If you are running
compactions, Accumulo may try to write blocks locally, but if the data node
is rejecting new block assignments (which I rather assume that it would when
in decommission mode) then Accumulo still would not care.  If somehow new
blocks where written it may just delay the Hadoop data node decommissioning.

 

If you are running ingest while killing tservers - things should mostly
work - there may be ingest failures, but normally things would get retried
and the subsequent effort should succeed.  The issue is that if, by bad
luck, the work keeps getting assigned to tservers that are then killed, you
could end up exceeding the number of retries and the ingest would fail
outright.  If you can pause ingest, then this limits that chance.  If you
can monitor your ingest and know when an ingest failed, you could just
reschedule the ingest (for bulk import).  If you are doing continuous
ingest, it may be harder to determine if a specific ingest fails, so you
may need to select an appropriate range for replay.  Overall it may mostly
work - it will depend on your processes and your tolerance for any
particular data loss on an ingest.

 

The modest approach (if you can accept transient errors):

 

1 Start the data node decommission process.

2 Pause ingest and cancel any running user compactions.

3 Stop a tserver and wait for unassigned tablets to go back to 0.  Wait for
the tablet migration (if any) to quiet down. 

4 Repeat 3 until all tserver processes have been stopped on the nodes you
are removing.

5 Restart ingest - rerun any user compactions if you stopped any.

6 Wait for the hdfs decommission process to finish moving / replicating
blocks.

7 stop the data node process.

8 do what you want with the node.
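
As a sketch only (the hostnames and the monitoring step are assumptions,
not from the thread), steps 3 and 4 above amount to a rolling loop like
this; it writes the planned commands to a file as a dry run rather than
executing them:

```shell
# Hypothetical rolling stop of the tservers being removed (steps 3-4).
# Dry run: commands are written to a plan file instead of being executed.
PLAN=./decommission-plan.txt
: > "$PLAN"

for TS in tserver07.example.com tserver08.example.com; do
    # Graceful stop; the manager reassigns this server's tablets.
    echo "accumulo admin stop ${TS}:9997" >> "$PLAN"
    # In a real run, wait here until the monitor shows 0 unassigned
    # tablets and migrations have settled before moving to the next one.
done

cat "$PLAN"
```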

 

You do not need to schedule down time - if you can accept transient
errors.  Say that a client scan is running and that tserver is stopped -
the client may receive an error for the scan.  If the scan is resubmitted
and the tablet has been reassigned it should work - it may pause for the
reassignment and / or time out if the assignment takes some time.  You are
basically playing a numbers game here - the number of tablets, the number
of unassigned tablets, the odds that a scan would be using a particular
tablet for the duration that it is unavailable.  It's not guaranteed that
it will fail, it's just that there is a greater than 0 chance that it
could - if that is unacceptable then:

 

1 Stop ingest - wait for all to finish or mark which ones will need to be
rescheduled

2 Stop Accumulo

3 Remove the tservers from the servers list

4 Start Accumulo without starting the decommissioned tserver nodes.

 

Do what you want with the data node decommissioning.
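
For step 3, on Accumulo 1.x the server list is the conf/slaves file.  A
sketch below - the path, the stand-in file contents, and the hostnames are
all assumptions for illustration, not your actual layout:

```shell
# Hypothetical edit of the tserver host list for the full-outage approach.
# On Accumulo 1.x this is conf/slaves; path and names are assumptions.
SLAVES=./conf/slaves
mkdir -p ./conf
printf '%s\n' tserver01.example.com tserver07.example.com > "$SLAVES"  # stand-in

# Remove the decommissioned host (keep a backup first).
cp "$SLAVES" "$SLAVES.bak"
grep -v -F -x 'tserver07.example.com' "$SLAVES.bak" > "$SLAVES"
cat "$SLAVES"
```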

 

The latter approach removes possible transient issues.  It is up to you to
determine your tolerance for possible transient issues for the duration
that tservers are being stopped vs a complete outage for the duration that
Accumulo is down.  If it is a large cluster and just a few tservers, the
odds of a specific tablet being offline for a short duration may be very
low.  If it is a small cluster, or the percentage of tservers that you are
stopping is large, then the odds increase, but the issues will still be
transient.  You need to decide which is acceptable to you and your
circumstances.

 

From: Shailesh Ligade <SLIGADE@FBI.GOV <ma...@FBI.GOV> > 
Sent: Tuesday, August 17, 2021 11:26 AM
To: user@accumulo.apache.org <ma...@accumulo.apache.org> 
Subject: RE: how to decommission tablet server

 

It would be helpful to know whether we need accumulo downtime when
decommissioning tablets (one at a time, for the underlying hdfs to
replicate). Can accumulo be ingesting while we are decommissioning tablets?

 

Thanks

 

-S

 

From: Shailesh Ligade <SLIGADE@FBI.GOV <ma...@FBI.GOV> > 
Sent: Tuesday, August 17, 2021 8:52 AM
To: user@accumulo.apache.org <ma...@accumulo.apache.org> 
Subject: [EXTERNAL EMAIL] - how to decommission tablet server

 

Hello,

 

I am using accumulo 1.10 and want to remove a few tablet servers.

 

I saw in the documentation that I need to run

 

accumulo admin stop <tserver>:9997

 

That command comes back quickly - I am not sure how long, if at all, I
have to wait before I stop the tserver service. When is the right time to
stop the datanode service (running on the same tablet server)? And when
should the slaves files (for accumulo and hdfs) be updated?

 

Any guidelines on this?

 

Thanks

 

-S

 

 

 


Re: [External] RE: how to decommission tablet server

Posted by "Ligade, Shailesh [USA]" <Li...@bah.com>.
Thank you for the good explanation! I really appreciate it.

Yes, I need to remove the hardware, meaning I need to stop everything on the server (tserver and datanode).

One quick question:

What is the difference between accumulo admin stop <tserver>:9997 and stopping tserver linux service?

When I issue admin stop, I can see from the monitor that the hosted tablet count for the tserver in question goes down to 0; however, it doesn't stop the tserver process or service.

In your steps, you stop the datanode service first (adding it to the exclude file, running refreshNodes, and then stopping the service). I was thinking of stopping the accumulo tserver and letting it hand off its hosted tablets first, before touching the datanode - would there be any difference? Just trying to understand what the relationship between accumulo and hadoop is.

Thank you!

-S
________________________________
From: dev1@etcoleman.com <de...@etcoleman.com>
Sent: Tuesday, August 17, 2021 2:39 PM
To: user@accumulo.apache.org <us...@accumulo.apache.org>
Subject: [External] RE: how to decommission tablet server


Maybe you could clarify.  Decommissioning tablet servers and hdfs replication are separate and distinct issues.  Accumulo will generally be unaware of hdfs replication, and tablet assignment does not change the hdfs replication.  You can set the replication factor for a table – but that is used on writes to hdfs – Accumulo will assume that once a write returns successfully, hdfs is managing the details.



When a tablet is assigned / migrated, the underlying files in hdfs are not changed – the file references are reassigned in a metadata operation, but the files themselves are not modified.  They will maintain whatever replication factor that was assigned and whatever the namenode decides.



If you are removing servers that have both data nodes and tserver processes running:



If you stop / kill the tserver, the tablets assigned to that server will be reassigned rather quickly.  It is only a metadata update.  The exact timing will depend on your ZooKeeper time-out setting, but the “dead” tserver should be detected and its tablets reassigned in short order. The reassignment may cause some churn of assignments if the cluster becomes un-balanced.  The manager (master) will select tablets from tservers that are over-subscribed and then assign them to tservers that have fewer tablets – you can monitor the manager (master) debug log to see the migration progress.  If you want to be gentle, stop a tserver, wait for the number of unassigned tablets to hit zero and migration to settle, and then repeat.



If you want to stop the data nodes, you can do that independently of Accumulo – just follow the Hadoop data node decommission process.  Hadoop will move the data blocks assigned to the data node so that it is “safe” to then stop the data node process.  This is independent of Accumulo and Accumulo will not be aware that the blocks are moving.  If you are running compactions, Accumulo may try to write blocks locally, but if the data node is rejecting new block assignments (which I assume it would when in decommission mode) then Accumulo still would not care.  If somehow new blocks were written, it may just delay the Hadoop data node decommissioning.



If you are running ingest while killing tservers – things should mostly work – there may be ingest failures, but normally things would get retried and the subsequent effort should succeed.  The issue is that if, by bad luck, the work keeps getting assigned to tservers that are then killed, you could end up exceeding the number of retries and the ingest would fail outright.  If you can pause ingest, then this limits that chance.  If you can monitor your ingest and know when an ingest failed, you could just reschedule the ingest (for bulk import).  If you are doing continuous ingest, it may be harder to determine if a specific ingest fails, so you may need to select an appropriate range for replay.  Overall it may mostly work – it will depend on your processes and your tolerance for any particular data loss on an ingest.



The modest approach (if you can accept transient errors):



1 Start the data node decommission process.

2 Pause ingest and cancel any running user compactions.

3 Stop a tserver and wait for unassigned tablets to go back to 0.  Wait for the tablet migration (if any) to quiet down.

4 Repeat 3 until all tserver processes have been stopped on the nodes you are removing.

5 Restart ingest – rerun any user compactions if you stopped any.

6 Wait for the hdfs decommission process to finish moving / replicating blocks.

7 stop the data node process.

8 do what you want with the node.



You do not need to schedule down time – if you can accept transient errors.  Say that a client scan is running and that tserver is stopped – the client may receive an error for the scan.  If the scan is resubmitted and the tablet has been reassigned it should work – it may pause for the reassignment and / or time out if the assignment takes some time.  You are basically playing a numbers game here – the number of tablets, the number of unassigned tablets, the odds that a scan would be using a particular tablet for the duration that it is unavailable.  It’s not guaranteed that it will fail, it’s just that there is a greater than 0 chance that it could – if that is unacceptable then:



1 Stop ingest – wait for all to finish or mark which ones will need to be rescheduled

2 Stop Accumulo

3 Remove the tservers from the servers list

4 Start Accumulo without starting the decommissioned tserver nodes.



Do what you want with the data node decommissioning.



The latter approach removes possible transient issues.  It is up to you to determine your tolerance for possible transient issues for the duration that tservers are being stopped vs a complete outage for the duration that Accumulo is down.  If it is a large cluster and just a few tservers, the odds of a specific tablet being offline for a short duration may be very low.  If it is a small cluster, or the percentage of tservers that you are stopping is large, then the odds increase, but the issues will still be transient.  You need to decide which is acceptable to you and your circumstances.



From: Shailesh Ligade <SL...@FBI.GOV>
Sent: Tuesday, August 17, 2021 11:26 AM
To: user@accumulo.apache.org
Subject: RE: how to decommission tablet server



It would be helpful to know whether we need accumulo downtime when decommissioning tablets (one at a time, for the underlying hdfs to replicate). Can accumulo be ingesting while we are decommissioning tablets?



Thanks



-S



From: Shailesh Ligade <SL...@FBI.GOV>>
Sent: Tuesday, August 17, 2021 8:52 AM
To: user@accumulo.apache.org<ma...@accumulo.apache.org>
Subject: [EXTERNAL EMAIL] - how to decommission tablet server



Hello,



I am using accumulo 1.10 and want to remove a few tablet servers.



I saw in the documentation that I need to run



accumulo admin stop <tserver>:9997



That command comes back quickly - I am not sure how long, if at all, I have to wait before I stop the tserver service. When is the right time to stop the datanode service (running on the same tablet server)? And when should the slaves files (for accumulo and hdfs) be updated?



Any guidelines on this?



Thanks



-S







RE: how to decommission tablet server

Posted by de...@etcoleman.com.
Maybe you could clarify.  Decommissioning tablet servers and hdfs
replication are separate and distinct issues.  Accumulo will generally be
unaware of hdfs replication, and tablet assignment does not change the hdfs
replication.  You can set the replication factor for a table - but that is
used on writes to hdfs - Accumulo will assume that once a write returns
successfully, hdfs is managing the details.
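
As an illustration, the per-table write replication can be set from the
Accumulo shell.  A hedged sketch only - the table name and user are
placeholders, and the property affects only files written after the
change; printed here rather than executed:

```shell
# Hypothetical: set the HDFS replication used for a table's new files.
# "mytable" and "root" are placeholders; table.file.replication applies
# only to files written after the change, not to existing ones.
SET_REPL='accumulo shell -u root -e "config -t mytable -s table.file.replication=2"'
echo "$SET_REPL"   # dry run - run the command itself against a live cluster
```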

 

When a tablet is assigned / migrated, the underlying files in hdfs are not
changed - the file references are reassigned in a metadata operation, but
the files themselves are not modified.  They will maintain whatever
replication factor that was assigned and whatever the namenode decides.

 

If you are removing servers that have both data nodes and tserver processes
running: 

 

If you stop / kill the tserver, the tablets assigned to that server will be
reassigned rather quickly.  It is only a metadata update.  The exact timing
will depend on your ZooKeeper time-out setting, but the "dead" tserver
should be detected and its tablets reassigned in short order. The
reassignment may cause some churn of assignments if the cluster becomes
un-balanced.  The manager (master) will select tablets from tservers that
are over-subscribed and then assign them to tservers that have fewer
tablets - you can monitor the manager (master) debug log to see the
migration progress.  If you want to be gentle, stop a tserver, wait for the
number of unassigned tablets to hit zero and migration to settle, and then
repeat.

 

If you want to stop the data nodes, you can do that independently of
Accumulo - just follow the Hadoop data node decommission process.  Hadoop
will move the data blocks assigned to the data node so that it is "safe" to
then stop the data node process.  This is independent of Accumulo and
Accumulo will not be aware that the blocks are moving.  If you are running
compactions, Accumulo may try to write blocks locally, but if the data node
is rejecting new block assignments (which I assume it would when in
decommission mode) then Accumulo still would not care.  If somehow new
blocks were written, it may just delay the Hadoop data node decommissioning.

 

If you are running ingest while killing tservers - things should mostly
work - there may be ingest failures, but normally things would get retried
and the subsequent effort should succeed.  The issue is that if, by bad
luck, the work keeps getting assigned to tservers that are then killed, you
could end up exceeding the number of retries and the ingest would fail
outright.  If you can pause ingest, then this limits that chance.  If you
can monitor your ingest and know when an ingest failed, you could just
reschedule the ingest (for bulk import).  If you are doing continuous
ingest, it may be harder to determine if a specific ingest fails, so you
may need to select an appropriate range for replay.  Overall it may mostly
work - it will depend on your processes and your tolerance for any
particular data loss on an ingest.

 

The modest approach (if you can accept transient errors):

 

1 Start the data node decommission process.

2 Pause ingest and cancel any running user compactions.

3 Stop a tserver and wait for unassigned tablets to go back to 0.  Wait for
the tablet migration (if any) to quiet down. 

4 Repeat 3 until all tserver processes have been stopped on the nodes you
are removing.

5 Restart ingest - rerun any user compactions if you stopped any.

6 Wait for the hdfs decommission process to finish moving / replicating
blocks.

7 stop the data node process.

8 do what you want with the node.

 

You do not need to schedule down time - if you can accept transient
errors.  Say that a client scan is running and that tserver is stopped -
the client may receive an error for the scan.  If the scan is resubmitted
and the tablet has been reassigned it should work - it may pause for the
reassignment and / or time out if the assignment takes some time.  You are
basically playing a numbers game here - the number of tablets, the number
of unassigned tablets, the odds that a scan would be using a particular
tablet for the duration that it is unavailable.  It's not guaranteed that
it will fail, it's just that there is a greater than 0 chance that it
could - if that is unacceptable then:

 

1 Stop ingest - wait for all to finish or mark which ones will need to be
rescheduled

2 Stop Accumulo

3 Remove the tservers from the servers list

4 Start Accumulo without starting the decommissioned tserver nodes.

 

Do what you want with the data node decommissioning.

 

The latter approach removes possible transient issues.  It is up to you to
determine your tolerance for possible transient issues for the duration
that tservers are being stopped vs a complete outage for the duration that
Accumulo is down.  If it is a large cluster and just a few tservers, the
odds of a specific tablet being offline for a short duration may be very
low.  If it is a small cluster, or the percentage of tservers that you are
stopping is large, then the odds increase, but the issues will still be
transient.  You need to decide which is acceptable to you and your
circumstances.

 

From: Shailesh Ligade <SL...@FBI.GOV> 
Sent: Tuesday, August 17, 2021 11:26 AM
To: user@accumulo.apache.org
Subject: RE: how to decommission tablet server

 

It would be helpful to know whether we need accumulo downtime when
decommissioning tablets (one at a time, for the underlying hdfs to
replicate). Can accumulo be ingesting while we are decommissioning tablets?

 

Thanks

 

-S

 

From: Shailesh Ligade <SLIGADE@FBI.GOV <ma...@FBI.GOV> > 
Sent: Tuesday, August 17, 2021 8:52 AM
To: user@accumulo.apache.org <ma...@accumulo.apache.org> 
Subject: [EXTERNAL EMAIL] - how to decommission tablet server

 

Hello,

 

I am using accumulo 1.10 and want to remove a few tablet servers.

 

I saw in the documentation that I need to run

 

accumulo admin stop <tserver>:9997

 

That command comes back quickly - I am not sure how long, if at all, I
have to wait before I stop the tserver service. When is the right time to
stop the datanode service (running on the same tablet server)? And when
should the slaves files (for accumulo and hdfs) be updated?

 

Any guidelines on this?

 

Thanks

 

-S

 

 

 


RE: how to decommission tablet server

Posted by Shailesh Ligade <SL...@FBI.GOV>.
It would be helpful to know whether we need accumulo downtime when decommissioning tablets (one at a time, for the underlying hdfs to replicate). Can accumulo be ingesting while we are decommissioning tablets?

Thanks

-S

From: Shailesh Ligade <SL...@FBI.GOV>
Sent: Tuesday, August 17, 2021 8:52 AM
To: user@accumulo.apache.org
Subject: [EXTERNAL EMAIL] - how to decommission tablet server

Hello,

I am using accumulo 1.10 and want to remove a few tablet servers.

I saw in the documentation that I need to run

accumulo admin stop <tserver>:9997

That command comes back quickly - I am not sure how long, if at all, I have to wait before I stop the tserver service. When is the right time to stop the datanode service (running on the same tablet server)? And when should the slaves files (for accumulo and hdfs) be updated?

Any guidelines on this?

Thanks

-S