Posted to user@accumulo.apache.org by Steven Troxell <st...@gmail.com> on 2012/08/06 16:25:09 UTC

Time requirement between shutting down tablet servers?

Is there a problem with shutting down tablet servers in quick succession?
I am attempting to scale back from 10 tservers to 2 for benchmark testing,
but I am running into problems where, at some point, the monitor stops
showing the remaining servers (the ones I hadn't gotten to kill yet) as
online.  I see numerous Connection refused and unable to recover errors in
my logs, but there's no consistency in how many servers I can shut down
before I lose everything.  The only pattern I've picked up on is a higher
success rate when I leave larger gaps of time between shutting down
servers.  Is this reasonable/expected behavior?

I am using the bin/stop-here.sh command to kill servers.  Alternatively, I
have tried ./bin/stop-all.sh, then running ./bin/start-here.sh on the master
and on the individual tablet servers I want running, but that doesn't seem
to bring them up.

Thanks,
Steve

Re: Time requirement between shutting down tablet servers?

Posted by David Medinets <da...@gmail.com>.
On Mon, Aug 6, 2012 at 1:01 PM, John Vines <vi...@apache.org> wrote:
> Perhaps we should direct stop-here.sh to utilize admin stop. Or at the very
> least rename stop-here to kill-here to make it clear that it's rough around
> the edges.

I like the name change that indicates intent.

Re: Time requirement between shutting down tablet servers?

Posted by John Vines <vi...@apache.org>.
Perhaps we should direct stop-here.sh to utilize admin stop. Or at the very
least rename stop-here to kill-here to make it clear that it's rough around
the edges.

John


Re: Time requirement between shutting down tablet servers?

Posted by Steven Troxell <st...@gmail.com>.
Thanks Eric,

I think that's the command Adam gave me initially and that I was trying to
recall; it looks familiar, anyway.  I don't remember when or why I switched
from using that to stop-here.sh.




Re: Time requirement between shutting down tablet servers?

Posted by Eric Newton <er...@gmail.com>.
You are killing loggers, which means that recovery cannot take place when
tablets are moved to the remaining servers.

Try:

$ ./bin/accumulo admin stop host:port

This will gracefully stop the tserver and logger on that machine, and flush
the tablets with references to logs on that machine.

-Eric
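
For scaling back several servers at once, the admin stop command above could be wrapped in a small loop.  The following is only a sketch under stated assumptions: the host names, the ACCUMULO_HOME, TSERVER_PORT, and DRY_RUN variables, and the two function names are made up for illustration, and 9997 is assumed to be the tserver client port (check your own configuration).  By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: gracefully stop a list of tablet servers, one at a time,
# using "accumulo admin stop host:port" instead of stop-here.sh.
# All variable and function names here are hypothetical.
ACCUMULO_HOME="${ACCUMULO_HOME:-.}"     # assumed install directory
TSERVER_PORT="${TSERVER_PORT:-9997}"    # assumed tserver client port
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands only

stop_tserver() {
    # $1 is the host whose tserver should be stopped
    if [ "$DRY_RUN" = "1" ]; then
        echo "$ACCUMULO_HOME/bin/accumulo admin stop $1:$TSERVER_PORT"
    else
        "$ACCUMULO_HOME/bin/accumulo" admin stop "$1:$TSERVER_PORT"
    fi
}

stop_all() {
    # Stop each host given as an argument; give up on the first failure
    # so a problem is noticed before more servers are taken down.
    for host in "$@"; do
        stop_tserver "$host" || {
            echo "failed to stop $host" >&2
            return 1
        }
    done
}

stop_all "$@"
```

Saved as, say, stop-tservers.sh, it would be run as "DRY_RUN=0 sh stop-tservers.sh host3 host4" with the names of the servers to remove.  Stopping on the first failure is a deliberate choice here, so that a failed shutdown is investigated before any further servers are stopped.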
