Posted to user@accumulo.apache.org by Denis <de...@camfex.cz> on 2015/01/10 16:31:05 UTC

Offline tables on adding a tserver (Accumulo 1.6 regression?)

Hi

I recently upgraded my Accumulo cluster from 1.4 to 1.6 and noticed a
regression.

Removing a tserver takes some tablets offline for a while until
other tservers start handling them; that's normal.

But with 1.6 the same happens when adding a tserver as well.
Is that expected?

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Josh Elser <jo...@gmail.com>.
Yes, presently our logging infrastructure doesn't work well with 
"standard" log4j configuration (I think we have a ticket floating around 
somewhere, but I don't recall the ID off the top of my head).

If your concern is only with using a properties configuration file 
instead of XML, ACCUMULO-2383 introduced the means to do so. For the 
time being, you'll have to use generic_logger.(xml|properties) as a way 
to configure log4j. You can also create specific 
*_logger.(xml|properties) to configure certain components if necessary 
(e.g. tserver_logger.xml or master_logger.xml).

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Denis <de...@camfex.cz>.
Well, that depends on what is in -Dlog4j.configuration=

If there is a URI
(-Dlog4j.configuration=file:/path/to/log4j.configuration) as log4j
expects, Accumulo stops file-based logging as soon as it starts log
forwarding.

If there is a path
(-Dlog4j.configuration=/path/to/log4j.configuration) then Accumulo is
happy but log4j complains "Please initialize the log4j system properly"
and file-based logging does not start until log forwarding is enabled.

It is better not to use -Dlog4j.configuration at all but those
tservers are running with
-Dlog4j.configuration=file:/path/to/log4j.configuration :(
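
The two cases above differ only in whether the value carries a URI scheme. As a rough sketch (the function name is hypothetical, and the check is a simple pattern match, not log4j's own parsing), a shell function can tell which form a given value takes:

```shell
# Hypothetical helper: classify a -Dlog4j.configuration value as a URI
# (starts with a scheme such as "file:") or a bare filesystem path.
log4j_config_kind() {
    case "$1" in
        [A-Za-z]*:*) echo "uri" ;;
        *)           echo "path" ;;
    esac
}

log4j_config_kind "file:/path/to/log4j.configuration"   # prints: uri
log4j_config_kind "/path/to/log4j.configuration"        # prints: path
```

The value actually in effect for a running tserver should be visible among the JVM arguments printed by `jps -v`.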


Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Josh Elser <jo...@gmail.com>.
Denis wrote:
>> Do you have any warnings/errors in the new server's logs?
>
> On smaller cluster where I try to reproduce the problem - no
>
> On big cluster, unfortunately, there are no local logs as the tserver
> logs were sent to the monitor:(
> At the moment I cannot add a new tserver there to collect new logs as
> clients are using the cluster.

Huh? Log-forwarding to the monitor doesn't preclude local file-based 
logging. You can have both.

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Denis <de...@camfex.cz>.
> Do you have any warnings/errors in the new server's logs?

On the smaller cluster where I tried to reproduce the problem: no.

On the big cluster, unfortunately, there are no local logs, as the tserver
logs were sent to the monitor :(
At the moment I cannot add a new tserver there to collect new logs, as
clients are using the cluster.

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Eric Newton <er...@gmail.com>.
The fact that the tablets are being taken offline means that the master is
actively trying to balance.

The master will periodically ask the new server to host the tablets.  Do
you have any warnings/errors in the new server's logs?

-Eric


Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Denis <de...@camfex.cz>.
>  If you jstack your new tablet server, does it show a deadlock?

No

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Eric Newton <er...@gmail.com>.
This may be a result of ACCUMULO-3372.  If you jstack your new tablet
server, does it show a deadlock?

$ jps -m
12345 Main tserver --address host:9997

$ jstack 12345 | grep -i deadlock
Deadlock detected

This particular bug only happens at start-up.  There's a trivial patch
(which you can find through the bug report), which will be in Accumulo
1.6.2.

-Eric
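
A minimal sketch of the check above as reusable shell functions. It assumes `jps` and `jstack` from the JDK are on the PATH and that the tablet server appears in `jps -m` output with `tserver` among its arguments, as in the example; the function names are made up for illustration.

```shell
# Hypothetical helper: succeeds if the thread dump on stdin mentions a deadlock.
has_deadlock() {
    grep -qi 'deadlock'
}

# Hypothetical helper: print the PIDs of local JVMs whose `jps -m` line
# has "tserver" as an argument (the first column of jps output is the PID).
tserver_pids() {
    jps -m | awk '/ tserver( |$)/ { print $1 }'
}

# Check every local tablet server and report one line per PID.
check_tservers() {
    for pid in $(tserver_pids); do
        if jstack "$pid" | has_deadlock; then
            echo "$pid: deadlock reported"
        else
            echo "$pid: no deadlock reported"
        fi
    done
}
```

Running `check_tservers` on the host of the newly added tablet server prints one line per tserver process found there.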


Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Denis <de...@camfex.cz>.
I have not yet tried anything newer than 1.6.1

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Josh Elser <el...@apache.org>.
Denis wrote:
> created https://issues.apache.org/jira/browse/ACCUMULO-3471

Thanks a bunch!

> BTW, In 1.6.1 also balancing may get stuck until the master server is restarted.

Is this a known issue in 1.6.1 that's been since fixed or is it still 
outstanding?

> But then, after the master restart, balancing works very
> "aggressively", putting many tablets offline for quite long time
> (minutes)

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Denis <de...@camfex.cz>.
created https://issues.apache.org/jira/browse/ACCUMULO-3471

BTW, in 1.6.1 balancing may also get stuck until the master server is restarted.
But then, after the master restart, balancing works very
"aggressively", taking many tablets offline for quite a long time
(minutes).

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Denis <de...@camfex.cz>.
Sometimes it was left unbalanced, with the new tserver hosting zero tablets
or far fewer than the others.
So I had to restart the master to initiate the balancing process.
Then balancing was performed slowly, without taking thousands of
tablets offline.

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by John Vines <vi...@apache.org>.
I have a hunch that the 1.4 version being used possibly had one or more of
the many bugs regarding balancing getting 'stuck', which was typically
resolved via bouncing the master. Denis, in 1.4 when you brought your
tserver back online, did you find that things were then balanced or did you
just have a tserver up and things were left unbalanced?

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Denis <de...@camfex.cz>.
yes, per server

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Sean Busbey <bu...@cloudera.com>.
On Sat, Jan 10, 2015 at 3:42 PM, Denis <de...@camfex.cz> wrote:
> On 1/10/15, Christopher <ct...@apache.org> wrote:
>> ...
>> 3) how many tablets do you have per server?....
>
> 3. about 6000


Just to confirm, this is 6000 tablets per-server and not 6000 tablets
per-table or overall, right?


-- 
Sean

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Josh Elser <jo...@gmail.com>.
Denis, could you please open an issue on JIRA with any specifics that 
you have? That would help us make sure that it doesn't get lost 
(especially since we were considering releasing a 1.6.2 soon). Thanks 
for asking, too.

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Christopher <ct...@apache.org>.
Minutes at a time is a lot of time. I think Eric Newton was looking at some
performance issues with assignments. This could be related to that.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Denis <de...@camfex.cz>.
1. Client requests timed out. Also, http://monitor/tables shows the
number of offline tablets for each table.
2. A few minutes (up to 10)
3. About 6000
4. Yes
5. I do not remember this problem with Accumulo 1.4.
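
For a rough sense of scale, here is a back-of-the-envelope sketch of how many tablets an even rebalance would move onto one new tserver. The figure of about 6000 tablets per server comes from the answers above; the cluster size of 10 tservers is purely hypothetical, and the even-spread target is an assumption about what a balancer aims for, not a description of Accumulo's actual algorithm.

```shell
# Hypothetical helper: tablets that land on one newly added tserver if
# all tablets are spread evenly across old_servers + 1 servers.
tablets_to_migrate() {
    per_server=$1    # tablets per existing tserver (about 6000 in this thread)
    old_servers=$2   # number of tservers before the addition (assumed value)
    total=$(( per_server * old_servers ))
    echo $(( total / (old_servers + 1) ))
}

tablets_to_migrate 6000 10   # prints: 5454
```

Each migrating tablet is briefly offline while it moves, so with thousands of tablets to move, the total disruption depends heavily on how quickly assignments complete.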

Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)

Posted by Christopher <ct...@apache.org>.
Adding a new tserver creates an imbalanced situation, where tablets are not
spread evenly across tablet servers. The tablet balancer in the master
server occasionally rebalances tablets. During the short period of time
those tablets are migrating, they will be temporarily offline. That should
have always been the case and would be perfectly normal.

I'm curious:
1) how did you detect these were offline?
2) how long were they offline?
3) how many tablets do you have per server?
4) are you using the default balancers?
5) in what sense do you mean "regression"? are you thinking this is linked
to a previous bug/issue?



--
Christopher L Tubbs II
http://gravatar.com/ctubbsii
