Posted to dev@hbase.apache.org by Apekshit Sharma <ap...@cloudera.com> on 2017/12/13 09:14:58 UTC

Can we always add new regions as CLOSED?

Hi folks,

Was debugging TestTruncateTableProcedure when I started thinking about this.
(That's one mean test! What nice fault tolerant tests!)

So the specific case: if we fail after adding new regions to meta (
TRUNCATE_TABLE_ADD_TO_META
<https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java#L127>),
then on recovery, AM treats those regions with null state as offline and
begins assigning them by itself. That is wrong since the truncate action is
not complete (the procedure will also try to assign them on recovery, and
there are locks to avoid simultaneous assigns, etc.).
The simple fix is to add regions with initial state CLOSED.
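
To make the failure mode above concrete, here is a toy sketch in plain Java.
It is not HBase code: the class RegionRecoverySketch, its State enum, and the
method regionsAutoAssignedOnRecovery are hypothetical names invented for
illustration, and the recovery rule is only an assumption based on the
behaviour described above (null state read as offline, offline regions
assigned automatically).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only; none of these names come from the HBase code base.
class RegionRecoverySketch {
    /** Simplified region states, loosely following the names used in this thread. */
    public enum State { OFFLINE, CLOSED, OPEN }

    /**
     * Hypothetical recovery pass. A null state is read as OFFLINE, and OFFLINE
     * regions are picked up for assignment without waiting for the procedure
     * that created them. CLOSED regions are left to the owning procedure.
     */
    public static List<String> regionsAutoAssignedOnRecovery(Map<String, State> meta) {
        List<String> picked = new ArrayList<>();
        for (Map.Entry<String, State> entry : meta.entrySet()) {
            State state = (entry.getValue() == null) ? State.OFFLINE : entry.getValue();
            if (state == State.OFFLINE) {
                picked.add(entry.getKey());
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        Map<String, State> meta = new LinkedHashMap<>();
        meta.put("region-a", null);          // added to meta with no state
        meta.put("region-b", State.CLOSED);  // added to meta as CLOSED
        // Only the region with no state is grabbed by the recovery pass.
        System.out.println(regionsAutoAssignedOnRecovery(meta));
    }
}
```

In this model only region-a is assigned behind the procedure's back; the
region written as CLOSED stays put until whatever operation added it decides
to open it.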

Then looking in other places, CreateTableProcedure seems like it should
suffer the same fate (CREATE_TABLE_ADD_TO_META
<https://github.com/apache/hbase/blob/677c1f2c635273eb823b91903dffdb2e587f5181/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateTableProcedure.java#L104>).
Should we add regions as CLOSED there too?
(The weird part is, it's not failing; looking into it.)

So the main question is, shouldn't we always add new regions to meta with
state as CLOSED?

Whatever operation is adding them will also be opening them if needed,
right? And no operation should be relying on this weird AM assumption to
complete its half-done job.

Food for thought - Some operations adding regions are: truncate table,
create table, modify table, clone snapshot, restore snapshot.

Can you imagine a case where not adding a new region as CLOSED makes sense?

-- Appy

Re: Can we always add new regions as CLOSED?

Posted by Stack <st...@duboce.net>.
On Wed, Dec 13, 2017 at 1:14 AM, Apekshit Sharma <ap...@cloudera.com> wrote:

> Hi folks,
>
> Was debugging TestTruncateTableProcedure when I started thinking about this.
> (That's one mean test! What nice fault tolerant tests!)
>
> So the specific case: If we fail after adding new regions to meta (
> TRUNCATE_TABLE_ADD_TO_META
> <https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java#L127>),
> then on recovery, AM assumes those regions with null state as offline and
> begins assigning them by itself which is wrong since truncate action is not
> complete (and it'll try to assign them too on recovery, and there are locks
> to avoid simultaneous assigns etc.)
> Simple fix is, add regions with initial state as CLOSED.
>
> Then looking in other places, CreateTableProcedure seems like it should
> suffer the same fate (CREATE_TABLE_ADD_TO_META
> <https://github.com/apache/hbase/blob/677c1f2c635273eb823b91903dffdb2e587f5181/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateTableProcedure.java#L104>).
> Should we add region as CLOSED there too?
> (Weird part is, it's not failing, looking into it)
>
> So the main question is, shouldn't we always add new regions to meta with
> state as CLOSED?
>
>
I've been here recently. OFFLINE?



> Whatever operation is adding them will also be opening them if needed,
> right? And no operation should be relying on this weird AM assumption to
> complete its half-done job.
>
> Food for thought - Some operations adding regions are: truncate table,
> create table, modify table, clone snapshot, restore snapshot.
>
> Can you imagine a case where not adding a new region as CLOSED makes sense?
>
>
None. There is adding and then AMv2 takes control.

These tests that double-kill to ensure each step is recoverable are great,
yeah, at finding dirty bugs.

I think truncate table is a little silly, though. If you proposed killing it,
I'd +1 it.

St.Ack


> -- Appy
>