Posted to user@hbase.apache.org by Tao Xiao <xi...@gmail.com> on 2014/10/14 05:13:56 UTC

Four regions were lost and how to recover ?

I have a table named *E_MP_DAY_READ_201409* and it has 171 regions. I can
check that with the command "hadoop dfs -ls
/hbase/data/default/E_MP_DAY_READ_201409", which printed the following
(173 items, two of which are the .tabledesc and .tmp directories):

[root@a05 /]# hadoop dfs -ls /hbase/data/default/E_MP_DAY_READ_201409
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Found 173 items
drwxr-xr-x   - hbase hbase          0 2014-08-15 09:13
/hbase/data/default/E_MP_DAY_READ_201409/.tabledesc
drwxr-xr-x   - hbase hbase          0 2014-08-15 09:13
/hbase/data/default/E_MP_DAY_READ_201409/.tmp
drwxr-xr-x   - hbase hbase          0 2014-10-13 15:36
/hbase/data/default/E_MP_DAY_READ_201409/0139c0c46d7bfd24b02bc8e4d09edcbc
drwxr-xr-x   - hbase hbase          0 2014-10-13 15:36
/hbase/data/default/E_MP_DAY_READ_201409/015e17f2279ac32b965f12e1827ac27a
drwxr-xr-x   - hbase hbase          0 2014-10-11 12:10
/hbase/data/default/E_MP_DAY_READ_201409/03cb9aaf122b1f1946e5cc82f822fcbf
drwxr-xr-x   - hbase hbase          0 2014-10-10 12:36
/hbase/data/default/E_MP_DAY_READ_201409/03f43abf8a145d616301ebb21f8b3d5d
drwxr-xr-x   - hbase hbase          0 2014-10-09 18:52
/hbase/data/default/E_MP_DAY_READ_201409/05f2727ef3443b22f1669972f3674e3b
... ... ...
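
The region count can be derived from a listing like the one above by
skipping the non-region entries. A minimal Python sketch, assuming that
region directories are exactly those whose last path component is a
32-character hex string (as in the listing above) and that each entry sits
on one line; region_dirs is a hypothetical helper name, not an HBase tool:

```python
import re
from typing import List

# Assumption: a region directory name is a 32-character lowercase hex string.
# Entries such as .tabledesc and .tmp do not match and are skipped.
REGION_DIR = re.compile(r"/([0-9a-f]{32})\s*$")


def region_dirs(listing: str) -> List[str]:
    """Return the encoded region names found in `hdfs dfs -ls` output text."""
    names = []
    for line in listing.splitlines():
        match = REGION_DIR.search(line)
        if match:
            names.append(match.group(1))
    return names


# Sample taken from the listing above, with each entry joined onto one line.
sample = "\n".join([
    "Found 173 items",
    "drwxr-xr-x   - hbase hbase          0 2014-08-15 09:13 "
    "/hbase/data/default/E_MP_DAY_READ_201409/.tabledesc",
    "drwxr-xr-x   - hbase hbase          0 2014-08-15 09:13 "
    "/hbase/data/default/E_MP_DAY_READ_201409/.tmp",
    "drwxr-xr-x   - hbase hbase          0 2014-10-13 15:36 "
    "/hbase/data/default/E_MP_DAY_READ_201409/0139c0c46d7bfd24b02bc8e4d09edcbc",
    "drwxr-xr-x   - hbase hbase          0 2014-10-13 15:36 "
    "/hbase/data/default/E_MP_DAY_READ_201409/015e17f2279ac32b965f12e1827ac27a",
])

print(len(region_dirs(sample)))
```

Run against the full listing, this kind of filter would count only the
region directories, which is the number to compare with the UI.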


But when I checked the status and regions of that table through the HBase
UI, some regions were not listed. The UI reported only 167 regions for the
table (see this screenshot
<http://imgbin.org/index.php?page=image&id=20236>), so 4 regions appear to
be lost or not managed by HBase.

Besides, I checked the start/end keys for that table. Normally, the start
key of the first region and end key of the last region should both be an
empty string.  However for table *E_MP_DAY_READ_201409*, the end key of the
last region is "40117135_20140906" (see this screenshot
<http://imgbin.org/index.php?page=image&id=20237>).
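
The start/end key check described above can be automated once the
(start key, end key) pairs have been copied out of the UI. A minimal
sketch, assuming the pairs are collected by hand and that keys are plain
ASCII strings (so Python string order matches byte order);
find_chain_problems is a hypothetical helper, not an HBase API:

```python
from typing import List, Tuple


def find_chain_problems(regions: List[Tuple[str, str]]) -> List[str]:
    """Report where a table's regions fail to cover the whole key space.

    `regions` holds (start_key, end_key) pairs. An empty string means
    "unbounded", as for the first start key and the last end key.
    """
    if not regions:
        return ["no regions"]
    problems = []
    # The empty start key sorts first, so the first region leads the list.
    ordered = sorted(regions, key=lambda region: region[0])
    if ordered[0][0] != "":
        problems.append("first start key is %r, expected ''" % ordered[0][0])
    # Each region's end key should equal the next region's start key.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if prev_end != next_start:
            problems.append(
                "hole or overlap between %r and %r" % (prev_end, next_start)
            )
    if ordered[-1][1] != "":
        problems.append("last end key is %r, expected ''" % ordered[-1][1])
    return problems


# Made-up two-region table whose last end key is not empty.
print(find_chain_problems([("", "1200"), ("1200", "40117135_20140906")]))
```

A non-empty last end key, as seen for this table, would show up as one of
the reported problems.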

I also checked the HBase Master log. It contains some entries about
"*E_MP_DAY_READ_201409*", for example:

2014-10-09 18:22:16,983 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Assigning
E_MP_DAY_READ_201409,110000702209_20140902,1411697475831.fd872b265b1de14ac0483d1e0833df9c.
to b01.jsepc.com,60020,1412730931587
2014-10-09 18:22:17,005 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Assigning
E_MP_DAY_READ_201409,110003028425_20140901,1412730964508.9f4889deefffe5b29c3a797177fdd546.
to b01.jsepc.com,60020,1412730931587
2014-10-09 18:22:17,049 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Assigning
E_MP_DAY_READ_201409,120000596228,1411785910190.bbbefc6de54fb6fa6f0c6747b2e21df4.
to a02.jsepc.com,60020,1412730931720
2014-10-09 18:22:17,078 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Assigning
E_MP_DAY_READ_201409,130003536756_20140910,1412060002495.4d747cf964cf0aaf3ce88bf6ed57d856.
to b04.jsepc.com,60020,1412730931859
2014-10-09 18:22:17,095 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Assigning
E_MP_DAY_READ_201409,120003489510_20140924,1412730963561.182f466d8c43fe28e81fabe6fed6a117.
to a09.jsepc.com,60020,1412730927959
2014-10-09 18:22:17,109 INFO org.apache.hadoop.hbase.master.RegionStates:
Onlined ae543aae3d2fc5af3d300e2961a070e1 on b01.jsepc.com
,60020,1412730931587
2014-10-09 18:22:17,111 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Assigning
E_MP_DAY_READ_201409,120002612313_20140905,1410841287079.a44dcd40eb7630f4fb7068a2f0e85c7d.
to b05.jsepc.com,60020,1412730931854
2014-10-09 18:22:17,111 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Assigning
E_MP_DAY_READ_201409,120003646179_20140901,1411020938380.f40a6a7a8a43c917129b4bfd6aa54849.
to b05.jsepc.com,60020,1412730931854


2014-10-09 18:22:19,278 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Handled SPLIT event;
parent=E_MP_DAY_READ_201409,120001002849_20140908,1411888886958.b1ac85daaa28425d84eee722c19d96ee.,
daughter
a=E_MP_DAY_READ_201409,120001002849_20140908,1412850138530.5df33c2d0835267b54e727b16f4dca7f.,
daughter
b=E_MP_DAY_READ_201409,120001207828_20140910,1412850138530.43523468dde2c59ea3f5185c7b708996.,
on b07.jsepc.com,60020,1412730931454
2014-10-09 18:25:59,382 INFO org.apache.hadoop.hbase.catalog.MetaEditor:
Deleted
E_MP_DAY_READ_201409,120001002849_20140908,1411888886958.b1ac85daaa28425d84eee722c19d96ee.
2014-10-09 18:25:59,506 INFO org.apache.hadoop.hbase.catalog.MetaEditor:
Deleted
E_MP_DAY_READ_201409,140000443741_20140905,1412060002495.9c0de96f038feca5f3044edd66ca14d5.
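
Entries like the SPLIT and "Deleted" lines above can be collected from the
master log to see which regions were split and which were removed from
meta. A minimal sketch, assuming each log entry is on a single line (the
wrapping above comes from the mail client) and that the entries have
exactly the shape shown above; the regular expressions and helper names
are illustrative, not part of HBase:

```python
import re

# Assumed entry shapes, copied from the log excerpt above.
SPLIT = re.compile(
    r"Handled SPLIT event; parent=(\S+),\s*daughter a=(\S+),"
    r"\s*daughter b=(\S+),\s*on\s"
)
DELETED = re.compile(r"MetaEditor: Deleted (\S+)")


def encoded_name(region_name):
    """Return the trailing hash of 'table,startkey,timestamp.hash.'."""
    return region_name.rstrip(".").rsplit(".", 1)[-1]


def summarize(log_lines):
    """Return ({parent: (daughter_a, daughter_b)}, {deleted regions})."""
    splits = {}
    deleted = set()
    for line in log_lines:
        match = SPLIT.search(line)
        if match:
            parent, a, b = (encoded_name(name) for name in match.groups())
            splits[parent] = (a, b)
            continue
        match = DELETED.search(line)
        if match:
            deleted.add(encoded_name(match.group(1)))
    return splits, deleted


# Two of the entries above, joined back onto single lines.
log_lines = [
    "2014-10-09 18:22:19,278 INFO "
    "org.apache.hadoop.hbase.master.AssignmentManager: Handled SPLIT event; "
    "parent=E_MP_DAY_READ_201409,120001002849_20140908,1411888886958."
    "b1ac85daaa28425d84eee722c19d96ee., daughter "
    "a=E_MP_DAY_READ_201409,120001002849_20140908,1412850138530."
    "5df33c2d0835267b54e727b16f4dca7f., daughter "
    "b=E_MP_DAY_READ_201409,120001207828_20140910,1412850138530."
    "43523468dde2c59ea3f5185c7b708996., on b07.jsepc.com,60020,1412730931454",
    "2014-10-09 18:25:59,382 INFO org.apache.hadoop.hbase.catalog.MetaEditor: "
    "Deleted E_MP_DAY_READ_201409,120001002849_20140908,1411888886958."
    "b1ac85daaa28425d84eee722c19d96ee.",
]

splits, deleted = summarize(log_lines)
print(splits)
print(deleted)
```

A deleted region that also appears as a split parent was removed from meta
after its split, which is different from a region that simply went missing.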



I cannot see what caused the loss of the 4 regions.
Does anybody know the reason, and how to recover the table?


Thanks

Re: Four regions were lost and how to recover ?

Posted by Tao Xiao <xi...@gmail.com>.
The HBase version is 0.96.1.1.

I searched the master logs using the command "cat <master.log> | grep
E_MP_DAY_READ_201409 | grep Error" but did not find any errors w.r.t.
regions of E_MP_DAY_READ_201409.

My colleague triggered a split and a compaction on that table before I ran
"hbase hbck E_MP_DAY_READ_201409", and hbck reported that the table was
consistent.

After the split and compaction, the number of regions shown in the HBase
UI is the same as the result of "dfs -ls
/hbase/data/default/E_MP_DAY_READ_201409". It seems that the table's lost
regions were all recovered by the split and compaction.

Re: Four regions were lost and how to recover ?

Posted by Ted Yu <yu...@gmail.com>.
Which hbase release are you using ?

Is there any error in master log w.r.t. regions of E_MP_DAY_READ_201409 ?

Have you run hbck ?

Cheers


Re: Four regions were lost and how to recover ?

Posted by Stack <st...@duboce.net>.
Listing in HDFS does not always correlate with the list of regions HBase
shows in the UI.  On split, the parent regions may stick around a while
until all references to them are undone.
St.Ack
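
One way to check this explanation on a cluster is to diff the region
directories found in HDFS against the regions listed in the UI, then see
which of the extras are known split parents. A minimal sketch; the three
inputs are assumed to have been collected by hand (for example from the
HDFS listing, the UI, and "Handled SPLIT event" lines in the master log),
and the short names in the example are placeholders, not real encoded
region names:

```python
def explain_extra_dirs(hdfs_dirs, online_regions, split_parents):
    """Classify region directories that exist in HDFS but are not online.

    All three arguments are iterables of encoded region names.
    """
    extra = set(hdfs_dirs) - set(online_regions)
    parents = set(split_parents)
    return {
        # Leftover parents of splits, expected to be cleaned up later.
        "split_parents": sorted(extra & parents),
        # Directories that the split history does not account for.
        "unexplained": sorted(extra - parents),
    }


# Placeholder names for illustration only.
result = explain_extra_dirs(
    hdfs_dirs=["r1", "r2", "r3", "p1", "x1"],
    online_regions=["r1", "r2", "r3"],
    split_parents=["p1"],
)
print(result)
```

Only the "unexplained" entries would then need a closer look, for example
with hbck.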
