Posted to issues@hbase.apache.org by "feng xu (JIRA)" <ji...@apache.org> on 2011/07/14 10:40:00 UTC

[jira] [Created] (HBASE-4094) improve hbck tool to fix more hbase problem

improve hbck tool to fix more hbase problem
-------------------------------------------

                 Key: HBASE-4094
                 URL: https://issues.apache.org/jira/browse/HBASE-4094
             Project: HBase
          Issue Type: New Feature
          Components: master
    Affects Versions: 0.90.3
            Reporter: feng xu
             Fix For: 0.90.5




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "feng xu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

feng xu updated HBASE-4094:
---------------------------

    Attachment: HBaseFsck.patch

Check the table key chain based only on the META info. If a region is not deployed on any regionserver, we can delete it from META with the hbase shell. This leaves a hole in the chain, which we can fix by reading the regioninfo from HDFS or by creating a new region.
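
To make the chain check above concrete, here is a minimal sketch of finding holes in a table's region chain. It is not taken from the attached patch. The class and method names (RegionChainCheck, Range, findHoles) are hypothetical, and plain strings stand in for HBase's byte[] row keys; an empty string means "unbounded" on that side, as with the first region's start key and the last region's end key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: string keys stand in for HBase's byte[] row keys. */
final class RegionChainCheck {

  /** A half-open key range [start, end); "" means unbounded on that side. */
  static final class Range {
    final String start;
    final String end;
    Range(String start, String end) { this.start = start; this.end = end; }
    @Override public String toString() { return "[" + start + "," + end + ")"; }
  }

  /** Returns the key ranges of one table that no region covers. */
  static List<Range> findHoles(List<Range> regions) {
    List<Range> sorted = new ArrayList<Range>(regions);
    Collections.sort(sorted, new Comparator<Range>() {
      @Override public int compare(Range a, Range b) { return a.start.compareTo(b.start); }
    });
    List<Range> holes = new ArrayList<Range>();
    String covered = "";         // every key below this one is covered so far
    boolean reachedEnd = false;  // set once a region with an empty end key is seen
    for (Range r : sorted) {
      if (reachedEnd) {
        break;
      }
      if (r.start.compareTo(covered) > 0) {
        holes.add(new Range(covered, r.start));  // gap before this region
      }
      if (r.end.isEmpty()) {
        reachedEnd = true;
      } else if (r.end.compareTo(covered) > 0) {
        covered = r.end;
      }
    }
    if (!reachedEnd) {
      holes.add(new Range(covered, ""));  // the last end key is not empty
    }
    return holes;
  }

  public static void main(String[] args) {
    // Two regions, ["", b) and [c, ""), leave the range [b, c) uncovered.
    System.out.println(findHoles(Arrays.asList(new Range("", "b"), new Range("c", ""))));
  }
}
```

In this sketch a non-empty first start key and a non-empty last end key are both reported as holes, so one repair path could handle all three cases.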

> improve hbck tool to fix more hbase problem
> -------------------------------------------
>
>                 Key: HBASE-4094
>                 URL: https://issues.apache.org/jira/browse/HBASE-4094
>             Project: HBase
>          Issue Type: New Feature
>          Components: master
>    Affects Versions: 0.90.3
>            Reporter: feng xu
>             Fix For: 0.90.5
>
>         Attachments: HBaseFsck.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>



        

[jira] [Updated] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "feng xu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

feng xu updated HBASE-4094:
---------------------------

    Attachment:     (was: HBaseFsck.patch)


        

[jira] [Commented] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "Jieshan Bean (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067607#comment-13067607 ] 

Jieshan Bean commented on HBASE-4094:
-------------------------------------

There are many failure scenarios listed above, and most of them make sense to me.
I have one question about the patch: does it fix just one of those scenarios?


        

[jira] [Resolved] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "Jonathan Hsieh (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh resolved HBASE-4094.
-----------------------------------

      Resolution: Duplicate
    Release Note:   (was: The hbck tool (org.apache.hadoop.hbase.util.HBaseFsck) can check and repair consistency problems.
Some errors are only detected, with no way provided to repair them. I plan to fix these with other tools (close_region...) or with new methods.
First, I list them here so we can discuss whether the approach is right.

Part A: check meta info
1. errors.reportError(ERROR_CODE.NULL_ROOT_REGION, "Root Region or some of its attributes are null.");
         ------> After deleting the root table, I ran the hbck tool to check, but the tool itself failed with an error. How can this error be reproduced?

2. errors.reportError(ERROR_CODE.NO_META_REGION, ".META. is not found on any region.");
         ------> After deleting the meta table, I ran the hbck tool to check, but the tool itself failed with an error. How can this error be reproduced?

3. errors.reportError(ERROR_CODE.MULTI_META_REGION, ".META. is found on more than one region.");
         ------> The logic: scan the root table to get the META table's regioninfo; if the META table has more than one region, report the error.
                 Does HBase allow the META table to have more than one region?

Part B: check consistency
4. ERROR_CODE.NOT_IN_META_HDFS ----> close it on the regionserver.

5. ERROR_CODE.NOT_IN_META_OR_DEPLOYED ----> do nothing; it may be used to fix the chain hole in Part C.

6. ERROR_CODE.NOT_IN_META ----> close it on the regionserver.

7. ERROR_CODE.NOT_IN_HDFS_OR_DEPLOYED ----> delete it from the META table. This creates a chain hole, which is fixed during the chain integrity check (Part C).

8. ERROR_CODE.NOT_IN_HDFS ----> delete it from the META table and close it on the regionserver; the resulting hole is fixed during the chain integrity check (Part C).

9. ERROR_CODE.NOT_DEPLOYED ----> assign it.

10. ERROR_CODE.SHOULD_NOT_BE_DEPLOYED ----> delete it from the META table and close it on the regionserver.

11. ERROR_CODE.MULTI_DEPLOYED ----> close it on all regionservers, then reassign it.

12. ERROR_CODE.SERVER_DOES_NOT_MATCH_META ----> close it on all regionservers, then reassign it.

Part C: check chain integrity
13. ERROR_CODE.FIRST_REGION_STARTKEY_NOT_EMPTY ----> treat it as a hole problem (ERROR_CODE.HOLE_IN_REGION_CHAIN).

14. ERROR_CODE.LAST_REGION_ENDKEY_NOT_EMPTY (newly added) ----> treat it as a hole problem (ERROR_CODE.HOLE_IN_REGION_CHAIN).

15. ERROR_CODE.REGION_CYCLE ----> shut down the cluster and merge the two regions with the merge tool (org.apache.hadoop.hbase.util.Merge).

16. ERROR_CODE.DUPE_STARTKEYS ----> shut down the cluster and merge the two regions with the merge tool (org.apache.hadoop.hbase.util.Merge).

17. ERROR_CODE.OVERLAP_IN_REGION_CHAIN ----> shut down the cluster and merge the two regions with the merge tool (org.apache.hadoop.hbase.util.Merge).

18. ERROR_CODE.HOLE_IN_REGION_CHAIN ----> write a new method to fix it. The logic: to recover the data, collect the regioninfo from the regionservers and HDFS. If a region's key range overlaps the hole range, put it in the META table and assign it. This may create an overlap problem, which can be fixed with the merge tool. If no region is collected, create a new region covering the hole's key range.)

Changed to duplicate.
                

        

[jira] [Updated] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4094:
------------------------------------------

    Fix Version/s:     (was: 0.90.6)
                   0.90.7

Moving to 0.90.7.  HBASE-5128 is also related to improving the hbck tool.
                

        

[jira] [Commented] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "feng xu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068098#comment-13068098 ] 

feng xu commented on HBASE-4094:
--------------------------------

Some failure scenarios (ERROR_CODE.MULTI_DEPLOYED, ERROR_CODE.NOT_DEPLOYED, ...) can already be fixed by the hbck tool,
but for many others, such as the hole problem, hbck does not provide a way to fix them.
 
This patch (HbaseFsck_TableChain.patch) checks for holes in the table chain in META; next I plan to fix the hole problem.

>>18. ERROR_CODE.HOLE_IN_REGION_CHAIN ----> write a new method to fix it. The logic: to recover the data, collect the regioninfo from the regionservers and HDFS. If a region's key range overlaps the hole range, put it in the META table and assign it. This may create an overlap problem, which can be fixed with the merge tool. If no region is collected, create a new region covering the hole's key range.



        

[jira] [Commented] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "Anoop Sam John (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253125#comment-13253125 ] 

Anoop Sam John commented on HBASE-4094:
---------------------------------------

This may be closed as a duplicate, since HBASE-5128 now handles these valid scenarios.

{quote}
14. ERROR_CODE.LAST_REGION_ENDKEY_NOT_EMPTY (newly added) ----> treat it as a hole problem (ERROR_CODE.HOLE_IN_REGION_CHAIN).
{quote}
This check is not there now, but another issue, HBASE-4379, covers it.
                

        

[jira] [Reopened] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "Jonathan Hsieh (Reopened) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh reopened HBASE-4094:
-----------------------------------

    

        

[jira] [Updated] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "feng xu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

feng xu updated HBASE-4094:
---------------------------

    Status: Open  (was: Patch Available)


        

[jira] [Updated] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "feng xu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

feng xu updated HBASE-4094:
---------------------------

    Attachment: HbaseFsck_TableChain.patch


        

[jira] [Updated] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "Jonathan Hsieh (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-4094:
----------------------------------

    Fix Version/s:     (was: 0.90.7)

Cleaned up jira to follow convention.  Marked as duplicate of HBASE-5128
                

        

[jira] [Updated] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "Jonathan Hsieh (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-4094:
----------------------------------

    Description: 
The hbck tool (org.apache.hadoop.hbase.util.HBaseFsck) can check and repair consistency problems.
Some errors are only detected, with no way provided to repair them. I plan to fix these with other tools (close_region...) or with new methods.
First, I list them here so we can discuss whether the approach is right.

Part A: check meta info
1. errors.reportError(ERROR_CODE.NULL_ROOT_REGION, "Root Region or some of its attributes are null.");
         ------> After deleting the root table, I ran the hbck tool to check, but the tool itself failed with an error. How can this error be reproduced?

2. errors.reportError(ERROR_CODE.NO_META_REGION, ".META. is not found on any region.");
         ------> After deleting the meta table, I ran the hbck tool to check, but the tool itself failed with an error. How can this error be reproduced?

3. errors.reportError(ERROR_CODE.MULTI_META_REGION, ".META. is found on more than one region.");
         ------> The logic: scan the root table to get the META table's regioninfo; if the META table has more than one region, report the error.
                 Does HBase allow the META table to have more than one region?

Part B: check consistency
4. ERROR_CODE.NOT_IN_META_HDFS ----> close it on the regionserver.

5. ERROR_CODE.NOT_IN_META_OR_DEPLOYED ----> do nothing; it may be used to fix the chain hole in Part C.

6. ERROR_CODE.NOT_IN_META ----> close it on the regionserver.

7. ERROR_CODE.NOT_IN_HDFS_OR_DEPLOYED ----> delete it from the META table. This creates a chain hole, which is fixed during the chain integrity check (Part C).

8. ERROR_CODE.NOT_IN_HDFS ----> delete it from the META table and close it on the regionserver; the resulting hole is fixed during the chain integrity check (Part C).

9. ERROR_CODE.NOT_DEPLOYED ----> assign it.

10. ERROR_CODE.SHOULD_NOT_BE_DEPLOYED ----> delete it from the META table and close it on the regionserver.

11. ERROR_CODE.MULTI_DEPLOYED ----> close it on all regionservers, then reassign it.

12. ERROR_CODE.SERVER_DOES_NOT_MATCH_META ----> close it on all regionservers, then reassign it.

Part C: check chain integrity
13. ERROR_CODE.FIRST_REGION_STARTKEY_NOT_EMPTY ----> treat it as a hole problem (ERROR_CODE.HOLE_IN_REGION_CHAIN).

14. ERROR_CODE.LAST_REGION_ENDKEY_NOT_EMPTY (newly added) ----> treat it as a hole problem (ERROR_CODE.HOLE_IN_REGION_CHAIN).

15. ERROR_CODE.REGION_CYCLE ----> shut down the cluster and merge the two regions with the merge tool (org.apache.hadoop.hbase.util.Merge).

16. ERROR_CODE.DUPE_STARTKEYS ----> shut down the cluster and merge the two regions with the merge tool (org.apache.hadoop.hbase.util.Merge).

17. ERROR_CODE.OVERLAP_IN_REGION_CHAIN ----> shut down the cluster and merge the two regions with the merge tool (org.apache.hadoop.hbase.util.Merge).

18. ERROR_CODE.HOLE_IN_REGION_CHAIN ----> write a new method to fix it. The logic: to recover the data, collect the regioninfo from the regionservers and HDFS. If a region's key range overlaps the hole range, put it in the META table and assign it. This may create an overlap problem, which can be fixed with the merge tool. If no region is collected, create a new region covering the hole's key range.
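
The repairs proposed in Parts B and C can be read as a lookup table from error code to an ordered list of steps. The sketch below only restates that proposal as data; it is not how HBaseFsck is implemented. The error code names come from the list above, while RepairPlan, Step, and the step names are hypothetical and are not HBase identifiers.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: restates the proposed repairs as a lookup table. */
final class RepairPlan {

  /** Error codes named in Parts B and C of the description. */
  enum ErrorCode {
    NOT_IN_META_HDFS, NOT_IN_META_OR_DEPLOYED, NOT_IN_META, NOT_IN_HDFS_OR_DEPLOYED,
    NOT_IN_HDFS, NOT_DEPLOYED, SHOULD_NOT_BE_DEPLOYED, MULTI_DEPLOYED,
    SERVER_DOES_NOT_MATCH_META, FIRST_REGION_STARTKEY_NOT_EMPTY,
    LAST_REGION_ENDKEY_NOT_EMPTY, REGION_CYCLE, DUPE_STARTKEYS,
    OVERLAP_IN_REGION_CHAIN, HOLE_IN_REGION_CHAIN
  }

  /** Hypothetical step names chosen for this sketch. */
  enum Step { CLOSE_ON_REGIONSERVER, DELETE_FROM_META, ASSIGN, FIX_HOLE, OFFLINE_MERGE }

  private static final Map<ErrorCode, List<Step>> PLAN =
      new EnumMap<ErrorCode, List<Step>>(ErrorCode.class);

  private static void plan(ErrorCode code, Step... steps) {
    PLAN.put(code, Collections.unmodifiableList(Arrays.asList(steps)));
  }

  static {
    // Part B: consistency
    plan(ErrorCode.NOT_IN_META_HDFS, Step.CLOSE_ON_REGIONSERVER);
    plan(ErrorCode.NOT_IN_META_OR_DEPLOYED);  // do nothing
    plan(ErrorCode.NOT_IN_META, Step.CLOSE_ON_REGIONSERVER);
    plan(ErrorCode.NOT_IN_HDFS_OR_DEPLOYED, Step.DELETE_FROM_META, Step.FIX_HOLE);
    plan(ErrorCode.NOT_IN_HDFS, Step.DELETE_FROM_META, Step.CLOSE_ON_REGIONSERVER, Step.FIX_HOLE);
    plan(ErrorCode.NOT_DEPLOYED, Step.ASSIGN);
    plan(ErrorCode.SHOULD_NOT_BE_DEPLOYED, Step.DELETE_FROM_META, Step.CLOSE_ON_REGIONSERVER);
    plan(ErrorCode.MULTI_DEPLOYED, Step.CLOSE_ON_REGIONSERVER, Step.ASSIGN);
    plan(ErrorCode.SERVER_DOES_NOT_MATCH_META, Step.CLOSE_ON_REGIONSERVER, Step.ASSIGN);
    // Part C: chain integrity
    plan(ErrorCode.FIRST_REGION_STARTKEY_NOT_EMPTY, Step.FIX_HOLE);
    plan(ErrorCode.LAST_REGION_ENDKEY_NOT_EMPTY, Step.FIX_HOLE);
    plan(ErrorCode.REGION_CYCLE, Step.OFFLINE_MERGE);
    plan(ErrorCode.DUPE_STARTKEYS, Step.OFFLINE_MERGE);
    plan(ErrorCode.OVERLAP_IN_REGION_CHAIN, Step.OFFLINE_MERGE);
    plan(ErrorCode.HOLE_IN_REGION_CHAIN, Step.FIX_HOLE);
  }

  /** Returns the proposed steps, in order; an empty list means "do nothing". */
  static List<Step> stepsFor(ErrorCode code) {
    return PLAN.get(code);
  }
}
```

Keeping the proposal as a table makes it easy to see which codes share a repair, for example that items 13, 14 and 18 all end in the same hole fix.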
    

        

[jira] [Updated] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "feng xu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

feng xu updated HBASE-4094:
---------------------------

    Status: Patch Available  (was: Open)


        

[jira] [Commented] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068699#comment-13068699 ] 

stack commented on HBASE-4094:
------------------------------

@Feng Thank you for digging in on this important issue.  As Jieshan asks, are you going to fix all the issues not already addressed by --fix listed above?  If so, do you think it best to do it one per issue (if you'd rather piecemeal it) or do you want to make a monster patch to do them all in this issue?

I looked at your patch and it seems to comment out a line of code and add a comment.

Regarding 18 above, when you say collect the regioninfo from regionserver and hdfs, what do you mean?  What if the region is not on a regionserver?  If it's in the filesystem, how will you find it?  You only have the gap in the table, and from there you need to get to the encoded name of the region in the filesystem.  Are you thinking of looking at all the regions in the filesystem, getting all of their .regioninfos, and then checking which one has a start and stop key that matches the hole?

If this fails, yes, create a new region to bridge the hole.
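
As a rough illustration of the filesystem scan asked about above, the sketch below lists the region directories under a table directory that contain a .regioninfo file. It assumes each region lives in its own directory named by its encoded name, and it uses java.nio on a local path as a stand-in for HDFS; a real tool would go through the Hadoop FileSystem API. RegionDirScan and regionsWithInfo are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only: a local directory stands in for the table directory in HDFS. */
final class RegionDirScan {

  /** Returns the names of sub-directories of tableDir that hold a .regioninfo file. */
  static List<String> regionsWithInfo(Path tableDir) throws IOException {
    List<String> found = new ArrayList<String>();
    try (DirectoryStream<Path> dirs = Files.newDirectoryStream(tableDir)) {
      for (Path dir : dirs) {
        if (Files.isDirectory(dir) && Files.isRegularFile(dir.resolve(".regioninfo"))) {
          found.add(dir.getFileName().toString());
        }
      }
    }
    Collections.sort(found);  // stable order for callers and logs
    return found;
  }
}
```

Each name returned would then need its .regioninfo parsed to obtain the start and end keys to compare against the hole.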

Do you think this issue has overlap with HBASE-4058 Feng?

Thanks.



        

[jira] [Commented] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "feng xu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068899#comment-13068899 ] 

feng xu commented on HBASE-4094:
--------------------------------

>>Regards 18 above, when you say collect the regioninfo from regionserver and hdfs, what do you mean? 
The regions that belong in the hole may be deployed on a regionserver; if not, they may be in HDFS. Finding them lets us fix the hole and recover its data.
 
>>What if the region is not on a regionserver? If its in the filesystem, how will you find it? 
The hbck tool checks both the regionservers (WorkItemRegion()) and HDFS (WorkItemHdfsDir()). Each region found on a regionserver or in HDFS is stored and
tagged with where it came from, using the enum INFO_FROM. When checking the table chain, we can refer to these regions to fix the hole.

>>Are you thinking of looking at all the regions in the filesystem and getting all of their .regioninfos 
Yes, but only for the regions from the regionservers or the filesystem that are not registered in the META table.

>>then checking for which has a start and stop key that matches the hole?
A hole may need several regions to fill it. In my patch, if a region from a regionserver or the filesystem has a key range that overlaps the hole,
I use it to fix the hole. I know this can create an overlap problem in the META table, but it recovers the hole's data, and we can fix the overlap
with the merge tool. Is that right?

>>Do you think this issue has overlap with HBASE-4058 Feng?
Yes, this issue is also related to using the hbck tool to fix cluster problems.


I have filed another issue, HBASE-4122, about how to fix the chain hole problem, and submitted a patch.
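
The selection rule described in the answers above can be sketched as follows. This is not the code from the patch. HoleFiller, Region, InfoFrom and its constants are hypothetical stand-ins (only the enum name INFO_FROM appears in this thread), and string keys replace byte[] row keys, with an empty end key meaning "end of table".

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: picks regions that could fill a hole in the chain. */
final class HoleFiller {

  /** Where a region was discovered; constant names are assumed. */
  enum InfoFrom { REGIONSERVER, HDFS }

  static final class Region {
    final String name;
    final String start;
    final String end;
    final InfoFrom from;
    final boolean inMeta;
    Region(String name, String start, String end, InfoFrom from, boolean inMeta) {
      this.name = name;
      this.start = start;
      this.end = end;
      this.from = from;
      this.inMeta = inMeta;
    }
  }

  /** True if half-open ranges [aStart, aEnd) and [bStart, bEnd) share any key. */
  static boolean overlaps(String aStart, String aEnd, String bStart, String bEnd) {
    // An empty end key is treated as "greater than every key".
    boolean aStartsBeforeBEnds = bEnd.isEmpty() || aStart.compareTo(bEnd) < 0;
    boolean bStartsBeforeAEnds = aEnd.isEmpty() || bStart.compareTo(aEnd) < 0;
    return aStartsBeforeBEnds && bStartsBeforeAEnds;
  }

  /**
   * Returns regions not registered in META whose key range overlaps the hole.
   * An empty result means a new region covering the hole should be created.
   */
  static List<Region> candidatesFor(String holeStart, String holeEnd, List<Region> found) {
    List<Region> picked = new ArrayList<Region>();
    for (Region r : found) {
      if (!r.inMeta && overlaps(r.start, r.end, holeStart, holeEnd)) {
        picked.add(r);
      }
    }
    return picked;
  }
}
```

Because the test is overlap rather than an exact key match, a picked region may extend past the hole, which is the overlap problem that the merge tool is expected to clean up afterwards.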


        

[jira] [Resolved] (HBASE-4094) improve hbck tool to fix more hbase problem

Posted by "stack (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-4094.
--------------------------

    Resolution: Fixed

Resolving at Anoop's suggestion as dup of hbase-5128
                