You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org> on 2010/03/02 20:59:27 UTC

[jira] Commented: (HBASE-2244) META gets inconsistent in a number of crash scenarios

    [ https://issues.apache.org/jira/browse/HBASE-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840302#action_12840302 ] 

Kannan Muthukkaruppan commented on HBASE-2244:
----------------------------------------------

Stack: The changes look very good. Like all the helpful comments as well.

Some small comments on the patch:

Previously, the first time hasReferences() finds that there are no references
from a daughter region, it would delete the corresponding "info:splitA or B" row
from the parent. On subsequent calls, hasReferences would bail early because "split"
would be null.

Now, the deletion of the daughter column (info:splitA or B) in the parent row
happens in checkDaughter() (when it calls removeDaughterFromParent()). hasReferences()
will keep returning false for a daughter that no longer has references, and it seems
we'll unnecessarily be calling removeDaughterFromParent() on more than one
occasion. Delete of a non-existent column should be a no-op-- so this is probably
not a major issue, but it would be good to avoid introducing these wasteful deletes.

Minor/Cosmetic:
~~~~~~~~~~~~~~~

* HRegion.java:createDaughterDirInSplitDir()

Doesn't actually create the daughter dir, but only constructs a path
to the daughter dir, correct? Perhaps the function could be name
getSplitDirForDaughter() or something.

* typo: BaseScanner.java
 * the filesystem, then a daughters was not added __o__ .META. -- must have been

* typo: HRegion.java
      // __Crate__ a region instance and then move the splits into place under

> META gets inconsistent in a number of crash scenarios
> -----------------------------------------------------
>
>                 Key: HBASE-2244
>                 URL: https://issues.apache.org/jira/browse/HBASE-2244
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.20.4
>
>         Attachments: 2244-v3.patch, 2244-v5.patch, 2244.patch
>
>
> (Forking this issue off from HBASE-2235).
> During load testing, in a number of failure scenarios (unexpected region server deaths) etc., we notice that META can get inconsistent. This primarily happens for regions which are in the process of being split. Manually running add_table.rb seems to fix the tables meta data just fine. 
> But it would be good to do automatic cleansing (as part of META scanners work) and/or avoid these inconsistent states altogether.
> For example, for a particular startkey, I see all these entries:
> {code}
> test1,1204765,1266569946560 column=info:regioninfo, timestamp=1266581302018, value=REGION => {NAME => 'test1,
>                              1204765,1266569946560', STARTKEY => '1204765', ENDKEY => '1441091', ENCODED => 18
>                              19368969, OFFLINE => true, SPLIT => true, TABLE => {{NAME => 'test1', FAMILIES =>
>                               [{NAME => 'actions', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647'
>                              , BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
>  test1,1204765,1266569946560 column=info:server, timestamp=1266570029133, value=10.129.68.212:60020
>  test1,1204765,1266569946560 column=info:serverstartcode, timestamp=1266570029133, value=1266562597546
>  test1,1204765,1266569946560 column=info:splitB, timestamp=1266581302018, value=\x00\x071441091\x00\x00\x00\x0
>                              1\x26\xE6\x1F\xDF\x27\x1Btest1,1290703,1266581233447\x00\x071290703\x00\x00\x00\x
>                              05\x05test1\x00\x00\x00\x00\x00\x02\x00\x00\x00\x07IS_ROOT\x00\x00\x00\x05false\x
>                              00\x00\x00\x07IS_META\x00\x00\x00\x05false\x00\x00\x00\x01\x07\x07actions\x00\x00
>                              \x00\x07\x00\x00\x00\x0BBLOOMFILTER\x00\x00\x00\x05false\x00\x00\x00\x0BCOMPRESSI
>                              ON\x00\x00\x00\x04NONE\x00\x00\x00\x08VERSIONS\x00\x00\x00\x013\x00\x00\x00\x03TT
>                              L\x00\x00\x00\x0A2147483647\x00\x00\x00\x09BLOCKSIZE\x00\x00\x00\x0565536\x00\x00
>                              \x00\x09IN_MEMORY\x00\x00\x00\x05false\x00\x00\x00\x0ABLOCKCACHE\x00\x00\x00\x04t
>                              rueh\x0FQ\xCF
>  test1,1204765,1266581233447 column=info:regioninfo, timestamp=1266609172177, value=REGION => {NAME => 'test1,
>                              1204765,1266581233447', STARTKEY => '1204765', ENDKEY => '1290703', ENCODED => 13
>                              73493090, OFFLINE => true, SPLIT => true, TABLE => {{NAME => 'test1', FAMILIES =>
>                               [{NAME => 'actions', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647'
>                              , BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
>  test1,1204765,1266581233447 column=info:server, timestamp=1266604768670, value=10.129.68.213:60020
>  test1,1204765,1266581233447 column=info:serverstartcode, timestamp=1266604768670, value=1266562597511
>  test1,1204765,1266581233447 column=info:splitA, timestamp=1266609172177, value=\x00\x071226169\x00\x00\x00\x0
>                              1\x26\xE7\xCA,\x7D\x1Btest1,1204765,1266609171581\x00\x071204765\x00\x00\x00\x05\
>                              x05test1\x00\x00\x00\x00\x00\x02\x00\x00\x00\x07IS_ROOT\x00\x00\x00\x05false\x00\
>                              x00\x00\x07IS_META\x00\x00\x00\x05false\x00\x00\x00\x01\x07\x07actions\x00\x00\x0
>                              0\x07\x00\x00\x00\x0BBLOOMFILTER\x00\x00\x00\x05false\x00\x00\x00\x0BCOMPRESSION\
>                              x00\x00\x00\x04NONE\x00\x00\x00\x08VERSIONS\x00\x00\x00\x013\x00\x00\x00\x03TTL\x
>                              00\x00\x00\x0A2147483647\x00\x00\x00\x09BLOCKSIZE\x00\x00\x00\x0565536\x00\x00\x0
>                              0\x09IN_MEMORY\x00\x00\x00\x05false\x00\x00\x00\x0ABLOCKCACHE\x00\x00\x00\x04true
>                              \xB9\xBD\xFEO
>  test1,1204765,1266581233447 column=info:splitB, timestamp=1266609172177, value=\x00\x071290703\x00\x00\x00\x0
>                              1\x26\xE7\xCA,\x7D\x1Btest1,1226169,1266609171581\x00\x071226169\x00\x00\x00\x05\
>                              x05test1\x00\x00\x00\x00\x00\x02\x00\x00\x00\x07IS_ROOT\x00\x00\x00\x05false\x00\
>                              x00\x00\x07IS_META\x00\x00\x00\x05false\x00\x00\x00\x01\x07\x07actions\x00\x00\x0
>                              0\x07\x00\x00\x00\x0BBLOOMFILTER\x00\x00\x00\x05false\x00\x00\x00\x0BCOMPRESSION\
>                              x00\x00\x00\x04NONE\x00\x00\x00\x08VERSIONS\x00\x00\x00\x013\x00\x00\x00\x03TTL\x
>                              00\x00\x00\x0A2147483647\x00\x00\x00\x09BLOCKSIZE\x00\x00\x00\x0565536\x00\x00\x0
>                              0\x09IN_MEMORY\x00\x00\x00\x05false\x00\x00\x00\x0ABLOCKCACHE\x00\x00\x00\x04true
>                              \xE1\xDF\xF8p
>  test1,1204765,1266609171581 column=info:regioninfo, timestamp=1266609172212, value=REGION => {NAME => 'test1,
>                              1204765,1266609171581', STARTKEY => '1204765', ENDKEY => '1226169', ENCODED => 21
>                              34878372, TABLE => {{NAME => 'test1', FAMILIES => [{NAME => 'actions', VERSIONS =
>                              > '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMOR
>                              Y => 'false', BLOCKCACHE => 'true'}]}}
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.