Posted to common-dev@hadoop.apache.org by "stack (JIRA)" <ji...@apache.org> on 2007/07/27 21:49:52 UTC

[jira] Created: (HADOOP-1662) [hbase] Make region splits faster

[hbase] Make region splits faster
---------------------------------

                 Key: HADOOP-1662
                 URL: https://issues.apache.org/jira/browse/HADOOP-1662
             Project: Hadoop
          Issue Type: Improvement
          Components: contrib/hbase
            Reporter: stack
            Assignee: stack


HADOOP-1644 '[hbase] Compactions should take no longer than period between memcache flushes' is about making compactions run faster.  This issue is about making splits faster.  Currently splits are done by reading a map file as input and, per record, writing out two new mapfiles.  It's currently too slow: ~30 seconds to split 120MB.  Google hints in the Bigtable paper that splitting is very fast because they let the split children feed off the split parent.  Primitive testing has splitting mapfiles using raw streams running 3 to 4 times faster than splitting on mapfile keys.
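The "raw streams" approach mentioned above can be sketched roughly as follows.  This is a toy illustration, not the actual patch -- class name, buffer size, and structure are made up -- but it shows why moving bytes in bulk beats a per-record split: nothing is deserialized or re-serialized on the way through.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Toy illustration (not the actual patch) of a raw stream copy: bytes are
// moved in large chunks with no key/value parsing.  Buffer size is arbitrary.
public class RawCopySketch {

    // Copy every byte from in to out in 64KB chunks; returns bytes copied.
    static long rawCopy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[64 * 1024];
        long total = 0;
        int n;
        while ((n = in.read(buf)) > 0) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[256 * 1024];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = rawCopy(new ByteArrayInputStream(data), out);
        System.out.println(copied);  // 262144
    }
}
```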

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-1662) [hbase] Make region splits faster

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1662:
--------------------------

    Status: In Progress  (was: Patch Available)

Failed in a different dfs test this time: TestDFSStorageStateRecovery

> [hbase] Make region splits faster
> ---------------------------------
>
>                 Key: HADOOP-1662
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1662
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>         Attachments: fastsplits.patch, mapfile_split.patch, splits-2.patch, splits-v3.patch
>
>
> HADOOP-1644 '[hbase] Compactions should take no longer than period between memcache flushes' is about making compactions run faster.  This issue is about making splits faster.  Currently splits are done by reading a map file as input and, per record, writing out two new mapfiles.  It's currently too slow: ~30 seconds to split 120MB.  Google hints in the Bigtable paper that splitting is very fast because they let the split children feed off the split parent.  Primitive testing has splitting mapfiles using raw streams running 3 to 4 times faster than splitting on mapfile keys.



[jira] Updated: (HADOOP-1662) [hbase] Make region splits faster

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1662:
--------------------------

    Status: Patch Available  (was: In Progress)

Updated and reran local build.  All passes.  Resubmitting v3 of patch.




[jira] Commented: (HADOOP-1662) [hbase] Make region splits faster

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518327 ] 

Hadoop QA commented on HADOOP-1662:
-----------------------------------

-1, build or testing failed

2 attempts failed to build and test the latest attachment http://issues.apache.org/jira/secure/attachment/12363378/splits-v3.patch against trunk revision r563649.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/524/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/524/console

Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.




[jira] Updated: (HADOOP-1662) [hbase] Make region splits faster

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1662:
--------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Resolving. Was committed a while back.




Re: [jira] Commented: (HADOOP-1662) [hbase] Make region splits faster

Posted by Nigel Daley <nd...@yahoo-inc.com>.
On Aug 8, 2007, at 12:33 PM, Michael Stack wrote:

> Nigel Daley wrote:
>> Our unit tests seem very unstable right now, which is causing lots  
>> of problems for this patch process.  I'm considering turning off  
>> the unit testing portion of the process and asking the committers  
>> to run the unit test for each patch.  Thoughts?
> Is the box Hudson runs on OK?  Failures are erratic.  It's odd that
> I do not have the same issues building on two different linuxes --
> one a dual-processor w/ dual-cores and the other an old single-core
> K7 -- and macosx (at least with the current TRUNK with my patch
> applied).

The box is running many Solaris zones, thus the performance isn't great.

> I'm fine w/ disabling unit tests and having committers run the unit  
> tests themselves (Checkstyle, javadoc and findbugs would all still  
> run?).

Yes, the other checks would still run.  I'm now inclined not to do  
this as it removes the pressure to fix the underlying problems in the  
unit tests.  I'll start gathering frequencies of the failing unit tests.

Cheers,
Nige



Re: [jira] Commented: (HADOOP-1662) [hbase] Make region splits faster

Posted by Michael Stack <st...@duboce.net>.
Nigel Daley wrote:
> Our unit tests seem very unstable right now, which is causing lots of 
> problems for this patch process.  I'm considering turning off the unit 
> testing portion of the process and asking the committers to run the 
> unit test for each patch.  Thoughts?
Is the box Hudson runs on OK?  Failures are erratic.  It's odd that I do
not have the same issues building on two different linuxes -- one a
dual-processor w/ dual-cores and the other an old single-core K7 -- and
macosx (at least with the current TRUNK with my patch applied).

I'm fine w/ disabling unit tests and having committers run the unit 
tests themselves (Checkstyle, javadoc and findbugs would all still run?).
>
> I'll kill off the stuck test now.
Thanks Nigel (It'll probably succeed this time around).
St.Ack


Re: [jira] Commented: (HADOOP-1662) [hbase] Make region splits faster

Posted by Nigel Daley <nd...@yahoo-inc.com>.
Our unit tests seem very unstable right now, which is causing lots of  
problems for this patch process.  I'm considering turning off the  
unit testing portion of the process and asking the committers to run  
the unit test for each patch.  Thoughts?

I'll kill off the stuck test now.




Re: [jira] Commented: (HADOOP-1662) [hbase] Make region splits faster

Posted by Michael Stack <st...@duboce.net>.
Hudson is currently hung running the 3rd attempted build of this 
patch.   A restart would seem to be in order (On this 3rd run, we made 
it into the hbase unit tests but DFS reads are timing out).
St.Ack




[jira] Commented: (HADOOP-1662) [hbase] Make region splits faster

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518366 ] 

Hadoop QA commented on HADOOP-1662:
-----------------------------------

-1, build or testing failed

2 attempts failed to build and test the latest attachment http://issues.apache.org/jira/secure/attachment/12363378/splits-v3.patch against trunk revision r563649.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/528/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/528/console

Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.




[jira] Updated: (HADOOP-1662) [hbase] Make region splits faster

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1662:
--------------------------

    Status: In Progress  (was: Patch Available)

Hudson failed my patch because a DFS unit test -- org.apache.hadoop.dfs.TestCrcCorruption.testCrcCorruption -- failed (?)




[jira] Updated: (HADOOP-1662) [hbase] Make region splits faster

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1662:
--------------------------

    Attachment: splits-v3.patch

version 3

Improvements around recovery from catastrophic loss of ROOT and
META regions (Needed to make TestRegionServerAbort work reliably
after application of this splits patch).

M src/contrib/hbase/src/test/org/apache/hadoop/hbase/MiniHBaseCluster.java
    Added logging of abort, close and wait.  Also on abort/close
    was doing a remove that made it so subsequent wait had nothing to
    wait on.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLog.java
    Debug logging around split and edits.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    Added toString to each of the PendingOperation implementations.
    In the ShutdownPendingOperation scan of meta data, removed
    check of startcode (if the server name is that of the dead
    server, it needs reassigning even if start code is good).
    Also, if server name is null -- possible if we are missing
    edits off end of log -- then the region should be reassigned
    just in case its from the dead server.  Also, if reassigning,
    clear from pendingRegions.  Server may have died after sending
    region is up but before the server confirms receipt in the
    meta scan. Added more detail to each log.  In OpenPendingOperation
    we were trying to clear pendingRegion in the wrong place -- it was
    never executed (regions were always pending).
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Keying.java
    (intToBytes, longToBytes, getBytes, bytesToString, bytesToLong): Added.
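The Keying additions listed above are byte-conversion helpers.  A minimal, hypothetical sketch of what longToBytes/bytesToLong could look like follows -- the actual signatures and encoding in the patch may differ; this is just one straightforward big-endian encoding that round-trips:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of byte-conversion helpers like those added to
// Keying (longToBytes, bytesToLong); not the real HBase implementations.
public class KeyingSketch {

    // Encode a long as 8 big-endian bytes.
    static byte[] longToBytes(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Decode 8 big-endian bytes back into a long.
    static long bytesToLong(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }

    public static void main(String[] args) {
        long v = 1234567890123L;
        byte[] b = longToBytes(v);
        System.out.println(b.length);        // 8
        System.out.println(bytesToLong(b));  // 1234567890123
    }
}
```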




[jira] Commented: (HADOOP-1662) [hbase] Make region splits faster

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518532 ] 

Hadoop QA commented on HADOOP-1662:
-----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12363378/splits-v3.patch applied and successfully tested against trunk revision r563649.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/529/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/529/console




[jira] Updated: (HADOOP-1662) [hbase] Make region splits faster

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1662:
--------------------------

    Attachment: fastsplits.patch

First cut at fast splits.  Missing a cleanup thread and still undergoing testing.




[jira] Updated: (HADOOP-1662) [hbase] Make region splits faster

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1662:
--------------------------

    Attachment: mapfile_split.patch

Here is a split function done as a static in MapFile so it can get at the private MapFile index.  Includes a unit test.  Needs testing in a loaded hbase.
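The idea of splitting on the index rather than scanning records can be sketched as follows.  This is illustrative only -- the real MapFile index maps sampled keys to file positions, and the names here are made up -- but it shows why a midkey can be found without reading a single record:

```java
// Illustrative sketch: a MapFile keeps a small sorted sample of its keys
// (the index), so the middle entry of that sample approximates the file's
// median key.  These names are invented, not the MapFile internals.
public class MidkeySketch {

    // Pick the middle entry of the sampled index keys as the split point.
    static String midkeyFromIndex(String[] indexKeys) {
        if (indexKeys.length == 0) {
            throw new IllegalArgumentException("empty index");
        }
        return indexKeys[indexKeys.length / 2];
    }

    public static void main(String[] args) {
        // Pretend these keys were sampled from a large sorted MapFile.
        String[] index = {"aaa", "ggg", "mmm", "sss", "zzz"};
        System.out.println(midkeyFromIndex(index));  // mmm
    }
}
```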




[jira] Updated: (HADOOP-1662) [hbase] Make region splits faster

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1662:
--------------------------

    Status: Patch Available  (was: Open)

Playing some Hudson roulette.




[jira] Updated: (HADOOP-1662) [hbase] Make region splits faster

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1662:
--------------------------

    Attachment: splits-2.patch

Here is the commit message to go along w/ v2 of the patch:

HADOOP-1662 Make region splits faster
Splits are now near-instantaneous.  On split, daughter splits create
'references' to store files up in the parent region using new 'HalfMapFile'
class to proxy accesses against the top-half or bottom-half of        
backing MapFile.  Parent region is deleted after all references in daughter
regions have been let go.  

Below includes other cleanups and at least one bug fix for failures adding
records larger than 32k.

A src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestHStoreFile.java
    Added. Tests new Reference HStoreFiles. Tests the new HalfMapFileReader inner
    class of HStoreFile. Tests that we do the right thing when HStoreFiles
    are smaller than a MapFile index range (i.e. there is no 'midkey').
    Tests that we do the right thing when a key is outside of a HalfMapFile.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/HBaseTestCase.java
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestGet.java
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestScanner.java
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestTimestamp.java
    getHRegionDir moved from HStoreFile to HRegion.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestBatchUpdate.java
    Let out exception rather than catch and call 'fail'.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/MiniHBaseCluster.java
    Refactored so one can start and stop a MiniHBaseCluster without having
    to subclass this TestCase. Refactored methods in this class to use the
    newly added methods listed below.
    (MasterThread, RegionServerThread, startMaster, startRegionServers,
      shutdown): Added.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestSplit.java
    Added tests that assert all works properly at region level on
    multiple levels of splits and then do same on a cluster.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestHRegion.java
    Removed catch and 'fail()'.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreFile.java
    Javadoc to explain how split now works. Have constructors flow
    into each other rather than replicate setup per instance. Moved
    in here operations such as delete, rename, and length of store files
    (No need of clients to remember to delete map and info files).
    (REF_NAME_PARSER, Reference, HalfMapFile, isReference,
      writeReferenceFiles, writeSplitInfo, readSplitInfo,
      createOrFail, getReader, getWriter, toString): Added.
    (getMapDir, getMapFilePath, getInfoDir, getInfoFilePath): Added
    a bunch of overrides for reference handling.
    (loadHStoreFiles): Amended to load references off disk.
    (splitStoreFiles): Redone to write references into the daughter
    regions instead.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
    Rename maps as readers and mapFiles as storefiles.
    Moved BloomFilterReader and Writer into HStoreFile. Removed
    getMapFileReader and getMapFileWriter (They are in HStoreFile now).
    (getReaders): Added.
    (HStoreSize): Added.  Data structure to hold the aggregated size
    of all HStoreFiles in the HStore, the largest file, its midkey, and
    whether the HStore is splitable (it may not be if it holds references).
    Previously we considered only the largest file, which was less accurate.
    (getLargestFileSize): Renamed to size and redone to aggregate
    sizes, etc.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HColumnDescriptor.java
    Have constructors waterfall down through each other rather than
    repeat initializations.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMerge.java
    Use new HStoreSize structure.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java
    Added delayed remove of HRegion (Now done in HMaster as part of
    meta scan). Change LOG.error and LOG.warn so they throw stack trace
    instead of just the Exception.toString as message.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConstants.java
    (COLUMN_FAMILY_STR): Added.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLog.java
    Added why to log of splitting.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLogEdit.java
    Short is not big enough to hold edits that could contain a sizable
    web page.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HTable.java
    (getTableName): Added.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    Added a constructor to BaseScanner that takes the name of the table
    we're scanning (ROOT or META usually). Added handling of split regions
    to scanOneRegion.  Collect splits to check while scanning, then do the
    checks of daughter regions outside of the scan so we can modify the
    META table if needed and update on change of state.  Made LOG.warn and
    LOG.error print stack traces.
    (isSplitParent, cleanupSplits, hasReferences): Added.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInfo.java
    Add split boolean.  Output offline and split status in toString.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMemcache.java
    Comments.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
    Moved getRegionDir here from HStoreFile.
    (COL_SPLITA, COL_SPLITB): Added.
    (closeAndSplit): Refactored to use the new fast split method.
    (splitStoreFile): Moved into HStoreFile.
    (getSplitRegionDir, getSplitsDir, toString): Added.
    (needsSplit): Refactored to exploit new HStoreSize structure.
    Also manages notion of 'unsplitable' region.
    (largestHStore): Refactored.
    (removeSplitFromMETA, writeSplitToMETA, getSplit, hasReference): Added.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Keying.java
    (intToBytes, getBytes): Added.
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Writables.java
    Utility reading and writing Writables.

> [hbase] Make region splits faster
> ---------------------------------
>
>                 Key: HADOOP-1662
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1662
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>         Attachments: fastsplits.patch, mapfile_split.patch, splits-2.patch



[jira] Commented: (HADOOP-1662) [hbase] Make region splits faster

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516445 ] 

stack commented on HADOOP-1662:
-------------------------------

Here's a proposal for another split mechanism, one that should work even faster than the above mapfile split patch.

Currently on split, each of the parent region's store files is halved, with one daughter getting a copy of the top half of all store files and the other the bottom.  On completion of the division, the parent region is deleted and all references to it in the --META-- table are replaced by references to the daughters.  Divvying up the parent store files amongst its daughters takes too long.

The below proposes a split method inspired by the suggestive tail of the 'Exploiting Immutability' section of the Google Bigtable paper: "...the immutability of SSTables enables us to split tablets quickly. Instead of generating a new set of SSTables for each child tablet, we let the child tablets share the SSTables of the parent tablet."

Rather than copy the top and bottom halves of the parent's store files to new store files in the split's daughter regions, on region mitosis have the daughters keep references to the parent.  The references are undone at compaction time, when the parent stores they point to are rewritten into new files under the daughter region.

Detail:

On region split, no longer immediately delete the parent.  Instead move the parent to a directory named 'split.parents'.  Corralling split parents this way makes it easier to distinguish split parents from live regions.

Regions have subdirectories, one per column family.  On split, add to the parent a subdirectory at the same level as column families named in a manner illegal for column families: e.g. '.splits' or ':splits'.  Into this directory, write two empty files each named for the daughter regions so we have a means of relating parent to children.

Add a cleanup thread to HMaster that runs on a long period and looks at the content of 'split.parents'.  For each parent, per daughter, it looks to see if the daughter still references the parent (see later for how the cleanup thread detects references).  If not, it deletes the pertinent daughter file from '.splits'.  When both have been removed -- neither daughter holds references to the parent -- the HMaster cleanup thread removes the parent region.
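The cleanup pass described above can be sketched as follows.  This is an illustrative stand-in, not the actual HMaster code: the class and method names (SplitParentJanitor, addSplit, dropReference) are hypothetical, and simple in-memory sets stand in for scanning 'split.parents' and the daughters' marker files on the filesystem.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the HMaster cleanup pass over split parents.
public class SplitParentJanitor {
    // split parent region -> daughters that may still reference it
    private final Map<String, Set<String>> parents = new HashMap<>();
    // live "daughter->parent" reference edges (cleared when the daughter compacts)
    private final Set<String> references = new HashSet<>();

    public void addSplit(String parent, String daughterA, String daughterB) {
        parents.put(parent, new HashSet<>(List.of(daughterA, daughterB)));
        references.add(daughterA + "->" + parent);
        references.add(daughterB + "->" + parent);
    }

    // A daughter compaction rewrote the referenced stores into its own files.
    public void dropReference(String daughter, String parent) {
        references.remove(daughter + "->" + parent);
    }

    // One pass of the long-period cleanup thread: forget daughters that no
    // longer reference their parent; a parent with no referring daughters
    // left is safe to delete, so return it to the caller.
    public List<String> cleanupSplits() {
        List<String> deletable = new ArrayList<>();
        Iterator<Map.Entry<String, Set<String>>> it = parents.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Set<String>> entry = it.next();
            entry.getValue().removeIf(d -> !references.contains(d + "->" + entry.getKey()));
            if (entry.getValue().isEmpty()) {
                deletable.add(entry.getKey());
                it.remove();
            }
        }
        return deletable;
    }
}
```

The point of returning the deletable parents rather than deleting in-line is that the real thread would only then touch the filesystem, keeping the reference checks cheap.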

On region split, two daughter regions are written.  One references the top halves of the parent region's store files, the other the bottom halves.  Encode the reference to a parent store file in the name given to the daughter's referring store file.  Currently store files are named like mapfile.dat.3948616888538006163 for mapfiles and mapfile.info.3948616888538006163 for info files, where the number suffix is a random unique id.  Name store files that reference a region of a parent store file as follows: mapfile.ref.REGION_NAME.[top|bottom].  E.g. mapfile.ref.region_hbaserepository,x1GAyQ6M_A2o2B8LpmlHKk==,7524499765167357666.8418484899696132011.top.  These referencing files are empty.
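For illustration, taking such a name back apart might look like the following.  The RefNameParser class and its pattern are hypothetical sketches, not the patch's actual REF_NAME_PARSER, and they assume the last numeric segment before top/bottom is the parent store file id.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for daughter reference-file names of the form
// mapfile.ref.<parent region name>.<parent file id>.<top|bottom>.
public class RefNameParser {
    // Greedy (.+) leaves the last all-digit segment to (\d+) as the file id.
    static final Pattern REF_NAME_PARSER =
        Pattern.compile("^mapfile\\.ref\\.(.+)\\.(\\d+)\\.(top|bottom)$");

    // Returns {parent region name, parent file id, half}, or null if the
    // name is not a reference file (e.g. an ordinary mapfile.dat.* name).
    public static String[] parse(String name) {
        Matcher m = REF_NAME_PARSER.matcher(name);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse(
            "mapfile.ref.region_hbaserepository,x1GAyQ6M_A2o2B8LpmlHKk==,"
                + "7524499765167357666.8418484899696132011.top");
        System.out.println(parts[2]); // which half of the parent this daughter reads
    }
}
```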

Add a MapFile subclass to hbase that does the right thing when it is passed a reference, proxying reads to the appropriate half of the backing parent file.  Use this hbase MapFile subclass whenever hbase loads store files.
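A minimal sketch of that top/bottom proxying idea, with a TreeMap standing in for the real Hadoop MapFile (the HalfMapFile class here is an illustrative stand-in, not the patch's HalfMapFileReader):

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative stand-in: a TreeMap plays the backing parent map file, and a
// daughter's reference selects one half of the parent's key range via the
// split midkey, without copying any data.
public class HalfMapFile {
    private final SortedMap<String, String> half;

    public HalfMapFile(SortedMap<String, String> parent, String midKey, boolean top) {
        // A 'top' reference serves keys >= midKey; a 'bottom' one keys < midKey.
        this.half = top ? parent.tailMap(midKey) : parent.headMap(midKey);
    }

    public String get(String key) {
        // Keys outside this half do not exist from the daughter's point of view;
        // TreeMap submap views return null for out-of-range lookups.
        return half.get(key);
    }

    public String firstKey() {
        return half.isEmpty() ? null : half.firstKey();
    }
}
```

Both halves stay backed by the one parent structure, which is what makes the split near-instantaneous: nothing is rewritten until the daughter's next compaction.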

> [hbase] Make region splits faster
> ---------------------------------
>
>                 Key: HADOOP-1662
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1662
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>         Attachments: mapfile_split.patch



[jira] Updated: (HADOOP-1662) [hbase] Make region splits faster

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1662:
--------------------------

    Status: Patch Available  (was: In Progress)

Retry number 3.

> [hbase] Make region splits faster
> ---------------------------------
>
>                 Key: HADOOP-1662
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1662
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>         Attachments: fastsplits.patch, mapfile_split.patch, splits-2.patch, splits-v3.patch
