Posted to dev@hbase.apache.org by "LN (JIRA)" <ji...@apache.org> on 2008/07/15 09:59:31 UTC

[jira] Created: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

scaling of one regionserver, improving memory and cpu usage
-----------------------------------------------------------

                 Key: HBASE-745
                 URL: https://issues.apache.org/jira/browse/HBASE-745
             Project: Hadoop HBase
          Issue Type: Improvement
          Components: regionserver
    Affects Versions: 0.1.3
         Environment: hadoop 0.17.1
            Reporter: LN
            Priority: Minor


after weeks of testing hbase 0.1.3 with hadoop (0.16.4, 0.17.1), i found there is a lot of work to do before a single regionserver can handle around 100G of data, or even more. i'd like to share my observations here with stack and the other developers.

first, the easiest way to improve the scalability of a regionserver is to upgrade the hardware: use a 64-bit os and 8G of memory for the regionserver process, and speed up disk io.

besides hardware, the following are software bottlenecks i found in the regionserver:
1. as data grows, compaction eats cpu (and io) time; total compaction time is basically linear in the total data size, and sometimes even quadratic in it.
2. memory and socket connection usage depend on the number of open mapfiles, see HADOOP-2341 and HBASE-24.

i will explain the above in comments later.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615532#action_12615532 ] 

stack commented on HBASE-745:
-----------------------------

Hmm.  Took another look.  The comparison is a little more complicated than I supposed above.  I rechecked the number of data files after the without-patch run completed, about ten minutes after it ended; about the same amount of time that had elapsed when I went to check the with-patch test.  The number of data files is rising, as is the aggregate time spent compacting.  It would seem, then, that the patch cuts time spent compacting by some 10-20% or so in the test I just ran.



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "LN (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614291#action_12614291 ] 

LN commented on HBASE-745:
--------------------------

> With respect to MapFile extensions in HBase, see HStoreFile$HBaseMapFile, HStoreFile$BloomFilterMapFile and HStoreFile$HalfMapFileReader

i have noticed the inheritance between the HStoreFile$xxxMapFile classes. since all the xxxReader classes inherit from HStoreFile$HBaseMapFile$HBaseReader, that is a good point for controlling all read operations in HBaseReader; my suggestion is to have HBaseReader extend a new class (which itself extends MapFile.Reader), and do the limiting there.
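The class-hierarchy change suggested above can be sketched with stand-in classes (no Hadoop on the classpath; all names below are illustrative): slide a new limiting class between MapFile.Reader and HBase's reader, so every concrete xxxReader inherits the control point without any other change.

```java
// Stand-in for org.apache.hadoop.io.MapFile.Reader (illustrative only).
class MapFileReader {
    String get(String key) { return "value-of-" + key; }
}

// The proposed intermediate class: one place to account for / limit all reads.
class LimitingMapFileReader extends MapFileReader {
    static int reads = 0; // shared bookkeeping, for illustration

    @Override
    String get(String key) {
        reads++;            // real code: checkout/checkin against a global limit
        return super.get(key);
    }
}

// Stand-in for HStoreFile$HBaseMapFile$HBaseReader: it now extends the limiter
// instead of MapFile.Reader directly, so BloomFilterMapFile's reader and
// HalfMapFileReader would pick up the limiting behaviour for free.
class HbaseReader extends LimitingMapFileReader {
}
```

Because the limit lives in one superclass, subclasses need no changes at all; that is the appeal of interposing the class rather than editing each reader.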



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614537#action_12614537 ] 

Billy Pearson commented on HBASE-745:
-------------------------------------

I tried to apply the 0.2 version of this patch to trunk but got an error. I applied the hbase-720 patch successfully first, but this one failed.



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "LN (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614292#action_12614292 ] 

LN commented on HBASE-745:
--------------------------

about versions: set my code (patch) aside; i mainly want to provide more information about how hbase runs (including the patched results), which may be helpful for further design and coding. that said, i think hbase users need more stability and scalability in the current release, if possible.

i will read the code from trunk. is there any other discussion about memory and compaction i can read first, in jira or on the wiki?



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616918#action_12616918 ] 

Billy Pearson commented on HBASE-745:
-------------------------------------

I cannot test the patch; it will not apply to trunk.

We should get this into 0.2 if it's showing good results like stack reported above.




[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "LN (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613566#action_12613566 ] 

LN commented on HBASE-745:
--------------------------

memory calculation:
the memory usage of a regionserver is determined by 3 things:
#1. the mapfile index read into memory (io.map.index.skip can adjust it, but all of it stays in memory whether you need it or not)
#2. the data output buffer used by each SequenceFile$Reader (each can be measured as the largest value size in the file)
#3. the memcache, controlled by 'globalMemcacheLimit' and 'globalMemcacheLimitLowMark'

that is, besides #3, which is already controlled, memory usage is determined by the concurrently open mapfiles (in fact, the open SequenceFiles backing the mapfile data).

in HBASE-24, stack advised controlling either the number of open regions or the number of open mapfile readers; i'd prefer controlling the open mapfile readers directly, since they are the core of regionserver resource usage.

my suggestions for regionserver memory:
1. upgrade to hadoop 0.17.1 (there is only one line in hbase 0.1.3 incompatible with hadoop 0.17.1; i'll file an issue separately). HADOOP-2346 resolved the connection/thread exhaustion in the DataNode by using read/write timeouts.
2. set globalMemcacheLimit to a lower size, if your application does not read recently inserted records frequently.
3. implement a MonitoredMapFileReader that extends MapFile.Reader and limits the number of concurrently open instances with an LRU, checking in/out in every MapFile.Reader method; then make HStoreFile.HBaseMapFile.HBaseReader extend MonitoredMapFileReader.

beyond the 0.1.3 release, i think hbase needs an interface like HStoreFileReader to abstract the file reading methods; that would make controlling open readers much easier.
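The LRU checkin/checkout idea in suggestion 3 can be sketched as follows. This is a minimal, self-contained illustration, not HBase code: the names are invented, a String stands in for the reader handle, and a real MonitoredMapFileReader would wrap org.apache.hadoop.io.MapFile.Reader and close() evicted readers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// An LRU-bounded pool of "readers": at most maxOpen are open at once, and
// opening one more closes the least-recently-used one.
class ReaderPool {
    static int closedCount = 0; // readers evicted so far, for illustration

    private final Map<String, String> open;

    ReaderPool(final int maxOpen) {
        // accessOrder=true gives LRU iteration order; removeEldestEntry is
        // consulted on each put and evicts the least-recently-used entry
        // once the cap is exceeded.
        this.open = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > maxOpen) {
                    closedCount++; // real code: eldest.getValue().close()
                    return true;
                }
                return false;
            }
        };
    }

    // "checkout": return an already-open reader (refreshing its LRU slot)
    // or open a new one, possibly evicting the LRU reader.
    String checkout(String path) {
        String reader = open.get(path); // get() refreshes access order
        if (reader == null) {
            reader = "reader:" + path;  // real code: open a MapFile.Reader
            open.put(path, reader);
        }
        return reader;
    }

    int openCount() { return open.size(); }
}
```

With the pool in the reader base class, every read path pays the checkout cost, but total open files (and hence index memory, buffers, and datanode sockets) stay bounded regardless of how many regions the server carries.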



[jira] Issue Comment Edited: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "LN (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613555#action_12613555 ] 

ln@webcate.net edited comment on HBASE-745 at 7/15/08 1:37 AM:
---------------------------------------------------

compaction time calculation:
1. suppose we keep writing data to a regionserver, and the rowids are hashed across all regions.
2. with the default optionalcacheflushinterval (30 min) and compaction threshold (3), every HStore will create a flushed storefile within 30 min; after 1 hour, each HStore will have 3 storefiles (including the original one), so a compaction will be triggered. that is, every HStore in the regionserver will compact within 1 hour.
3. a compaction of an HStore reads all the data in that HStore's mapfiles, so i'd assume compaction time depends on the total size of the mapfiles the HStore holds. the whole compaction time of a regionserver (as driven by optionalcacheflushinterval) therefore depends on the data size the regionserver is serving.
4. now we can see the default optionalcacheflushinterval is not suitable for most environments. i've found my hardware (2x Xeon 3.2GHz, dual core, scsi) can compact about 10M of data per second, which means it can compact 36G in 1 hour; so can a regionserver only hold less than 36G?
5. how about increasing optionalcacheflushinterval to 12 hours, or even 24? unfortunately, i found it useless, because of globalMemcacheLimit: it defaults to 512M, and when it is reached the memcaches are flushed (storefiles created) until the total memcache size drops below 256M. since inserted rowids are distributed across all regions, nearly half of all regions will get a new storefile each time; so once the inserted data reaches 1G (4 flushes of the global memcache), all data on the regionserver needs compaction. no setting can adjust this behavior.
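The bound in point 4 can be checked with simple arithmetic. This is a back-of-envelope sketch, not HBase code; the 10M/s throughput figure is the one measured above, and the names are invented:

```java
// If every byte the regionserver serves must be re-read once per compaction
// cycle, then the compactable data per cycle is an upper bound on the data
// the server can hold before it spends all its time compacting.
class CompactionMath {
    // Data (in MB) compactable within one cycle, at a given throughput.
    static long compactableMb(long throughputMbPerSec, long cycleSeconds) {
        return throughputMbPerSec * cycleSeconds;
    }
}
```

At 10 MB/s over a 1-hour cycle this gives 36,000 MB, i.e. the ~36G ceiling mentioned above; stretching the cycle raises the ceiling only if the flush behaviour in point 5 does not shorten the effective cycle first.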



[jira] Issue Comment Edited: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615487#action_12615487 ] 

viper799 edited comment on HBASE-745 at 7/21/08 6:10 PM:
--------------------------------------------------------------

Now that I think about it, I think I tried an older version of trunk; my mistake, it applies to trunk.

I will run some bulk import tests on it soon and see if the compactions work out OK.



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615538#action_12615538 ] 

Billy Pearson commented on HBASE-745:
-------------------------------------

I lost the data I was using to test large imports, so I am re-downloading it; I should be able to run my test on the patch in about 24 hours, once I am done processing my dataset again.
Your last post sounds better and more correct. If the patch is working correctly, we pick up efficiency because we do not have to compact the larger mapfiles with every compaction.
I would assume this will also help users still on 32-bit servers keep the regionserver under the 2GB limit, by flushing a little more often if needed under load.




[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613650#action_12613650 ] 

Jim Kellerman commented on HBASE-745:
-------------------------------------

With respect to MapFile extensions in HBase, see HStoreFile$HBaseMapFile, HStoreFile$BloomFilterMapFile and HStoreFile$HalfMapFileReader



[jira] Issue Comment Edited: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "LN (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614291#action_12614291 ] 

ln@webcate.net edited comment on HBASE-745 at 7/17/08 3:25 AM:
---------------------------------------------------

> With respect to MapFile extensions in HBase, see HStoreFile$HBaseMapFile, HStoreFile$BloomFilterMapFile and HStoreFile$HalfMapFileReader

i have noticed the inheritance between the HStoreFile$xxxMapFile classes. since all the xxxReader classes inherit from HStoreFile$HBaseMapFile$HBaseReader, that is a good point for controlling all read operations in HBaseReader; my suggestion is to have HBaseReader extend a new class (which itself extends MapFile.Reader), and do the limiting there.



[jira] Updated: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Izaak Rubin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Izaak Rubin updated HBASE-745:
------------------------------

    Attachment: hbase-745-for-0.2.patch

I've been looking over the issue, and I (and Stack) agree with LN and the changes proposed in his patch.  However, as Jim noted, we want to focus on 0.2 instead of 0.1.3.  I've taken LN's patch and modified it slightly to fit trunk (hbase-745-for-0.2.patch).  I've also added several additional assertions to TestCompaction to account for the changes.

All HBase tests pass.  However, this patch SHOULD NOT be applied until HBASE-720 is resolved and its patch (hbase-720.patch) is applied.  Both patches modify the same two files (HStore, TestCompaction), so they must be committed in the correct order (first 720, then 745).



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615531#action_12615531 ] 

stack commented on HBASE-745:
-----------------------------

I applied hbase-745-for-0.2.patch, Izaak's fixup of LN's original patch, though I saw little discernible improvement.

Running the PerformanceEvaluation with the patch, we spent about 20% less time compacting in total, but on test completion there were 79 data files in the filesystem, as opposed to 72 when I ran without the patch.  My guess is that once those 79 files became 72, there wouldn't be much of the 20% difference left over.

The test ran for about 30 minutes with 8 concurrent MR clients writing 8M rows.



[jira] Resolved: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-745.
-------------------------

       Resolution: Fixed
    Fix Version/s: 0.2.0

Bulk of this work was applied to 0.2.0.  I opened HBASE-823 to do Luo Ning's "open mapfile reader" limitation patch.  Thanks for the patch, Luo (and Izaak).



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "LN (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614429#action_12614429 ] 

LN commented on HBASE-745:
--------------------------

Maybe I'm just hungry for a stronger HBase :-). I know robustness and scalability (in that order) are the focus of the 0.2 release, and "3TB of data on about ~50 nodes" means 60G per regionserver, which is not very hard; each regionserver (with the default config) can handle 30G of data on my testing server under 0.1.3.

I'm trying to make a regionserver handle more data, maybe 1T, because I think the resource usage (memory, CPU) of a regionserver should depend not on the existing data size, but on the active data size (read/write throughput).

I think I have found the bottlenecks (compaction eating CPU, open mapfiles eating memory), but I'm NOT SURE about my solution, so I'm posting it here for review, especially from Jim and Stack.

Here is my "total solution", which I call the "0.1.3/0.17.1 scalability pack":
1. the HBASE-749 patch, for 0.17.1 compatibility
2. the HADOOP-3778 patch, for a socket exception bug
3. HADOOP-3779, for the concurrent connection limitation of the datanode (patch not attached)
4. the attached incremental compaction patch
5. an "open mapfile reader" limitation patch implementing my suggestion above, but it doesn't look good yet, so I haven't attached it.

With the above and some adjusted config properties, my regionserver now handles about 400G of data, with about 15G of test write throughput per day.

 



[jira] Updated: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "LN (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

LN updated HBASE-745:
---------------------

    Attachment: HBASE-745.compact.patch

An incremental compaction patch for the 0.1.3 release. I use a simple algorithm to automatically select the files to compact, described in the source.

Sorry there is no unit test case for this patch; I haven't learned how to prepare unit test data for such issues :<. In fact, this patch has been running for about a week in my test environment; most compaction times dropped from about 1 min to less than 5 sec.

BTW, I manually removed from this patch some modifications to the 0.1.3 release version of HStore.java that were for Hadoop 0.17.1 compatibility.



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Izaak Rubin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615420#action_12615420 ] 

Izaak Rubin commented on HBASE-745:
-----------------------------------

Hi Billy,

I can't seem to replicate this problem.  I removed my local copies of HStore and TestCompaction, updated, and then applied hbase-745-for-0.2.patch (successfully).  The patch for HBASE-720 was applied before you made your comment on Thursday (although the issue was only closed today); is it possible that when you tried to apply the hbase-720 patch, you actually removed it by accident?  Maybe try what I did (remove the files, update, and re-apply 745) and see if it still doesn't work.  Let me know.



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615487#action_12615487 ] 

Billy Pearson commented on HBASE-745:
-------------------------------------

Now that I think about it, I think I tried an older version of trunk; my mistake, it applies to trunk.
When do you think we can see this committed?



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "LN (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613557#action_12613557 ] 

LN commented on HBASE-745:
--------------------------

compaction improvement:

Compaction is very inefficient in the current HBase release (0.1.3). Suppose an HStore has 3 mapfiles: the original one is 128M, and the 2 newly flushed ones are each smaller than 1M (the most common situation when a regionserver carries 512 HStores or more, flushing a 256M global memcache each time). We compacted 2M of data, but read and wrote over 120M!

My suggestions:
1. Raise the compaction threshold. This means fewer compactions but more open mapfiles (discussed later in this issue regarding memory usage).
2. Implement incremental compaction. That is, don't compact down to 1 file each time; compact only the small files, and do a whole compaction only when the accumulated file size is large enough. In HStore#compact(boolean) we can use an algorithm to select the HStoreFiles to compact. (I will attach my implementation for review later.)
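
For illustration, the selection rule in suggestion 2 might look something like the sketch below (hypothetical code, not the actual HStore implementation; the `ratio` parameter and the class/method names are my own invention): walk from the newest storefile toward the oldest, and stop as soon as the next (older) file is much larger than everything selected so far, so a big old file is only rewritten once the newer files have grown to a comparable size.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an incremental-compaction file selector. */
public class CompactionSelector {

    /**
     * Given storefile sizes ordered oldest-first, return the suffix of
     * newest files to compact.  We accumulate from the newest end and
     * stop when the next (older) file exceeds ratio * accumulated size,
     * so the big original file is skipped until the small files have
     * grown enough to justify a whole compaction.
     */
    public static List<Long> selectForCompaction(List<Long> sizesOldestFirst,
                                                 double ratio) {
        List<Long> selected = new ArrayList<>();
        long accumulated = 0;
        for (int i = sizesOldestFirst.size() - 1; i >= 0; i--) {
            long size = sizesOldestFirst.get(i);
            if (!selected.isEmpty() && size > ratio * accumulated) {
                break;  // older file is too large: leave it for later
            }
            selected.add(0, size);  // keep oldest-first order
            accumulated += size;
        }
        return selected;
    }
}
```

With LN's example (a 128M original plus two ~1M flushes), this selects only the two small files, turning a ~130M rewrite into a ~2M one; once the small files accumulate to a size comparable to the original, all files are selected and a whole compaction happens.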




[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "LN (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613555#action_12613555 ] 

LN commented on HBASE-745:
--------------------------

Compaction time calculation:
1. Suppose we keep writing data to a regionserver, and the rowids of the data hash evenly across all regions.
2. Given the default optionalcacheflushinterval (30 min) and threshold (3), every HStore will get a memcache-flushed storefile within 30 min; after 1 hour, each HStore will have 3 storefiles (including the original one), so a compaction will be triggered. That is, every HStore on the regionserver will do a compaction within 1 hour.
3. A compaction of an HStore reads all the data in that HStore's mapfiles, so I assume compaction time depends on the total size of the mapfiles. Therefore the regionserver's whole compaction time (driven by optionalcacheflushinterval) depends on the amount of data the regionserver is serving.
4. Now we can see the default optionalcacheflushinterval is not suitable for most environments. I found my hardware (Xeon 3.2 x2, dual-core, SCSI) can compact about 10M of data per second, which means it can compact 36G in 1 hour. What happens when the data size grows beyond 36G?
5. What about increasing optionalcacheflushinterval to 12 hours, or even 24? Unfortunately, I found it useless, because of globalMemcacheLimit: it defaults to 512M, and when it is reached, memcaches are flushed (storefiles created) until the total memcache size drops below 256M. Since inserted rowids are distributed across all regions, nearly half of all regions get a new storefile each time. So by the time the inserted data reaches 1G (4 flushes of the global memcache), all data on the regionserver has been compacted. No setting can adjust this behavior.
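
The arithmetic in points 3-5 can be checked with a small back-of-the-envelope model (the class and method names below are illustrative assumptions, not HBase code, and the numbers come from LN's stated figures):

```java
/** Toy model of the compaction-cost arithmetic above (illustrative only). */
public class CompactionCost {

    /**
     * If every HStore is fully rewritten once per intervalHours (the
     * 0.1.3 behaviour described above), this is how many bytes the
     * regionserver compacts per day.
     */
    public static long dailyCompactionBytes(long dataSizeBytes, double intervalHours) {
        return (long) (dataSizeBytes * (24.0 / intervalHours));
    }

    /** Hours needed to compact dataSizeBytes at the given throughput. */
    public static double hoursToCompact(long dataSizeBytes, long bytesPerSecond) {
        return dataSizeBytes / (double) bytesPerSecond / 3600.0;
    }
}
```

At 10M/s, 36G takes roughly an hour to compact, matching the observation that a regionserver which rewrites everything hourly cannot keep up once it serves much more than ~36G.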



[jira] Updated: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "LN (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

LN updated HBASE-745:
---------------------

    Description: 
After weeks of testing HBase 0.1.3 and Hadoop (0.16.4, 0.17.1), I found there is a lot of work to do before a single regionserver can handle about 100G of data, or even more. I'd like to share my opinions here with Stack and other developers.

First, the easiest way to improve regionserver scalability is to upgrade hardware: use a 64-bit OS and 8G of memory for the regionserver process, and speed up disk I/O.

Besides hardware, the following are the software bottlenecks I found in the regionserver:
1. As data grows, compaction eats CPU (and I/O) time; the total compaction time is basically linear in the whole data size, and even worse, sometimes quadratic in it.
2. Memory usage depends on the number of open mapfiles.
3. Network connections depend on the number of open mapfiles; see HADOOP-2341 and HBASE-24.

  was:
After weeks of testing HBase 0.1.3 and Hadoop (0.16.4, 0.17.1), I found there is a lot of work to do before a single regionserver can handle about 100G of data, or even more. I'd like to share my opinions here with Stack and other developers.

First, the easiest way to improve regionserver scalability is to upgrade hardware: use a 64-bit OS and 8G of memory for the regionserver process, and speed up disk I/O.

Besides hardware, the following are the software bottlenecks I found in the regionserver:
1. As data grows, compaction eats CPU (and I/O) time; the total compaction time is basically linear in the whole data size, and even worse, sometimes quadratic in it.
2. Memory and socket connection usage depend on the number of open mapfiles; see HADOOP-2341 and HBASE-24.

Will explain the above in comments later.




[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616962#action_12616962 ] 

stack commented on HBASE-745:
-----------------------------

Hey Billy:  Yeah, it was committed a while back.   In my comments above, I'm not very enthusiastic because I did not see BIG gains in our simple PerformanceEvaluation.  But thinking on it more, Luo Ning's simple rule is kinda elegant and in real-life situations is probably saving truckloads of CPU and I/O.



[jira] Updated: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-745:
--------------------------------

    Affects Version/s: 0.2.0



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615510#action_12615510 ] 

stack commented on HBASE-745:
-----------------------------

Billy, I'm running tests too.  So far, it looks like the LN patch is an improvement.  I will report back when I have more data.  If it's good (I should know tonight), I'll apply it.



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613627#action_12613627 ] 

Billy Pearson commented on HBASE-745:
-------------------------------------

I agree with your idea of incremental compaction.

My two ideas for increasing compaction efficiency while under load:

1. Compact only the newest threshold (3) mapfiles.

This lets a regionserver compact the latest 3 mapfiles created, lowering the number of mapfiles by 2 per compaction. The newest mapfiles will not hold the bulk of a region's data; under load they will be small memcache flushes and will compact fast.

By compacting the newest ones, when the load drops and only 3 mapfiles are left, one of them will be the largest and oldest mapfile, and all the old and new data will get compacted together.

2. The compaction queue.
Currently we only add each region to a queued list of regions needing a compaction check, and compact in that order.

My suggestion is to have the queued list track how many times a region has been added to the compaction queue (memcache flushes). That way we can sort the list, compact the hot spots first while under load, and reduce the number of mapfiles fastest, with the idea above implemented. When a compaction is done, reduce the region's count in the queue by the number of files compacted, or remove it if nothing is left to compact, then sort the list again and start over.

These are my ideas on how we can reduce the number of mapfiles while under a write load.
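
The counted compaction queue in idea 2 could be sketched roughly like this (hypothetical class and method names, not actual HBase code): a map from region to its pending flush count, from which the compactor always picks the hottest region first.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of a counted compaction queue (idea 2 above). */
public class CompactionQueue {
    // region name -> number of memcache flushes since its last compaction
    private final Map<String, Integer> pending = new HashMap<>();

    /** Called whenever a memcache flush creates a new mapfile. */
    public void requestCompaction(String region) {
        pending.merge(region, 1, Integer::sum);
    }

    /** The hottest region: the one queued by the most flushes. */
    public Optional<String> nextRegion() {
        return pending.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    /**
     * After compacting, decrease the region's count by the number of
     * files merged away, removing the entry once nothing is pending.
     */
    public void onCompacted(String region, int filesCompacted) {
        pending.computeIfPresent(region,
                (r, n) -> n > filesCompacted ? n - filesCompacted : null);
    }
}
```

Under write load the hot regions accumulate the highest counts and are compacted first, which shrinks the total mapfile count fastest when combined with idea 1.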



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616937#action_12616937 ] 

Billy Pearson commented on HBASE-745:
-------------------------------------

Looks like this has been committed to trunk.
It seems to be improving my import speed: I am spending less time on compaction, so I get more CPU time for transactions.
I use compression on my table, so it improves my compaction speed by not having to uncompress and recompress all the mapfiles on each compaction.
+1

So we should mark this issue done.

> scaling of one regionserver, improving memory and cpu usage
> -----------------------------------------------------------
>
>                 Key: HBASE-745
>                 URL: https://issues.apache.org/jira/browse/HBASE-745
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.1.3, 0.2.0
>         Environment: hadoop 0.17.1
>            Reporter: Luo Ning
>            Priority: Minor
>         Attachments: hbase-745-for-0.2.patch, HBASE-745.compact.patch



[jira] Issue Comment Edited: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Izaak Rubin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615420#action_12615420 ] 

irubin edited comment on HBASE-745 at 7/21/08 1:59 PM:
------------------------------------------------------------

Hi Billy,

I can't seem to replicate this problem - I removed my local copies of HStore and TestCompaction, updated, and then applied hbase-745-for-0.2.patch (successfully).  The patch for HBase-720 was committed before you made your comment on Thursday (although the issue was only closed today) - is it possible that when you tried to apply the hbase-720 patch, you actually removed it by accident?  Maybe try what I did (remove the files, update, and re-apply 745) and see if it still doesn't work - let me know.

      was (Author: irubin):
    Hi Billy,

I can't seem to replicate this problem - I removed my local copies of HStore and TestCompaction, updated, and then applied hbase-745-for-0.2.patch (successfully).  The patch for HBase-720 was applied before you made your comment on Thursday (although the issue was only closed today) - is it possible that when you tried to apply the hbase-720 patch, you actually removed it by accident?  Maybe try what I did (remove the files, update, and re-apply 745) and see if it still doesn't work - let me know.
  
> scaling of one regionserver, improving memory and cpu usage
> -----------------------------------------------------------
>
>                 Key: HBASE-745
>                 URL: https://issues.apache.org/jira/browse/HBASE-745
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.1.3, 0.2.0
>         Environment: hadoop 0.17.1
>            Reporter: LN
>            Priority: Minor
>         Attachments: hbase-745-for-0.2.patch, HBASE-745.compact.patch



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614357#action_12614357 ] 

stack commented on HBASE-745:
-----------------------------

LN: I'm not sure I follow the above comment.  What are you thinking?  Yes, hbase users need stability in 0.1.3 and in 0.2.  Let's experiment in 0.3.

No discussion of memory or compaction other than what is in JIRAs.  Want to start up a wiki page that we can all hack on?

FYI, Izaak is working on upgrading your patch so it works against TRUNK.

> scaling of one regionserver, improving memory and cpu usage
> -----------------------------------------------------------
>
>                 Key: HBASE-745
>                 URL: https://issues.apache.org/jira/browse/HBASE-745
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.1.3, 0.2.0
>         Environment: hadoop 0.17.1
>            Reporter: LN
>            Priority: Minor
>         Attachments: HBASE-745.compact.patch



[jira] Commented: (HBASE-745) scaling of one regionserver, improving memory and cpu usage

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613652#action_12613652 ] 

Jim Kellerman commented on HBASE-745:
-------------------------------------

I would also suggest that with respect to performance, you should focus on trunk and not 0.1.x because trunk has changed the internals of flushing and compaction quite a bit, and it is unlikely that performance improvements for 0.1.x will port easily to trunk.



> scaling of one regionserver, improving memory and cpu usage
> -----------------------------------------------------------
>
>                 Key: HBASE-745
>                 URL: https://issues.apache.org/jira/browse/HBASE-745
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.1.3
>         Environment: hadoop 0.17.1
>            Reporter: LN
>            Priority: Minor
>         Attachments: HBASE-745.compact.patch
