Posted to dev@hbase.apache.org by "Erik Holstad (JIRA)" <ji...@apache.org> on 2009/01/21 22:09:59 UTC

[jira] Issue Comment Edited: (HBASE-80) [hbase] Add a cache of 'hot' cells

    [ https://issues.apache.org/jira/browse/HBASE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665964#action_12665964 ] 

erikholstad@gmail.com edited comment on HBASE-80 at 1/21/09 1:09 PM:
------------------------------------------------------------

Sorry for not posting on this issue sooner, even though I have been assigned and everything :)
The basic idea I have been working on is a key/value cache to speed up random
reads.
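
To make the idea concrete, here is a minimal sketch of a read-through key/value cache with least-recently-used eviction. It is not the code in cache.patch; the class name `HotCellCache`, the `(row, column)` key, the `loader` callback, and the in-memory `store` dictionary are all assumptions made up for illustration.

```python
from collections import OrderedDict


class HotCellCache:
    """Tiny read-through LRU cache keyed by (row, column). Hypothetical sketch."""

    def __init__(self, capacity, loader):
        self.capacity = capacity      # maximum number of cached cells
        self.loader = loader          # fallback called on a miss (the slow read path)
        self.cells = OrderedDict()    # insertion order doubles as recency order
        self.hits = 0
        self.misses = 0

    def get(self, row, column):
        key = (row, column)
        if key in self.cells:
            self.cells.move_to_end(key)     # mark as most recently used
            self.hits += 1
            return self.cells[key]
        self.misses += 1
        value = self.loader(row, column)    # slow path: read from the backing store
        self.cells[key] = value
        if len(self.cells) > self.capacity:
            self.cells.popitem(last=False)  # evict the least recently used cell
        return value


# Hypothetical backing store standing in for a read served by the region server.
store = {("row1", "info:a"): b"x" * 1000, ("row2", "info:a"): b"y" * 1000}
cache = HotCellCache(capacity=1, loader=lambda r, c: store[(r, c)])

cache.get("row1", "info:a")   # miss, loaded from the store
cache.get("row1", "info:a")   # hit, served from memory
cache.get("row2", "info:a")   # miss, evicts row1 because capacity is 1
print(cache.hits, cache.misses)   # prints "1 2"
```

The sketch only covers the read path. A real cache would also have to invalidate or update entries on writes and deletes, which this post does not describe.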

Test setup:
I used the same test parameters as the BT (Bigtable) paper so that the results are easy to compare.
Tests have so far only been run on a single-machine cluster with one HRegionServer. The setup
uses 1 column/family, and every value is 1000 B.

Some numbers from testing this extremely simple cache:
Tests done over 10000 reads
Random reads without cache: 481 r/s
                            481 KB/s
Random reads with cache:    4019 r/s
                            4019 KB/s


Some other tests, comparing the difference when using multiple columns/family, gave the
following numbers:
5 columns/family, everything else the same as above:
Random reads without cache: 445 r/s
                            2223 KB/s
Random reads with cache:    3588 r/s
                            17940 KB/s

10 columns/family, everything else the same as above:
Random reads without cache: 24 r/s
                            24000 KB/s
Random reads with cache:    25 r/s
                            25000 KB/s

For the rest of the tests only 100 rows were used, to avoid out-of-memory errors.
Like the first test but with fewer rows:
Random reads without cache: 284 r/s
                            284 KB/s
Random reads with cache:    2083 r/s
                            2083 KB/s
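
The out-of-memory concern can be made concrete with a rough size estimate. The sketch below assumes the raw payload is simply rows x columns x value size, using the 1000 B value size from the test setup. It ignores keys, timestamps and JVM object overhead, none of which are given in this post, so the real footprint would be larger. The function name `payload_bytes` is made up for illustration.

```python
VALUE_BYTES = 1000  # value size stated in the test setup


def payload_bytes(rows, columns, value_bytes=VALUE_BYTES):
    """Raw value bytes for a table, ignoring keys and per-object overhead."""
    return rows * columns * value_bytes


# 10000 rows is assumed here as the larger table size, for comparison with 100 rows.
for rows, columns in [(10000, 1), (10000, 1000), (100, 1000)]:
    mb = payload_bytes(rows, columns) / 1e6
    print(f"{rows} rows x {columns} columns: {mb:.0f} MB")
```

Under these assumptions, the 1000 columns/family test that follows would hold about 100 MB of values at 100 rows, against about 10000 MB at 10000 rows.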

Same as above but with 1000 columns/family:
Random reads without cache: 23 r/s
                            23000 KB/s
Random reads with cache:    76 r/s
                            76000 KB/s
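
For a quick comparison across the runs, the sketch below computes the cache speedup from the posted r/s figures and the throughput one would expect if KB/s = reads/s x columns x value size, with each 1000 B value counted as 1 KB. That relation is an assumption inferred from the numbers, not something stated in the post, and the function names are made up for illustration.

```python
VALUE_KB = 1  # each value is 1000 B, counted here as 1 KB

# (label, columns/family, reads/s without cache, reads/s with cache) as posted
runs = [
    ("1 column/family", 1, 481, 4019),
    ("5 columns/family", 5, 445, 3588),
    ("10 columns/family", 10, 24, 25),
    ("1 column/family, 100 rows", 1, 284, 2083),
    ("1000 columns/family, 100 rows", 1000, 23, 76),
]


def speedup(without_cache, with_cache):
    """How many times faster the cached run is than the uncached run."""
    return with_cache / without_cache


def expected_kb_per_s(reads_per_s, columns, value_kb=VALUE_KB):
    """Throughput implied by the read rate, assuming every column is returned."""
    return reads_per_s * columns * value_kb


for label, columns, cold, warm in runs:
    print(f"{label}: {speedup(cold, warm):.1f}x speedup, "
          f"{expected_kb_per_s(warm, columns)} KB/s expected with cache")
```

Under that assumption the posted with-cache KB/s figures match for every run except the 10 columns/family one, where 25 r/s would give 250 KB/s rather than the posted 25000 KB/s. The post does not explain the difference.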

> [hbase] Add a cache of 'hot' cells
> ----------------------------------
>
>                 Key: HBASE-80
>                 URL: https://issues.apache.org/jira/browse/HBASE-80
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: stack
>            Assignee: Erik Holstad
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: cache.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.