Posted to dev@hbase.apache.org by "Erik Holstad (JIRA)" <ji...@apache.org> on 2009/01/21 22:09:59 UTC
[jira] Issue Comment Edited: (HBASE-80) [hbase] Add a cache of 'hot' cells
[ https://issues.apache.org/jira/browse/HBASE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665964#action_12665964 ]
erikholstad@gmail.com edited comment on HBASE-80 at 1/21/09 1:09 PM:
------------------------------------------------------------
Sorry for not posting on this issue, even though I have been assigned and everything :)
So the basic idea that I have been working on is to make a key/value cache to speed up random
reads.
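
To make the idea concrete, here is a minimal sketch of a read-through key/value cache with
least-recently-used eviction. It is an illustration only, not the code in cache.patch: the class
name HotCellCache, the loader callback, and the LRU policy are all assumptions made for this
example.

```python
from collections import OrderedDict


class HotCellCache:
    """Hypothetical read-through LRU cache keyed by (row, column)."""

    def __init__(self, capacity, loader):
        self.capacity = capacity      # maximum number of cached cells
        self.loader = loader          # called on a miss, e.g. the slow store read
        self.entries = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            # Hit: mark the cell as most recently used and return it.
            self.entries.move_to_end(key)
            self.hits += 1
            return self.entries[key]
        # Miss: fall through to the slow path and remember the result.
        self.misses += 1
        value = self.loader(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used cell.
            self.entries.popitem(last=False)
        return value


# Usage: a dict stands in for the on-disk store (an assumption for the demo).
store = {("row1", "family:col"): b"x" * 1000}
cache = HotCellCache(capacity=2, loader=store.__getitem__)
cache.get(("row1", "family:col"))  # miss, loaded from the store
cache.get(("row1", "family:col"))  # hit, served from memory
print(cache.hits, cache.misses)
```

A repeated random read for the same cell then skips the slow path entirely, which is the effect
the numbers below try to measure.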
Test setup:
Used the same test parameters as in the BT paper so it would be easy to compare. Tests have
currently only been done on a single-machine cluster with one HRegionServer. That setup
includes 1 column/family and every value is 1000 B.
Some numbers for testing this extremely simple cache are:
Tests done over 10000 reads
Random reads without cache: 481 r/s
481 KB/s
Random reads with cache: 4019 r/s
4019 KB/s
Some other tests, to compare the difference when using multiple columns/family, gave the
following numbers:
5 columns/family, everything else the same as above.
Random reads without cache: 445 r/s
2223 KB/s
Random reads with cache: 3588 r/s
17940 KB/s
10 columns/family, everything else the same as above.
Random reads without cache: 24 r/s
24000 KB/s
Random reads with cache: 25 r/s
25000 KB/s
For the rest of the tests only 100 rows were used to avoid out-of-memory errors.
Like the first test but with fewer rows:
Random reads without cache: 284 r/s
284 KB/s
Random reads with cache: 2083 r/s
2083 KB/s
Same as above but with 1000 columns/family
Random reads without cache: 23 r/s
23000 KB/s
Random reads with cache: 76 r/s
76000 KB/s
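
The arithmetic behind these figures can be sketched as follows. The r/s values are copied from
the results above; the function names and the assumption that KB/s is roughly
r/s x columns/family x value size (with a 1000 B value counted as 1 KB) are mine, not something
stated in the test output.

```python
VALUE_KB = 1  # assumption: each 1000 B value is counted as 1 KB


def speedup(without_cache_rps, with_cache_rps):
    """Ratio of cached to uncached random-read rate."""
    return with_cache_rps / without_cache_rps


def throughput_kb_per_s(reads_per_s, columns_per_family, value_kb=VALUE_KB):
    """Estimated data rate if every read returns all columns of the family."""
    return reads_per_s * columns_per_family * value_kb


# (label, r/s without cache, r/s with cache), taken from the results above
RESULTS = [
    ("1 column/family", 481, 4019),
    ("5 columns/family", 445, 3588),
    ("10 columns/family", 24, 25),
    ("100 rows, 1 column/family", 284, 2083),
    ("100 rows, 1000 columns/family", 23, 76),
]

for label, without_cache, with_cache in RESULTS:
    print(f"{label}: {speedup(without_cache, with_cache):.2f}x")
```

Under that assumption the estimate reproduces the reported 481 KB/s, 17940 KB/s, 23000 KB/s and
76000 KB/s figures exactly, and gives 2225 KB/s where 2223 KB/s was reported, presumably because
the printed r/s value is rounded.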
was (Author: erikholstad@gmail.com):
Sorry for not posting on this issue, even though I have been assigned and everything :)
So the basic idea that I have been working on is to make a key/value cache to speed up random
reads.
Test setup:
Used the same test parameters as in the BT paper so it would be easy to compare. Tests have
currently only been done on a single-machine cluster with one HRegionServer. That setup
includes 1 column/family and every value is 1000 B.
Some numbers for testing this extremely simple cache are:
Tests done over 10000 reads
Random reads without cache: 481 r/s
481 KB/s
Random reads with cache: 4019 r/s
4019 KB/s
Some other tests, to compare the difference when using multiple columns/family, gave the
following numbers:
5 columns/family, everything else the same as above.
Random reads without cache: 445 r/s
2223 KB/s
Random reads without cache: 3588 r/s
17940 KB/s
10 columns/family, everything else the same as above.
Random reads without cache: 24 r/s
24000 KB/s
Random reads without cache: 25 r/s
25000 KB/s
For the rest of the tests only 100 rows were used to avoid out-of-memory errors.
Like the first test but with fewer rows:
Random reads without cache: 284 r/s
284 KB/s
Random reads with cache: 2083 r/s
2083 KB/s
Same as above but with 1000 columns/family
Random reads without cache: 23 r/s
23000 KB/s
Random reads with cache: 76 r/s
76000 KB/s
> [hbase] Add a cache of 'hot' cells
> ----------------------------------
>
> Key: HBASE-80
> URL: https://issues.apache.org/jira/browse/HBASE-80
> Project: Hadoop HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: stack
> Assignee: Erik Holstad
> Priority: Minor
> Fix For: 0.20.0
>
> Attachments: cache.patch
>
>