Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2010/01/25 19:21:34 UTC

[jira] Updated: (HBASE-2167) PE for IHBase

     [ https://issues.apache.org/jira/browse/HBASE-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-2167:
-------------------------

    Attachment: IdxPerformanceEvaluation.patch

Modifies the PerformanceEvaluation class to make it a more extensible performance evaluation platform.  Adds a new 'filterScan' command which, as the name suggests, performs scans using a filter.

To run the test you'll need to:

Include the contrib jars: export HBASE_CLASSPATH=`find /home/stack/tmp/hadoop-hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s "\n" ":"`
Set the 'hbase.hregion.impl' property to 'org.apache.hadoop.hbase.regionserver.IdxRegion' in your hbase-site.xml

bin/hbase org.apache.hadoop.hbase.IdxPerformanceEvaluation randomWrite 1
bin/hbase org.apache.hadoop.hbase.IdxPerformanceEvaluation filterScan 1
bin/hbase org.apache.hadoop.hbase.IdxPerformanceEvaluation idxFilterScan 1

PE, with its "large, random" values, sits toward the wrong end of the spectrum for what suits IHBase.  Indexing such values uses loads of RAM, and writes are slowed because each 'large' value has to be inserted into the index.

If a user did have PE-like values, I'd suggest extracting a portion of the value (like the first 10 bytes) into a separate column.qualifier and indexing that.  It would still provide a HUGE performance boost to scans without the huge memory footprint (writes would be slowed much less).
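
The prefix extraction suggested above could look roughly like the sketch below.  It only shows how to cut the first N bytes out of a value using the JDK; the class and method names (ValuePrefix, prefix) are made up for illustration and are not part of HBase or IHBase.  Writing the resulting bytes to a separate column and declaring an index on it depend on the IHBase API and are not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase or IHBase.
final class ValuePrefix {
    private ValuePrefix() {}

    // Returns a copy of the first `length` bytes of `value`, or a copy of
    // the whole value when it is shorter than `length`.
    static byte[] prefix(byte[] value, int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be >= 0");
        }
        return Arrays.copyOf(value, Math.min(value.length, length));
    }

    public static void main(String[] args) {
        // Stand-in for a large PE-style value.
        byte[] value = "0123456789abcdefghij".getBytes(StandardCharsets.US_ASCII);
        // The 10-byte prefix is what would go into the separate, indexed column.
        byte[] indexed = prefix(value, 10);
        System.out.println(new String(indexed, StandardCharsets.US_ASCII));
    }
}
```

The full value would still be stored in its original column; only the short prefix would be indexed, which is what keeps the index's memory footprint small.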

Here are some initial times to complete 20 scans for 20 random values on a single-node cluster with 1.5GB of memory allocated to the RS VM.

Without an index: 732989ms at offset 0 for 1048576 rows
With an index: 2160ms at offset 0 for 1048576 rows
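
To put the two numbers side by side, here is a small sketch of the arithmetic.  It assumes each reported time is the total for all 20 scans, which is how the lines above read but is not stated outright; the class and method names are made up for illustration.

```java
// Hypothetical helper for comparing the two reported timings.
final class ScanTimings {
    private ScanTimings() {}

    // Ratio of the unindexed time to the indexed time.
    static double speedup(long withoutIndexMs, long withIndexMs) {
        return (double) withoutIndexMs / withIndexMs;
    }

    // Average time per scan, assuming totalMs covers `scans` scans.
    static double perScanMillis(long totalMs, int scans) {
        return (double) totalMs / scans;
    }

    public static void main(String[] args) {
        long withoutIndexMs = 732989L; // reported above
        long withIndexMs = 2160L;      // reported above
        int scans = 20;                // assumption: times are totals for 20 scans

        System.out.printf("speedup: %.1fx%n", speedup(withoutIndexMs, withIndexMs));
        System.out.printf("per scan without index: %.1f ms%n",
                perScanMillis(withoutIndexMs, scans));
        System.out.printf("per scan with index: %.1f ms%n",
                perScanMillis(withIndexMs, scans));
    }
}
```

Under that assumption the indexed scans come out roughly 339 times faster, at about 108ms per scan versus about 36.6 seconds per scan without the index.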

> PE for IHBase
> -------------
>
>                 Key: HBASE-2167
>                 URL: https://issues.apache.org/jira/browse/HBASE-2167
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: stack
>             Fix For: 0.20.4
>
>         Attachments: IdxPerformanceEvaluation.patch
>
>
> Add a PE that can be used by IHBase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.