Posted to issues@hbase.apache.org by "Luke Lu (JIRA)" <ji...@apache.org> on 2013/12/05 00:46:36 UTC

[jira] [Commented] (HBASE-4163) Create Split Strategy for YCSB Benchmark

    [ https://issues.apache.org/jira/browse/HBASE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839489#comment-13839489 ] 

Luke Lu commented on HBASE-4163:
--------------------------------

I tried to figure this out for somebody today. Here is an HBase shell one-liner to save other people some time until the feature is implemented:
{code}
create 'usertable', 'family', {SPLITS => (1..200).map {|i| "user#{1000+i*(9999-1000)/200}"}, MAX_FILESIZE => 4*1024**3}
{code}
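
For anyone who wants to check the split points before creating the table, the arithmetic in the one-liner can be reproduced in plain Ruby. This is only a sketch: ycsb_split_keys is a hypothetical helper name, not part of HBase or YCSB, and its parameters just name the constants used above.

```ruby
# Hypothetical helper (not part of HBase): reproduces the split-point
# arithmetic from the shell one-liner so the keys can be inspected.
# Integer division is intentional; it matches the original expression.
def ycsb_split_keys(count = 200, low = 1000, high = 9999, prefix = "user")
  (1..count).map { |i| "#{prefix}#{low + i * (high - low) / count}" }
end

keys = ycsb_split_keys
puts keys.first   # user1044
puts keys.last    # user9999
puts keys.length  # 200

# MAX_FILESIZE in the one-liner is 4 GiB expressed in bytes.
puts 4 * 1024**3  # 4294967296
```

Note that 200 split points give 201 regions: one below user1044, one from user9999 upward, and 199 in between. The large MAX_FILESIZE does not disable splitting outright; it raises the size at which a region becomes eligible to split, so splits should be rare during a typical load.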

> Create Split Strategy for YCSB Benchmark
> ----------------------------------------
>
>                 Key: HBASE-4163
>                 URL: https://issues.apache.org/jira/browse/HBASE-4163
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>    Affects Versions: 0.90.3, 0.92.0
>            Reporter: Nicolas Spiegelberg
>            Assignee: Lars George
>            Priority: Minor
>              Labels: benchmark
>
> Talked with Lars about how we can make it easier for users to run the YCSB benchmarks against HBase & get realistic results.  Currently, HBase is optimized for the random/uniform read/write case, which is the YCSB load.  The initial reason we perform badly when users test against us is that they do not pre-split regions & have the split ratio really low.  We need a one-line way for a user to create a table that is pre-split to 200 regions (or some decent number) by default & disable splitting.  Realistically, this is how a uniform load cluster should scale, so it's not a hack.  This will also give us a good use case to point to for how users should pre-split regions.



--
This message was sent by Atlassian JIRA
(v6.1#6144)