Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2011/08/04 02:37:28 UTC
[jira] [Created] (HBASE-4163) Create Split Strategy for YCSB Benchmark
Create Split Strategy for YCSB Benchmark
----------------------------------------
Key: HBASE-4163
URL: https://issues.apache.org/jira/browse/HBASE-4163
Project: HBase
Issue Type: Improvement
Components: util
Affects Versions: 0.90.3, 0.92.0
Reporter: Nicolas Spiegelberg
Assignee: Lars George
Priority: Minor
Talked with Lars about how we can make it easier for users to run the YCSB benchmarks against HBase & get realistic results. Currently, HBase is optimized for the random/uniform read/write case, which is the YCSB load. The initial reason we perform badly when users test against us is that they do not pre-split regions & have the split ratio set really low. We need a one-line way for a user to create a table that is pre-split to 200 regions (or some decent number) by default & has splitting disabled. Realistically, this is how a uniform-load cluster should scale, so it's not a hack. This will also give us a good use case to point to for how users should pre-split regions.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4163) Create Split Strategy for YCSB Benchmark
Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079707#comment-13079707 ]
Jean-Daniel Cryans commented on HBASE-4163:
-------------------------------------------
That's pretty clever, guys.
> Create Split Strategy for YCSB Benchmark
> ----------------------------------------
>
> Key: HBASE-4163
> URL: https://issues.apache.org/jira/browse/HBASE-4163
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 0.90.3, 0.92.0
> Reporter: Nicolas Spiegelberg
> Assignee: Lars George
> Priority: Minor
> Labels: benchmark
[jira] [Commented] (HBASE-4163) Create Split Strategy for YCSB Benchmark
Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079153#comment-13079153 ]
Nicolas Spiegelberg commented on HBASE-4163:
--------------------------------------------
My initial thought is to use the existing RegionSplitter utility. We just need to create a custom SplitAlgorithm implementation class for the YCSB key specification & tell the users to run:
{code}
bin/hbase org.apache.hadoop.hbase.util.RegionSplitter TABLE -c 200 -f FAMILY -D split.algorithm=YcsbSplit
{code}
to pre-create a table with 200 regions. To not split, we can either set hbase.hregion.max.filesize to a really high value or add a per-table split config option.
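To make the idea of a YCSB-specific split concrete, here is a minimal, standalone sketch of how evenly spaced split keys could be computed. It is not the RegionSplitter SplitAlgorithm interface and not a proposed patch. It assumes, purely for illustration, that row keys are the prefix "user" followed by a zero-padded, fixed-width decimal number; the real YCSB key layout may differ. The class name YcsbSplitSketch and the method splitKeys are hypothetical.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: NOT part of HBase and NOT the SplitAlgorithm interface.
// Assumes keys look like <prefix><zero-padded decimal of fixed width>.
public class YcsbSplitSketch {

    // Returns the numRegions - 1 boundary keys that divide the key space
    // [0, 10^digits) into numRegions ranges of (nearly) equal size.
    static List<String> splitKeys(String prefix, int digits, int numRegions) {
        if (digits < 1 || numRegions < 1) {
            throw new IllegalArgumentException("digits and numRegions must be >= 1");
        }
        BigInteger range = BigInteger.TEN.pow(digits);
        BigInteger regions = BigInteger.valueOf(numRegions);
        if (regions.compareTo(range) > 0) {
            throw new IllegalArgumentException("more regions than distinct keys");
        }
        List<String> keys = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            // i-th boundary = floor(range * i / numRegions)
            String number = range.multiply(BigInteger.valueOf(i)).divide(regions).toString();
            StringBuilder key = new StringBuilder(prefix);
            // Left-pad with zeros so lexicographic order matches numeric order.
            for (int p = number.length(); p < digits; p++) {
                key.append('0');
            }
            key.append(number);
            keys.add(key.toString());
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> keys = splitKeys("user", 4, 200);
        System.out.println(keys.size() + " split keys, first=" + keys.get(0)
                + ", last=" + keys.get(keys.size() - 1));
    }
}
```

With a 4-digit suffix and 200 regions this yields 199 boundaries, starting at user0050 and ending at user9950. The zero padding matters: without it, byte-wise key ordering would not follow numeric ordering and the regions would not receive equal shares of a uniform load.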