Posted to user@hbase.apache.org by Suman Srinivasan <su...@longtailvideo.com> on 2012/07/16 17:44:13 UTC

HBase write performance benchmarking script

Hi all,

I couldn't find anything like this, so I've put together what I hope is a fairly simple but comprehensive test script to evaluate write performance on an HBase cluster that is running Thrift:
https://gist.github.com/3085350

The script is written in Python. It requires HappyBase (sudo easy_install happybase) and a running HBase Thrift interface to the cluster (hbase-daemon.sh start thrift).
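Before running a benchmark it can help to confirm that the Thrift interface is reachable at all. The following is a minimal sketch using only the Python standard library. The function name thrift_port_open is made up for this illustration, and 9090 is assumed because it is the default port of the HBase Thrift server. The check only confirms that something accepts TCP connections on that port, not that it actually speaks Thrift.

```python
import socket


def thrift_port_open(host="localhost", port=9090, timeout=2.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration; 9090 is assumed as the
    default HBase Thrift port. Adjust it if your server uses another.
    """
    try:
        # create_connection raises OSError on refusal or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False for a host where the Thrift daemon was started, the daemon's log is the first place to look.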

This script is meant to test the write performance of an HBase cluster under various settings. It writes rows with random row keys, and you can tune the following parameters:
1. Number of write "threads" (actually processes) to run in parallel
2. Number of puts that are batched together (set this to 1 to disable batching and test raw single-put operations)
3. Total number of rows written to the cluster
4. Thrift servers to write to (you can list several if the cluster has more than one Thrift server)
5. Row key: by modifying line #34 of the script, which generates the random row key, you can make the row key closer to your application's needs
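For illustration, here is a minimal sketch of how a benchmark with these five knobs could be structured. It is not the script from the gist. The function names, the 16-character key, the 100-byte value and the column name cf:data are all assumptions made for this example, and the target table with a column family named cf is assumed to exist already. HappyBase is imported inside the worker so that the pure helpers can be used without it.

```python
import multiprocessing
import random
import string
import time


def random_row_key(length=16):
    """Return a random alphanumeric row key (replace to mimic real keys)."""
    alphabet = string.ascii_lowercase + string.digits
    key = "".join(random.choice(alphabet) for _ in range(length))
    return key.encode("ascii")


def split_rows(total_rows, workers):
    """Divide total_rows as evenly as possible among the workers."""
    base, extra = divmod(total_rows, workers)
    return [base + (1 if i < extra else 0) for i in range(workers)]


def build_jobs(hosts, table_name, total_rows, workers, batch_size):
    """Assign each worker a Thrift host (round-robin) and a row count."""
    counts = split_rows(total_rows, workers)
    return [(hosts[i % len(hosts)], table_name, n, batch_size)
            for i, n in enumerate(counts)]


def write_rows(host, table_name, num_rows, batch_size):
    """Write num_rows random rows via one Thrift server; return seconds."""
    import happybase  # lazy import: only needed when actually writing

    random.seed()  # reseed so forked workers do not share a key sequence
    connection = happybase.Connection(host)
    table = connection.table(table_name)
    value = {b"cf:data": b"x" * 100}  # assumed column and payload size
    start = time.time()
    if batch_size > 1:
        # The batch sends puts every batch_size mutations and on exit.
        with table.batch(batch_size=batch_size) as batch:
            for _ in range(num_rows):
                batch.put(random_row_key(), value)
    else:
        for _ in range(num_rows):
            table.put(random_row_key(), value)
    connection.close()
    return time.time() - start


def run_benchmark(hosts, table_name, total_rows, workers, batch_size):
    """Run all workers in parallel; return overall rows per second."""
    jobs = build_jobs(hosts, table_name, total_rows, workers, batch_size)
    start = time.time()
    with multiprocessing.Pool(workers) as pool:
        pool.starmap(write_rows, jobs)
    return total_rows / (time.time() - start)
```

Calling run_benchmark with a list of Thrift host names, a table name, the total row count, the number of processes and the batch size returns the aggregate write rate in rows per second. Processes are used rather than threads so that the Python interpreter lock does not cap client-side throughput.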

I'm fairly new to the HBase world, so if there are any major mistakes in this, feel free to share feedback or fork the code on GitHub and improve on it.

Thank you,
Suman

-- 
Suman Srinivasan

JW Player | Bits on the Run | LongTail.tv
www.longtailvideo.com