Posted to user@hbase.apache.org by Lin Ma <li...@gmail.com> on 2012/08/22 18:57:51 UTC

asking for advice to improve performance

Hi HBase masters,

I am reading from HBase in a reducer. In the reducer, the key is a student
ID and the value is a book ID (each key/value pair means the student read
that book at one time). The HBase table uses the book ID as its row key. In
the reducer, I query HBase by book ID to fetch information such as the
author, the price, and the abstract of the book. The table holds a large
volume of data, on the order of millions of records (books). My concern is
that reading from HBase will slow down the reducer, since fetching the data
involves remote I/O (a reducer may read data that belongs to a remote
region server). I am also not confident about the HBase cache hit rate,
since the access pattern for books is random (a student may read any book).

Any advice on improving performance is appreciated, including changes to
the HBase schema. Thanks.

regards,
Lin