Posted to user@hbase.apache.org by Ian Friedman <ia...@flurry.com> on 2015/02/26 21:03:30 UTC

HBase and YARN/MR

Hi all,

We're currently moving to Hadoop 2 (years behind, I know) and debating how to handle job resource management with YARN, given that nearly 100% of our jobs are maps over HBase tables and a large portion also reduce to HBase. YARN adequately handles the resources of the machine its tasks run on, but for TableMapper jobs the resources consumed are actually on the remote RegionServer, which YARN doesn't seem to be able to account for. On Hadoop 1 we implemented things such as per-job concurrent task limits to help deal with this, but that seems hard to do in Hadoop 2. Does anyone have best practices or ideas on how to deal with an all-HBase workload that is heavily I/O bound and RegionServer memory/RPC bound? Thanks in advance!

--Ian