Posted to user@hbase.apache.org by Thomas Kwan <th...@manage.com> on 2014/07/29 02:34:06 UTC

Get operation slows in MR job

Hi there,

I have an MR job that does a Get and then a Put against an HBase table.
For the write operation, I am using TableOutputFormat (similar to the map
function in
https://github.com/larsgeorge/hbase-book/blob/master/ch07/src/main/java/mapreduce/ImportFromFile.java
).
The writes are pretty fast, about 200K requests/sec.

But the read operations are slow, around 2K requests/sec. Does anyone have
recommendations on how to improve read performance? I am already using
batched Gets.
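
For reference, "batched" here means grouping row keys into chunks and issuing
one multi-Get per chunk instead of one RPC per row. Below is a minimal sketch
of the chunking step only, in plain Java with no HBase dependency. The class
name GetBatcher, the method partition, and the batch size of 100 are
hypothetical names and values chosen for illustration. In a real job each
chunk would be turned into a List<Get> and passed to the table's
get(List<Get>) method in a single call; that call is left out here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of HBase, shown only to illustrate batching.
class GetBatcher {
    /** Splits rowKeys into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> rowKeys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < rowKeys.size(); i += batchSize) {
            int end = Math.min(i + batchSize, rowKeys.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(rowKeys.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            keys.add("row-" + i); // made-up row keys
        }
        // Each batch would become one get(List<Get>) call against the table.
        for (List<String> batch : partition(keys, 100)) {
            System.out.println("batch of " + batch.size() + " keys");
        }
    }
}
```

The batch size is a tuning knob: larger batches mean fewer round trips but
bigger responses, so the right value depends on row size and cluster setup.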

thanks in advance

Re: Get operation slows in MR job

Posted by anil gupta <an...@gmail.com>.
Inline..

On Mon, Jul 28, 2014 at 5:34 PM, Thomas Kwan <th...@manage.com> wrote:

> Hi there,
>
> I have a MR job that does Get and then Put operation to a Hbase table.
> For the write operation, I am using TableOutputFormat (like to the map
> function in
>
> https://github.com/larsgeorge/hbase-book/blob/master/ch07/src/main/java/mapreduce/ImportFromFile.java
> ).
> The write is pretty fast, 200K requests/sec.
>
Are you getting 200K writes/sec per node, or across the cluster?

>
> But the read operations are slow, like 2K requests/sec. I wonder if anyone
> has recommendation on how to improve read operations. I am using batched
> Gets already.
>
Read latency depends on a lot of factors. Some of the important ones are:
1. Did you enable short-circuit reads? If yes, what is the block locality index?
2. What is your block cache ratio?
3. You can also tweak the block size in HBase (I think the default is 64KB).
The smaller the block size, the faster the lookup, but it will take more memory.
4. Set caching on the client side.
5. In some use cases, enabling compression also helps, especially if your
disk I/O is too high.
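
To make the memory trade-off in point 3 concrete, here is a minimal sketch in
plain Java (no HBase dependency). It assumes roughly one block index entry per
data block and a hypothetical 10 GiB of store file data. The class and method
names are made up for illustration, and real blocks only approximate the
configured size, so treat the numbers as rough estimates.

```java
// Hypothetical helper: estimates how many data blocks (and so, roughly,
// how many block index entries) a given amount of data needs.
class BlockCountSketch {
    /** Number of blocks needed to hold dataBytes at blockSizeBytes, rounded up. */
    static long blockCount(long dataBytes, long blockSizeBytes) {
        if (blockSizeBytes <= 0) {
            throw new IllegalArgumentException("block size must be positive");
        }
        return (dataBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long dataBytes = 10L * 1024 * 1024 * 1024; // assumed 10 GiB of data
        for (long blockKb : new long[] {64, 16}) {
            long blocks = blockCount(dataBytes, blockKb * 1024);
            System.out.println(blockKb + " KB blocks -> about " + blocks + " index entries");
        }
    }
}
```

Going from 64KB to 16KB blocks quadruples the number of entries (163840 versus
655360 in this example), which is where the extra memory goes, while each
random read has less data to load and search within a block.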

>
> thanks in advance
>



-- 
Thanks & Regards,
Anil Gupta

Re: Get operation slows in MR job

Posted by Ravindra <ra...@gmail.com>.
Hi Thomas,

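Setting these on the Scan that feeds the MR job could result in some
performance improvement:

Scan scan = new Scan();
scan.setCaching(500);        // rows fetched per scanner RPC
scan.setCacheBlocks(false);  // keep a full scan from churning the block cache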

Regards,
Ravindra

