Posted to user@hbase.apache.org by Lu Dillon <lu...@hotmail.com> on 2018/08/20 10:11:35 UTC

HBase logs show that compaction performance isn't good. Any suggestions for finding the reason?

Hi All,


I'm new to HBase. When I use YCSB to load data into HBase, compaction behaves differently on my two clusters.

DEV: 6 nodes. A and B are the master and standby for HDFS and HBase. C/D/E/F are the HDFS datanodes and regionservers.

PROD: 8 nodes. A and B are the master and standby for HDFS and HBase. C/D/E/F/G/H are the HDFS datanodes and regionservers.

Network: 1000 Mb/s

Hadoop: 2.7.3

HBase: 1.2.6

JDK: 1.8

ZooKeeper: installed on C/D/E



In DEV, I notice that compacting 4 GB of files takes about 30 seconds; however, it takes about 30 seconds to process only 400 MB of files in PROD.

For example, this is a log line from one node in DEV:

2018-08-10 16:30:47,535 INFO  [regionserver/node103/172.28.200.103:16020-shortCompactions-1533803656579] regionserver.HStore: Completed compaction of 6 file(s) in family of usertable_nosplit,,1533878791857.be420543a609c32475386855dee93a82. into abe3a5c0dbbd4708ae2523f9b4a53ad8(size=1.4 G), total size for store is 4.0 G. This selection was in queue for 0sec, and took 18sec to execute.


This is from PROD:

2018-08-17 23:29:28,751 INFO  [regionserver/YJB-HADOOP-74-23/192.168.74.23:16020-shortCompactions-1534519629607] regionserver.HStore: Completed compaction of 3 (all) file(s) in family of usertable_nosplit,,1534518587365.95595277af1c56260b85de113c9687f1. into 934669bf6a8a407aa6ee385439c4a7b5(size=430.8 M), total size for store is 430.8 M. This selection was in queue for 0sec, and took 30sec to execute.


I thought this might be a GC problem, so I switched to G1GC following this article: "https://product.hubspot.com/blog/g1gc-tuning-your-hbase-cluster" . Performance improved a little, but it is still not good.


Are there any suggestions for finding out what causes the problem and how to improve the performance?


BTW: I'm not sure what kind of information would help, so I have attached my hbase-site.xml and hbase-env.sh to start with.

BTW2: To use G1GC on the regionservers, I created a separate hbase-env.sh for the regionservers.


Thanks,

Dillon

Re: HBase logs show that compaction performance isn't good. Any suggestions for finding the reason?

Posted by Stack <st...@duboce.net>.
Hello Lu:

Any other environmental or hardware differences between dev and prod? E.g.
is the compaction the only thing running on prod? Are there adjacent
processes that might be using up i/o or network? Are the machines the same
in prod as in dev?

Also, you compare one compaction on dev to another on prod. If you compare
a larger sample set, do you see the phenomenon still or is it this single
compaction only?
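
One way to look at a larger sample is to pull every "Completed compaction"
line out of the regionserver logs on each cluster and compare throughput. The
following is only a minimal Python sketch, assuming the log format matches the
two lines quoted in this thread. The function names are made up for
illustration, and the rate uses the output file size (the size= field), which
is a rough proxy since the log line does not report the input size.

```python
import re
from statistics import median

# Pattern for the HStore "Completed compaction" line as it appears in this
# thread (HBase 1.2.6). Other versions may word the line differently.
LINE_RE = re.compile(
    r"Completed compaction of (?P<files>\d+) (?:\(all\) )?file\(s\).*?"
    r"\(size=(?P<size>[\d.]+) (?P<unit>[KMGT])\).*?"
    r"took (?P<secs>\d+)sec to execute"
)

# Assumption: the size suffixes are binary units (1 G = 1024 M).
UNIT_TO_MB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0, "T": 1024.0 * 1024.0}


def compaction_rates(lines):
    """Return the output MB/s for every completed-compaction line found."""
    rates = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a completed-compaction line
        secs = int(m.group("secs"))
        if secs == 0:
            continue  # sub-second compactions give no usable rate
        mb = float(m.group("size")) * UNIT_TO_MB[m.group("unit")]
        rates.append(mb / secs)
    return rates


def summarize(lines):
    """Return count/median/min/max of the rates, or None if nothing matched."""
    rates = compaction_rates(lines)
    if not rates:
        return None
    return {
        "count": len(rates),
        "median_mb_s": median(rates),
        "min_mb_s": min(rates),
        "max_mb_s": max(rates),
    }


# The two log lines quoted in this thread, used as sample input.
DEV_LINE = (
    "2018-08-10 16:30:47,535 INFO  [regionserver/node103/172.28.200.103:16020-"
    "shortCompactions-1533803656579] regionserver.HStore: Completed compaction "
    "of 6 file(s) in family of usertable_nosplit,,1533878791857."
    "be420543a609c32475386855dee93a82. into abe3a5c0dbbd4708ae2523f9b4a53ad8"
    "(size=1.4 G), total size for store is 4.0 G. This selection was in queue "
    "for 0sec, and took 18sec to execute."
)
PROD_LINE = (
    "2018-08-17 23:29:28,751 INFO  [regionserver/YJB-HADOOP-74-23/192.168.74.23:"
    "16020-shortCompactions-1534519629607] regionserver.HStore: Completed "
    "compaction of 3 (all) file(s) in family of usertable_nosplit,,1534518587365."
    "95595277af1c56260b85de113c9687f1. into 934669bf6a8a407aa6ee385439c4a7b5"
    "(size=430.8 M), total size for store is 430.8 M. This selection was in "
    "queue for 0sec, and took 30sec to execute."
)

if __name__ == "__main__":
    print("DEV :", summarize([DEV_LINE]))
    print("PROD:", summarize([PROD_LINE]))
```

On the two lines quoted in this thread it gives roughly 80 MB/s for DEV and
roughly 14 MB/s for PROD. Feeding it a whole regionserver log from each
cluster (for example, summarize(open(path)) with your own log path) would show
whether that gap holds across many compactions or only for this one.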

There is a compaction tool that you might run on both clusters to more
directly compare compaction throughputs:
http://hbase.apache.org/book.html#compaction.tool

S
