Posted to user@hbase.apache.org by Marcell Ortutay <mo...@23andme.com.INVALID> on 2018/06/28 19:55:03 UTC

How to avoid major compaction during restart?

Hi all,

I'm interested in ways to avoid a major compaction when restarting all the
HBase region servers in a cluster (for example, for a version upgrade). Are
there any recommended techniques for achieving this?

Thanks,
Marcell

Re: How to avoid major compaction during restart?

Posted by Stack <st...@duboce.net>.

Would disabling the hbase balancer soon after startup work for you?
S
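
A minimal sketch of that suggestion, assuming the `hbase` launcher is on the PATH and that its shell accepts `-n` for non-interactive use. `balance_switch` is the HBase shell command that toggles the balancer; the two Python helper names are made up for illustration.

```python
import subprocess


def balancer_command(enabled: bool) -> str:
    """Build the HBase shell command that turns the region balancer on or off."""
    return f"balance_switch {'true' if enabled else 'false'}\n"


def run_in_hbase_shell(script: str) -> str:
    """Pipe a script into `hbase shell -n` and return whatever it prints.

    Assumption: `hbase` is on the PATH and supports the -n (non-interactive) flag.
    """
    result = subprocess.run(
        ["hbase", "shell", "-n"],
        input=script,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

One way to use it: call `run_in_hbase_shell(balancer_command(False))` before the rolling restart, then `run_in_hbase_shell(balancer_command(True))` once every region server is back up.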


Re: How to avoid major compaction during restart?

Posted by rahul gidwani <ra...@gmail.com>.

There was an issue where HBase regionservers had a higher likelihood of major
compacting on startup: if you started up a lot of clusters at once that
hadn't major compacted their files, the chore would not apply jitter and
would compact everything right away (though only if a compaction was necessary).

https://issues.apache.org/jira/browse/HBASE-17912
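
To make "jitter" concrete, here is a rough sketch of the idea rather than HBase's actual code: each store's next major compaction is scheduled at the base period plus or minus a random fraction, so servers that start together do not all compact together. The function name and the values used in the usage note are illustrative assumptions.

```python
import random


def next_major_compaction_delay_ms(period_ms: int, jitter_pct: float,
                                   rng: random.Random) -> int:
    """Return a delay drawn from [period - jitter, period + jitter].

    Illustrative only; this is not copied from HBase.
    """
    jitter = round(period_ms * jitter_pct)
    # rng.random() is in [0, 1), so the offset falls between -jitter and +jitter.
    return period_ms + jitter - round(2 * jitter * rng.random())
```

For example, with a 7-day period (604800000 ms) and a jitter fraction of 0.5, the delays are spread between 3.5 and 10.5 days. Without jitter, every store would come due at the same moment.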



Re: How to avoid major compaction during restart?

Posted by Mingliang LIU <li...@apache.org>.

Marcell,

On the Hadoop side, the NameNode (NN) will not schedule block re-replication
unless the DataNode (DN) has been declared "dead". By default that interval
is more than 10 minutes, so usually your DN should have restarted before the
NN marks it "dead". If that is still a concern, you can lengthen the interval
indirectly via the configurations "dfs.namenode.heartbeat.recheck-interval"
and "dfs.heartbeat.interval". The interval is calculated following this code:
<https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L290>
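
As a worked example of that calculation, here is a small sketch of the formula in the linked DatanodeManager code: twice the recheck interval plus ten heartbeat intervals. The function name is made up for illustration. The default arguments are the stock HDFS defaults as I understand them (300000 ms and 3 s), so check them against your own hdfs-site.xml.

```python
def dead_node_interval_ms(recheck_interval_ms: int = 300_000,
                          heartbeat_interval_s: int = 3) -> int:
    """Time without heartbeats after which the NameNode treats a DataNode as dead.

    recheck_interval_ms  -> dfs.namenode.heartbeat.recheck-interval (milliseconds)
    heartbeat_interval_s -> dfs.heartbeat.interval (seconds)
    """
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s
```

With the defaults this gives 630000 ms, or 10.5 minutes. Raising the recheck interval to 900000 ms (15 minutes) would stretch it to 30.5 minutes.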

Thanks,


Re: How to avoid major compaction during restart?

Posted by Marcell Ortutay <mo...@23andme.com.INVALID>.

Er, I made a mistake in the above question; the issue is not so much
the major compaction but rather that during restart (as nodes go up /
down), Hadoop and HBase attempt to rebalance blocks and regions, causing
unnecessary movement. So what I'm actually looking for is a way to avoid
the balancing for the duration of the restart, which would avoid the need
for major compaction afterwards.

Marcell
