Posted to hdfs-user@hadoop.apache.org by Patai Sangbutsarakum <si...@gmail.com> on 2013/01/07 21:48:18 UTC

balancer and under replication

Hello Hadoopers,

Currently my production cluster, which is running cdh3u4, shows
Number of Under-Replicated Blocks at around 1k blocks.
Even though we run the balancer every night, the number of
under-replicated blocks never goes down.
My question is how HDFS handles under-replicated blocks.
- Will the namenode take care of them when a file that has under-replicated
blocks is being used?
or
- Do we need to bump up setrep to trigger re-replication of the blocks?
or
- ??

Thanks
-P

Re: balancer and under replication

Posted by Patai Sangbutsarakum <si...@gmail.com>.

Thanks Harsh, you're the first, as usual.
Currently, there are 291 active nodes spread across 11 racks. We have had
rack-awareness enabled for a year (it works like a champ).

I ran fsck across HDFS again and noticed that the majority of the
files that have under-replicated blocks
are set to have 255 replicas but actually have only about 202 replicas.

The maximum I saw was a file that is set for 255 replicas
and has 217 replicas.

Another 4-5 files are complaining "Replica placement policy is
violated for blk_-6514473498006714669_245767971. Block should be
additionally replicated on 1 more rack(s)."

I tried the setrep +1, -1 trick. The NN still didn't try to raise the number
of replicas this time.

So, the question would be: with 291 nodes in 11 racks, is it feasible
to have 255 copies while complying with rack awareness?

Hope this makes sense
-P
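
One way to reason about the 255-replica question is a small arithmetic sketch. It assumes that the default block placement policy caps the replicas of a block per rack at (replicas - 1) / racks + 2; treat that formula as an assumption and check it against the source of the Hadoop version in use. The rack sizes below are invented (they only sum to 291 nodes over 11 racks) and are not taken from the cluster in this thread.

```python
from typing import List


def max_replicas_per_rack(total_replicas: int, num_racks: int) -> int:
    # Assumed per-rack cap of the default placement policy:
    # (replicas - 1) / racks + 2, using integer division.
    return (total_replicas - 1) // num_racks + 2


def placeable_replicas(target: int, rack_sizes: List[int]) -> int:
    # Each rack can hold at most one replica per node, and at most
    # `cap` replicas of the same block under the assumed policy.
    cap = max_replicas_per_rack(target, len(rack_sizes))
    return min(target, sum(min(size, cap) for size in rack_sizes))


# Hypothetical, uneven layout: 291 nodes in 11 racks.
rack_sizes = [40, 40, 40, 40, 40, 30, 20, 15, 10, 10, 6]

print("per-rack cap:", max_replicas_per_rack(255, len(rack_sizes)))
print("placeable replicas:", placeable_replicas(255, rack_sizes))
```

With evenly sized racks the same cap would allow all 255 replicas, so under this assumption the answer depends on how the nodes are distributed across racks rather than on the node count alone.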

On Mon, Jan 7, 2013 at 1:16 PM, Harsh J <ha...@cloudera.com> wrote:
> Under normal operation, NN takes care of under-replicated blocks by itself.
>
> A file with a replication factor set higher than the cluster's nodes
> will also register its blocks as under-replicated. A common config
> mistake here is the mapred.submit.replication, which is a default of
> 10 (useful for 100 nodes but not otherwise), and you can verify via
> fsck if these affected files are all staging directory MR files which
> fall under this category. If so, just lowering their setrep will help.
>
> If not the above, there's a chance that a rack misconfig may have
> caused a bad state of replication (a violation of policy), which can
> be fixed by the raise and subsequent lowering of the replication
> factor as you state.
>
> On Tue, Jan 8, 2013 at 2:18 AM, Patai Sangbutsarakum
> <si...@gmail.com> wrote:
>> Hello Hadoopers,
>>
>> Currently my production cluster which is running cdh3u4 has shown
>> Number of Under-Replicated Blocks around 1k blocks.
>> Even though we have balancer run every night somehow the number of
>> under replicate is never go down at all.
>> The question is how HDFS handles under-replication blocks.
>> - will namenode takes care when file that has under-replicated blocks
>> is being used ?
>> or
>> - we need to bump up setrep to kind of trigger the number of replication block ?
>> or
>> - ??
>>
>> Thanks
>> -P
>
>
>
> --
> Harsh J

Re: balancer and under replication

Posted by al...@aim.com.

Are you sure the balancer does anything? I have about 500 missing replicas and 60 under-replicated blocks, and when I start the balancer it does not do anything.
The balancer outputs two lines:

 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 0 over utilized nodes:
 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 0 under utilized nodes:

and shuts down with no errors.

Thanks.
Alex.
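
For context on the two log lines above: the balancer evens out disk utilization between datanodes and does not look at replica counts. A minimal sketch of that classification follows, assuming the usual rule that a node counts as over- or under-utilized only when its utilization differs from the cluster average by more than a threshold (10 percent by default). The node names and byte counts are made up for illustration.

```python
def classify(nodes, threshold_pct=10.0):
    """nodes maps a node name to (used_bytes, capacity_bytes)."""
    total_used = sum(used for used, _ in nodes.values())
    total_cap = sum(cap for _, cap in nodes.values())
    avg = 100.0 * total_used / total_cap  # cluster-wide utilization in percent

    over, under = [], []
    for name, (used, cap) in nodes.items():
        util = 100.0 * used / cap
        if util > avg + threshold_pct:
            over.append(name)
        elif util < avg - threshold_pct:
            under.append(name)
    return over, under


# Hypothetical cluster where every node is close to the average.
nodes = {"dn1": (55, 100), "dn2": (60, 100), "dn3": (52, 100)}
over, under = classify(nodes)
print(f"{len(over)} over utilized nodes, {len(under)} under utilized nodes")
```

When every node sits within the threshold, both lists are empty and there is nothing to move, regardless of how many blocks are under-replicated.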

-----Original Message-----
From: Harsh J <ha...@cloudera.com>
To: <us...@hadoop.apache.org> <us...@hadoop.apache.org>
Sent: Mon, Jan 7, 2013 1:17 pm
Subject: Re: balancer and under replication

Under normal operation, NN takes care of under-replicated blocks by itself.

A file with a replication factor set higher than the cluster's nodes
will also register its blocks as under-replicated. A common config
mistake here is the mapred.submit.replication, which is a default of
10 (useful for 100 nodes but not otherwise), and you can verify via
fsck if these affected files are all staging directory MR files which
fall under this category. If so, just lowering their setrep will help.

If not the above, there's a chance that a rack misconfig may have
caused a bad state of replication (a violation of policy), which can
be fixed by the raise and subsequent lowering of the replication
factor as you state.

On Tue, Jan 8, 2013 at 2:18 AM, Patai Sangbutsarakum
<si...@gmail.com> wrote:
> Hello Hadoopers,
>
> Currently my production cluster which is running cdh3u4 has shown
> Number of Under-Replicated Blocks around 1k blocks.
> Even though we have balancer run every night somehow the number of
> under replicate is never go down at all.
> The question is how HDFS handles under-replication blocks.
> - will namenode takes care when file that has under-replicated blocks
> is being used ?
> or
> - we need to bump up setrep to kind of trigger the number of replication block ?
> or
> - ??
>
> Thanks
> -P

-- 
Harsh J

Re: balancer and under replication

Posted by Harsh J <ha...@cloudera.com>.

Under normal operation, NN takes care of under-replicated blocks by itself.

A file with a replication factor set higher than the number of nodes in
the cluster will also register its blocks as under-replicated. A common
config mistake here is mapred.submit.replication, which defaults to
10 (useful for 100 nodes but not otherwise). You can verify via
fsck whether the affected files are all MR staging directory files, which
fall under this category. If so, just lowering their replication with
setrep will help.

If not the above, there's a chance that a rack misconfig may have
caused a bad state of replication (a violation of policy), which can
be fixed by raising and then lowering the replication
factor, as you state.
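
The fsck check described above can be sketched roughly as follows. This assumes fsck prints one line per under-replicated block in the form "<path>:  Under replicated <block>. Target Replicas is N but found M replica(s)."; verify that against the output of the version in use. The sample paths and block IDs are made up, and the setrep commands are only printed, not executed.

```python
import re
from collections import defaultdict

# Assumed shape of an fsck "Under replicated" line.
LINE_RE = re.compile(
    r"^(?P<path>\S+):\s+Under replicated (?P<block>\S+)\. "
    r"Target Replicas is (?P<target>\d+) but found (?P<found>\d+) replica\(s\)\."
)


def summarize(fsck_text):
    """Map each target replication factor to the set of affected paths."""
    by_target = defaultdict(set)
    for line in fsck_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            by_target[int(m.group("target"))].add(m.group("path"))
    return dict(by_target)


def setrep_commands(by_target, new_rep):
    """Build setrep commands for files whose target exceeds new_rep."""
    cmds = []
    for target, paths in sorted(by_target.items()):
        if target > new_rep:
            for path in sorted(paths):
                cmds.append(f"hadoop fs -setrep {new_rep} {path}")
    return cmds


# Hypothetical fsck excerpt.
SAMPLE = """\
/tmp/staging/job_0001/job.jar:  Under replicated blk_111_1. Target Replicas is 10 but found 3 replica(s).
/tmp/staging/job_0001/job.split:  Under replicated blk_222_2. Target Replicas is 10 but found 3 replica(s).
/data/big/lookup.dat:  Under replicated blk_-333_3. Target Replicas is 255 but found 202 replica(s).
"""

by_target = summarize(SAMPLE)
for cmd in setrep_commands(by_target, new_rep=3):
    print(cmd)
```

Grouping by target replication makes it easy to see whether the affected files cluster around 10 (the job submission default) or some other value before changing anything.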

On Tue, Jan 8, 2013 at 2:18 AM, Patai Sangbutsarakum
<si...@gmail.com> wrote:
> Hello Hadoopers,
>
> Currently my production cluster which is running cdh3u4 has shown
> Number of Under-Replicated Blocks around 1k blocks.
> Even though we have balancer run every night somehow the number of
> under replicate is never go down at all.
> The question is how HDFS handles under-replication blocks.
> - will namenode takes care when file that has under-replicated blocks
> is being used ?
> or
> - we need to bump up setrep to kind of trigger the number of replication block ?
> or
> - ??
>
> Thanks
> -P

-- 
Harsh J
