Posted to mapreduce-user@hadoop.apache.org by Azuryy Yu <az...@gmail.com> on 2013/07/08 05:40:30 UTC

Can I move block data directly?

Hi all,

Some data nodes in my cluster are unbalanced; a few have reached more
than 95% disk usage.

Can I move some block data directly from one node to another?

For example, from n1 to n2:

1) scp /data/xxxx/blk_*   n2:/data/subdir11/
2) rm -rf /data/xxxx/blk_*
3) hadoop-daemon.sh stop datanode (on n1)
4) hadoop-daemon.sh start datanode (on n1)
5) hadoop-daemon.sh stop datanode (on n2)
6) hadoop-daemon.sh start datanode (on n2)

Am I right? Thanks for any input.

Re: Can I move block data directly?

Posted by Azuryy Yu <az...@gmail.com>.
Yeah. I got it. Thanks Harsh.


On Mon, Jul 8, 2013 at 3:10 PM, Harsh J <ha...@cloudera.com> wrote:

> Yeah, you're right. I only meant that the blk_* files should be owned by
> the same user as the DN daemon, for consistency more than anything else.
>
> On Mon, Jul 8, 2013 at 11:46 AM, Azuryy Yu <az...@gmail.com> wrote:
> > bq. I'd also ensure the ownership of the block files is intact.
> > Hi Harsh,
> >
> > What does "ensure the ownership of the block files is intact" mean?
> >
> > I also want to ask more.
> >
> > As I understand it, after I restart the data node daemon, the block
> > report should tell the NN about all blocks owned by this DN, and the
> > block scanner can remember the structure of all local blocks, so the
> > block file ownership would be confirmed during startup.
> >
> > And even if some blk_ files are lost, the NN can find that the blocks
> > are under-replicated. Am I right? Thanks.
> >
> > On Mon, Jul 8, 2013 at 2:07 PM, Azuryy Yu <az...@gmail.com> wrote:
> >>
> >> Thanks Harsh, always detailed answers each time.
> >>
> >> Yes, this is an unsupported scenario. The balancer is very slow even
> >> after I set bandwidthPerSec to a large value, so I want to take this
> >> route to solve the problem quickly.
> >>
> >> On Mon, Jul 8, 2013 at 1:46 PM, Viral Bajaria <vi...@gmail.com>
> >> wrote:
> >>>
> >>> Out of curiosity, besides bandwidthPerSec and threshold, what other
> >>> parameters are tunable?
> >>>
> >>> Thanks,
> >>> Viral
> >>>
> >>> On Sun, Jul 7, 2013 at 10:39 PM, Harsh J <ha...@cloudera.com> wrote:
> >>>>
> >>>> If the balancer isn't cutting it for you with stock defaults, you
> >>>> should consider tuning it rather than resorting to these unsupported
> >>>> approaches.
>
> --
> Harsh J
>

Re: Can I move block data directly?

Posted by Harsh J <ha...@cloudera.com>.
Yeah, you're right. I only meant that the blk_* files should be owned by
the same user as the DN daemon, for consistency more than anything else.
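
As a rough illustration of that ownership check, here is a minimal sketch.
The helper name list_foreign_blocks, the example directory, and the "hdfs"
user are assumptions for illustration; it relies only on the standard
-name and -user tests of find.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list block files under a DataNode data directory
# that are NOT owned by the user running the DataNode daemon.
# Both arguments are assumptions; substitute your own directory and user.
list_foreign_blocks() {
  local data_dir="$1" dn_user="$2"
  # blk_* matches both the block files and their blk_*.meta checksum files.
  find "$data_dir" -type f -name 'blk_*' ! -user "$dn_user" -print
}

# Example (no output means every block file is owned by "hdfs"):
#   list_foreign_blocks /data/subdir11 hdfs
```

Any path it prints would be a candidate for chown before the DN is started.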

On Mon, Jul 8, 2013 at 11:46 AM, Azuryy Yu <az...@gmail.com> wrote:
> bq. I'd also ensure the ownership of the block files is intact.
> Hi Harsh,
>
> What does "ensure the ownership of the block files is intact" mean?
>
> I also want to ask more.
>
> As I understand it, after I restart the data node daemon, the block report
> should tell the NN about all blocks owned by this DN, and the block scanner
> can remember the structure of all local blocks, so the block file ownership
> would be confirmed during startup.
>
> And even if some blk_ files are lost, the NN can find that the blocks are
> under-replicated. Am I right? Thanks.
>
> On Mon, Jul 8, 2013 at 2:07 PM, Azuryy Yu <az...@gmail.com> wrote:
>>
>> Thanks Harsh, always detailed answers each time.
>>
>> Yes, this is an unsupported scenario. The balancer is very slow even
>> after I set bandwidthPerSec to a large value, so I want to take this
>> route to solve the problem quickly.
>>
>> On Mon, Jul 8, 2013 at 1:46 PM, Viral Bajaria <vi...@gmail.com>
>> wrote:
>>>
>>> Out of curiosity, besides bandwidthPerSec and threshold, what other
>>> parameters are tunable?
>>>
>>> Thanks,
>>> Viral
>>>
>>> On Sun, Jul 7, 2013 at 10:39 PM, Harsh J <ha...@cloudera.com> wrote:
>>>>
>>>> If the balancer isn't cutting it for you with stock defaults, you
>>>> should consider tuning it rather than resorting to these unsupported
>>>> approaches.
>>>
>>>
>>>
>>
>



-- 
Harsh J

Re: Can I move block data directly?

Posted by Azuryy Yu <az...@gmail.com>.
bq. I'd also ensure the ownership of the block files is intact.
Hi Harsh,

What does "ensure the ownership of the block files is intact" mean?

I also want to ask more.

As I understand it, after I restart the data node daemon, the block report
should tell the NN about all blocks owned by this DN, and the block scanner
can remember the structure of all local blocks, so the block file ownership
would be confirmed during startup.

And even if some blk_ files are lost, the NN can find that the blocks are
under-replicated. Am I right? Thanks.
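
One way to check the under-replication point after a restart is to read the
fsck summary. The sketch below is only an assumption-laden illustration: the
helper name under_replicated_count is made up, and it assumes the summary
contains a line of the form "Under-replicated blocks: N (...)", whose exact
wording can differ between Hadoop versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the under-replicated block count from
# fsck summary text supplied on stdin.
# Assumes a summary line like "Under-replicated blocks: N (...)".
under_replicated_count() {
  awk -F: '/Under-replicated blocks/ { split($2, parts, " "); print parts[1]; exit }'
}

# Example (requires a running cluster and the hdfs CLI):
#   hdfs fsck / | under_replicated_count
```

A non-zero count shortly after the restart would be expected to drop as the
NN schedules re-replication.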




On Mon, Jul 8, 2013 at 2:07 PM, Azuryy Yu <az...@gmail.com> wrote:

> Thanks Harsh, always detailed answers each time.
>
> Yes, this is an unsupported scenario. The balancer is very slow even
> after I set bandwidthPerSec to a large value, so I want to take this route
> to solve the problem quickly.
>
> On Mon, Jul 8, 2013 at 1:46 PM, Viral Bajaria <vi...@gmail.com> wrote:
>
>> Out of curiosity, besides bandwidthPerSec and threshold, what other
>> parameters are tunable?
>>
>> Thanks,
>> Viral
>>
>> On Sun, Jul 7, 2013 at 10:39 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>>> If the balancer isn't cutting it for you with stock defaults, you
>>> should consider tuning it rather than resorting to these unsupported
>>> approaches.
>>>
>>
>>
>>
>

Re: Can I move block data directly?

Posted by Azuryy Yu <az...@gmail.com>.
Thanks Harsh, always detailed answers each time.

Yes, this is an unsupported scenario. The balancer is very slow even
after I set bandwidthPerSec to a large value, so I want to take this route
to solve the problem quickly.






On Mon, Jul 8, 2013 at 1:46 PM, Viral Bajaria <vi...@gmail.com> wrote:

> Out of curiosity, besides bandwidthPerSec and threshold, what other
> parameters are tunable?
>
> Thanks,
> Viral
>
> On Sun, Jul 7, 2013 at 10:39 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> If the balancer isn't cutting it for you with stock defaults, you
>> should consider tuning it rather than resorting to these unsupported
>> approaches.
>>
>
>
>

Re: Can I move block data directly?

Posted by Harsh J <ha...@cloudera.com>.
Viral,

There is an advanced property
dfs.namenode.replication.work.multiplier.per.iteration and a few
others that are also relevant. See the listing at
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
for more details.
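
For the two knobs already mentioned in this thread (bandwidthPerSec and
threshold), a minimal sketch of how they might be applied from the command
line follows. It assumes a Hadoop 2.x style hdfs CLI; the helper names and
the 100 MB/s and 5% values are placeholders, not recommendations. The script
only prints the commands, it does not run them.

```shell
#!/usr/bin/env bash
# Convert a bandwidth given in MB/s to the bytes-per-second value that
# dfs.datanode.balance.bandwidthPerSec and -setBalancerBandwidth expect.
mb_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Print (not run) the tuning commands for a given bandwidth and threshold.
# Both helper names are hypothetical; the hdfs subcommands are the real ones.
print_balancer_commands() {
  local mb_per_sec="$1" threshold_pct="$2"
  echo "hdfs dfsadmin -setBalancerBandwidth $(mb_to_bytes "$mb_per_sec")"
  echo "hdfs balancer -threshold $threshold_pct"
}

print_balancer_commands 100 5
```

Properties such as dfs.namenode.replication.work.multiplier.per.iteration
are set in hdfs-site.xml instead and are not covered by this sketch.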

On Mon, Jul 8, 2013 at 11:16 AM, Viral Bajaria <vi...@gmail.com> wrote:
> Out of curiosity, besides bandwidthPerSec and threshold, what other
> parameters are tunable?
>
> Thanks,
> Viral
>
> On Sun, Jul 7, 2013 at 10:39 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>> If the balancer isn't cutting it for you with stock defaults, you
>> should consider tuning it rather than resorting to these unsupported
>> approaches.
>
>
>



-- 
Harsh J

Re: Can I move block data directly?

Posted by Viral Bajaria <vi...@gmail.com>.
Out of curiosity, besides bandwidthPerSec and threshold, what other
parameters are tunable?

Thanks,
Viral

On Sun, Jul 7, 2013 at 10:39 PM, Harsh J <ha...@cloudera.com> wrote:

> If the balancer isn't cutting it for you with stock defaults, you
> should consider tuning it rather than resorting to these unsupported
> approaches.
>

Re: Can I move block data directly?

Posted by Harsh J <ha...@cloudera.com>.
Yes, you could do this, but I'd move steps 3 and 5 up to be steps 1 and 2,
so that both DataNodes are stopped before any files are touched.
I'd also ensure the ownership of the block files is intact.

If the balancer isn't cutting it for you with stock defaults, you
should consider tuning it rather than resorting to these unsupported
approaches.
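
The reordering described above can be sketched as a dry run. This is only an
illustration under assumptions: the function name print_move_plan and the
"hdfs" daemon user are made up, the paths are the placeholders from the
original question, and the script prints the steps rather than executing
anything.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands in the suggested order; runs nothing.
# SRC_DIR and DST_DIR are the placeholder paths from the question.
SRC_DIR="/data/xxxx"
DST_DIR="/data/subdir11"
DN_USER="hdfs"   # assumed DataNode daemon user

print_move_plan() {
  local src_host="$1" dst_host="$2"
  # Stop both DataNodes first, then copy, fix ownership, delete, restart.
  cat <<EOF
[$src_host] hadoop-daemon.sh stop datanode
[$dst_host] hadoop-daemon.sh stop datanode
[$src_host] scp -p $SRC_DIR/blk_* $dst_host:$DST_DIR/
[$dst_host] chown $DN_USER $DST_DIR/blk_*
[$src_host] rm -f $SRC_DIR/blk_*
[$src_host] hadoop-daemon.sh start datanode
[$dst_host] hadoop-daemon.sh start datanode
EOF
}

print_move_plan n1 n2
```

The restart at the end is what triggers the block reports that tell the NN
where the moved blocks now live.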

On Mon, Jul 8, 2013 at 9:10 AM, Azuryy Yu <az...@gmail.com> wrote:
> Hi all,
>
> Some data nodes in my cluster are unbalanced; a few have reached more
> than 95% disk usage.
>
> Can I move some block data directly from one node to another?
>
> For example, from n1 to n2:
>
> 1) scp /data/xxxx/blk_*   n2:/data/subdir11/
> 2) rm -rf /data/xxxx/blk_*
> 3) hadoop-daemon.sh stop datanode (on n1)
> 4) hadoop-daemon.sh start datanode (on n1)
> 5) hadoop-daemon.sh stop datanode (on n2)
> 6) hadoop-daemon.sh start datanode (on n2)
>
> Am I right? Thanks for any input.
>
>



-- 
Harsh J

Re: Can I move block data directly?

Posted by Azuryy Yu <az...@gmail.com>.
The balancer is too slow, even after I increased the bandwidth.


On Mon, Jul 8, 2013 at 12:18 PM, kishore alajangi <al...@gmail.com> wrote:

> run start-balancer.sh
>
>
> On Mon, Jul 8, 2013 at 9:10 AM, Azuryy Yu <az...@gmail.com> wrote:
>
>> Hi all,
>>
>> Some data nodes in my cluster are unbalanced; a few have reached more
>> than 95% disk usage.
>>
>> Can I move some block data directly from one node to another?
>>
>> For example, from n1 to n2:
>>
>> 1) scp /data/xxxx/blk_*   n2:/data/subdir11/
>> 2) rm -rf /data/xxxx/blk_*
>> 3) hadoop-daemon.sh stop datanode (on n1)
>> 4) hadoop-daemon.sh start datanode (on n1)
>> 5) hadoop-daemon.sh stop datanode (on n2)
>> 6) hadoop-daemon.sh start datanode (on n2)
>>
>> Am I right? Thanks for any input.
>>
>>
>>
>
>

Re: Can I move block data directly?

Posted by kishore alajangi <al...@gmail.com>.
run start-balancer.sh
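
For reference, a minimal sketch of a tuned balancer run. It assumes a Hadoop release whose `dfsadmin` supports `-setBalancerBandwidth`; the 100 MB/s cap and the 5% threshold are illustrative values only, not recommendations from this thread, and `mb_to_bytes` and `balancer_plan` are hypothetical helper names. The script only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Convert megabytes per second to the bytes-per-second value the
# bandwidth setting expects.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print (rather than run) the two commands for a tuned balancer pass.
balancer_plan() {
    bandwidth_mb=$1   # illustrative per-datanode bandwidth cap, in MB/s
    threshold=$2      # allowed deviation from mean utilization, in percent
    echo "hadoop dfsadmin -setBalancerBandwidth $(mb_to_bytes "$bandwidth_mb")"
    echo "start-balancer.sh -threshold $threshold"
}

balancer_plan 100 5
```

A lower threshold asks the balancer to bring nodes closer to the cluster average, at the cost of a longer run.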


On Mon, Jul 8, 2013 at 9:10 AM, Azuryy Yu <az...@gmail.com> wrote:

> Hi Dear all,
>
> There are some unbalanced data nodes in my cluster, some nodes reached
> more than 95% disk usage.
>
> so Can I move some block data from one node to another node directly?
>
> such as: from n1 to n2:
>
> 1) scp /data/xxxx/blk_*   n2:/data/subdir11/
> 2) rm -rf /data/xxxx/blk_*
> 3) hadoop-daemon.sh stop datanode (on n1)
> 4) hadoop-daemon.sh start datanode (on n1)
> 5) hadoop-daemon.sh stop datanode (on n2)
> 6) hadoop-daemon.sh start datanode (on n2)
>
> Am I right? Thanks for any input.
>
>
>

Re: Can I move block data directly?

Posted by Chris Embree <ce...@gmail.com>.
I know nothing.

It seems that circumventing normal operations could be very bad.  There was
an example of something similar at Hadoop Summit.  Some very experienced
contributors decided they should edit metadata... they broke their
cluster.

Just say no!  ;)
On Jul 8, 2013 9:01 PM, "Azuryy Yu" <az...@gmail.com> wrote:

> Hi Harsh,
>
> I also agree with you that this is crude, and the balancer is the right way.
> I just want to solve the problem very quickly, and only a few nodes are
> involved.
>
>
> Thanks.
>
>
>
>
> On Tue, Jul 9, 2013 at 8:50 AM, Harsh J <ha...@cloudera.com> wrote:
>
> > Eitan,
> >
> > The block to host mapping isn't persisted in the metadata. This is
> > also the reason why the steps include a restart, which will re-trigger
> > a block report (and avoid gotchas) that will update the NN of the new
> > listing at each DN. That's what makes this method "crude" at the same
> > time - you're leveraging a behavior that's not guaranteed to be
> > unchanged in the future.
> >
> > The balancer is the right way to go about it.
> >
> > On Mon, Jul 8, 2013 at 6:53 PM, Eitan Rosenfeld <ei...@gmail.com> wrote:
> > > Hi Azuryy, I'd also like to be able to manually move blocks.
> > >
> > > One piece that is missing in your current approach is updating any
> > > block mappings that the cluster relies on.
> > > The namenode has a mapping of blocks to datanodes, and the datanode
> > > has, as the comments say, a "block -> stream of bytes" mapping.
> > >
> > > As I understand it, the namenode's mappings have to be updated to
> > > reflect the new block locations.
> > > The datanode might not need intervention, I'm not sure.
> > >
> > > Can anyone else chime in on those areas?
> > >
> > > The balancer that Allan suggested likely demonstrates all of the ins
> > > and outs needed to successfully complete a block transfer.
> > > Thus, the balancer is where I'll begin my efforts to learn how to
> > > manually move blocks.
> > >
> > > Any other pointers would be helpful.
> > >
> > > Thank you,
> > > Eitan
> > >
> > > On Mon, Jul 8, 2013 at 2:15 PM, Allan <wi...@gmail.com> wrote:
> > >> If the imbalance is across data nodes then you need to run the balancer.
> > >>
> > >> Sent from my iPad
> > >>
> > >> On Jul 8, 2013, at 1:15 AM, Azuryy Yu <az...@gmail.com> wrote:
> > >>
> > >>> Hi Dear all,
> > >>>
> > >>> There are some unbalanced data nodes in my cluster, some nodes reached
> > >>> more than 95% disk usage.
> > >>>
> > >>> so Can I move some block data from one node to another node directly?
> > >>>
> > >>> such as: from n1 to n2:
> > >>>
> > >>> 1) scp /data/xxxx/blk_*   n2:/data/subdir11/
> > >>> 2) rm -rf /data/xxxx/blk_*
> > >>> 3) hadoop-daemon.sh stop datanode (on n1)
> > >>> 4) hadoop-daemon.sh start datanode (on n1)
> > >>> 5) hadoop-daemon.sh stop datanode (on n2)
> > >>> 6) hadoop-daemon.sh start datanode (on n2)
> > >>>
> > >>> Am I right? Thanks for any inputs.
> >
> >
> >
> > --
> > Harsh J
> >
>

Re: Can I move block data directly?

Posted by Azuryy Yu <az...@gmail.com>.
Hi Harsh,

I also agree with you that this is crude, and the balancer is the right way.
I just want to solve the problem very quickly, and only a few nodes are
involved.


Thanks.




On Tue, Jul 9, 2013 at 8:50 AM, Harsh J <ha...@cloudera.com> wrote:

> Eitan,
>
> The block to host mapping isn't persisted in the metadata. This is
> also the reason why the steps include a restart, which will re-trigger
> a block report (and avoid gotchas) that will update the NN of the new
> listing at each DN. That's what makes this method "crude" at the same
> time - you're leveraging a behavior that's not guaranteed to be
> unchanged in the future.
>
> The balancer is the right way to go about it.
>
> On Mon, Jul 8, 2013 at 6:53 PM, Eitan Rosenfeld <ei...@gmail.com> wrote:
> > Hi Azuryy, I'd also like to be able to manually move blocks.
> >
> > One piece that is missing in your current approach is updating any
> > block mappings that the cluster relies on.
> > The namenode has a mapping of blocks to datanodes, and the datanode
> > has, as the comments say, a "block -> stream of bytes" mapping.
> >
> > As I understand it, the namenode's mappings have to be updated to
> > reflect the new block locations.
> > The datanode might not need intervention, I'm not sure.
> >
> > Can anyone else chime in on those areas?
> >
> > The balancer that Allan suggested likely demonstrates all of the ins
> > and outs needed to successfully complete a block transfer.
> > Thus, the balancer is where I'll begin my efforts to learn how to
> > manually move blocks.
> >
> > Any other pointers would be helpful.
> >
> > Thank you,
> > Eitan
> >
> > On Mon, Jul 8, 2013 at 2:15 PM, Allan <wi...@gmail.com> wrote:
> >> If the imbalance is across data nodes then you need to run the balancer.
> >>
> >> Sent from my iPad
> >>
> >> On Jul 8, 2013, at 1:15 AM, Azuryy Yu <az...@gmail.com> wrote:
> >>
> >>> Hi Dear all,
> >>>
> >>> There are some unbalanced data nodes in my cluster, some nodes reached
> >>> more than 95% disk usage.
> >>>
> >>> so Can I move some block data from one node to another node directly?
> >>>
> >>> such as: from n1 to n2:
> >>>
> >>> 1) scp /data/xxxx/blk_*   n2:/data/subdir11/
> >>> 2) rm -rf /data/xxxx/blk_*
> >>> 3) hadoop-daemon.sh stop datanode (on n1)
> >>> 4) hadoop-daemon.sh start datanode (on n1)
> >>> 5) hadoop-daemon.sh stop datanode (on n2)
> >>> 6) hadoop-daemon.sh start datanode (on n2)
> >>>
> >>> Am I right? Thanks for any inputs.
>
>
>
> --
> Harsh J
>

Re: Can I move block data directly?

Posted by Harsh J <ha...@cloudera.com>.
Eitan,

The block-to-host mapping isn't persisted in the metadata. This is
also the reason why the steps include a restart, which will re-trigger
a block report (and avoid gotchas) that will update the NN with the new
listing at each DN. That's what makes this method "crude" at the same
time - you're leveraging a behavior that's not guaranteed to be
unchanged in the future.

The balancer is the right way to go about it.

On Mon, Jul 8, 2013 at 6:53 PM, Eitan Rosenfeld <ei...@gmail.com> wrote:
> Hi Azuryy, I'd also like to be able to manually move blocks.
>
> One piece that is missing in your current approach is updating any
> block mappings that the cluster relies on.
> The namenode has a mapping of blocks to datanodes, and the datanode
> has, as the comments say, a "block -> stream of bytes" mapping.
>
> As I understand it, the namenode's mappings have to be updated to
> reflect the new block locations.
> The datanode might not need intervention, I'm not sure.
>
> Can anyone else chime in on those areas?
>
> The balancer that Allan suggested likely demonstrates all of the ins
> and outs needed to successfully complete a block transfer.
> Thus, the balancer is where I'll begin my efforts to learn how to
> manually move blocks.
>
> Any other pointers would be helpful.
>
> Thank you,
> Eitan
>
> On Mon, Jul 8, 2013 at 2:15 PM, Allan <wi...@gmail.com> wrote:
>> If the imbalance is across data nodes then you need to run the balancer.
>>
>> Sent from my iPad
>>
>> On Jul 8, 2013, at 1:15 AM, Azuryy Yu <az...@gmail.com> wrote:
>>
>>> Hi Dear all,
>>>
>>> There are some unbalanced data nodes in my cluster, some nodes reached more
>>> than 95% disk usage.
>>>
>>> so Can I move some block data from one node to another node directly?
>>>
>>> such as: from n1 to n2:
>>>
>>> 1) scp /data/xxxx/blk_*   n2:/data/subdir11/
>>> 2) rm -rf /data/xxxx/blk_*
>>> 3) hadoop-daemon.sh stop datanode (on n1)
>>> 4) hadoop-daemon.sh start datanode (on n1)
>>> 5) hadoop-daemon.sh stop datanode (on n2)
>>> 6) hadoop-daemon.sh start datanode (on n2)
>>>
>>> Am I right? Thanks for any inputs.



-- 
Harsh J

Re: Can I move block data directly?

Posted by Azuryy Yu <az...@gmail.com>.
Hi Eitan,

Yup, the namenode has a mapping of blocks to datanodes, which it keeps in
memory. And you are also right that the DN keeps its block structure in
memory.

But as you've noticed, I restart the DN after moving the block data
manually. During the datanode's startup, the block scanner scans the local
block files and rebuilds that structure in memory, and the DN then sends a
block report to the NN. When the NN receives the block report from this
datanode, it updates its <blockID, datanode storageID> mapping.

So I was right, and I have tested it. But as Harsh mentioned, we should stop
both the source and target datanodes first, then move the block data
manually, then start the two datanodes again.
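
Once both datanodes are back and have sent their block reports, one way to confirm that the namenode sees a consistent picture is to look at the `hadoop fsck /` summary. A minimal sketch follows; `fsck_ok` is a hypothetical helper, and it assumes the summary contains the phrase "is HEALTHY" and a line with "Under-replicated blocks:", so the patterns may need adjusting for your version.

```shell
#!/bin/sh
# Read `hadoop fsck /` output on stdin. Prints the under-replicated block
# count found in the summary and succeeds only if the summary says HEALTHY.
# The matched phrases are assumptions about the fsck summary format.
fsck_ok() {
    awk '
        /Under-replicated blocks:/ { under = $3 }
        /is HEALTHY/               { healthy = 1 }
        END {
            print "under-replicated: " (under == "" ? "unknown" : under)
            exit (healthy ? 0 : 1)
        }
    '
}

# Usage (not run here):
#   hadoop fsck / | fsck_ok || echo "check the cluster before continuing"
```

A non-zero under-replicated count shortly after the restart may simply mean the namenode is still re-replicating, so it is worth re-checking after a few minutes.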



On Mon, Jul 8, 2013 at 9:23 PM, Eitan Rosenfeld <ei...@gmail.com> wrote:

> Hi Azuryy, I'd also like to be able to manually move blocks.
>
> One piece that is missing in your current approach is updating any
> block mappings that the cluster relies on.
> The namenode has a mapping of blocks to datanodes, and the datanode
> has, as the comments say, a "block -> stream of bytes" mapping.
>
> As I understand it, the namenode's mappings have to be updated to
> reflect the new block locations.
> The datanode might not need intervention, I'm not sure.
>
> Can anyone else chime in on those areas?
>
> The balancer that Allan suggested likely demonstrates all of the ins
> and outs needed to successfully complete a block transfer.
> Thus, the balancer is where I'll begin my efforts to learn how to
> manually move blocks.
>
> Any other pointers would be helpful.
>
> Thank you,
> Eitan
>
> On Mon, Jul 8, 2013 at 2:15 PM, Allan <wi...@gmail.com> wrote:
> > If the imbalance is across data nodes then you need to run the balancer.
> >
> > Sent from my iPad
> >
> > On Jul 8, 2013, at 1:15 AM, Azuryy Yu <az...@gmail.com> wrote:
> >
> >> Hi Dear all,
> >>
> >> There are some unbalanced data nodes in my cluster, some nodes reached
> >> more than 95% disk usage.
> >>
> >> so Can I move some block data from one node to another node directly?
> >>
> >> such as: from n1 to n2:
> >>
> >> 1) scp /data/xxxx/blk_*   n2:/data/subdir11/
> >> 2) rm -rf /data/xxxx/blk_*
> >> 3) hadoop-daemon.sh stop datanode (on n1)
> >> 4) hadoop-daemon.sh start datanode (on n1)
> >> 5) hadoop-daemon.sh stop datanode (on n2)
> >> 6) hadoop-daemon.sh start datanode (on n2)
> >>
> >> Am I right? Thanks for any inputs.
>

Re: Can I move block data directly?

Posted by Eitan Rosenfeld <ei...@gmail.com>.
Hi Azuryy, I'd also like to be able to manually move blocks.

One piece that is missing in your current approach is updating any
block mappings that the cluster relies on.
The namenode has a mapping of blocks to datanodes, and the datanode
has, as the comments say, a "block -> stream of bytes" mapping.

As I understand it, the namenode's mappings have to be updated to
reflect the new block locations.
The datanode might not need intervention, I'm not sure.

Can anyone else chime in on those areas?

The balancer that Allan suggested likely demonstrates all of the ins
and outs needed to successfully complete a block transfer.
Thus, the balancer is where I'll begin my efforts to learn how to
manually move blocks.

Any other pointers would be helpful.

Thank you,
Eitan

On Mon, Jul 8, 2013 at 2:15 PM, Allan <wi...@gmail.com> wrote:
> If the imbalance is across data nodes then you need to run the balancer.
>
> Sent from my iPad
>
> On Jul 8, 2013, at 1:15 AM, Azuryy Yu <az...@gmail.com> wrote:
>
>> Hi Dear all,
>>
>> There are some unbalanced data nodes in my cluster, some nodes reached more
>> than 95% disk usage.
>>
>> so Can I move some block data from one node to another node directly?
>>
>> such as: from n1 to n2:
>>
>> 1) scp /data/xxxx/blk_*   n2:/data/subdir11/
>> 2) rm -rf /data/xxxx/blk_*
>> 3) hadoop-daemon.sh stop datanode (on n1)
>> 4) hadoop-daemon.sh start datanode (on n1)
>> 5) hadoop-daemon.sh stop datanode (on n2)
>> 6) hadoop-daemon.sh start datanode (on n2)
>>
>> Am I right? Thanks for any inputs.

Re: Can I move block data directly?

Posted by Allan <wi...@gmail.com>.
If the imbalance is across data nodes then you need to run the balancer.
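
To see whether the imbalance really is across datanodes, the per-node usage in `hadoop dfsadmin -report` can be filtered. The sketch below is hedged: `nodes_over` is a hypothetical helper, and it assumes each datanode section of the report has a "Name: host:port" line followed later by a "DFS Used%: NN.NN%" line.

```shell
#!/bin/sh
# Read `hadoop dfsadmin -report` output on stdin and print the datanodes
# whose "DFS Used%" is above the given limit (in percent).
# The cluster-wide "DFS Used%" line that precedes the first "Name:" line
# is skipped because no node name has been seen yet.
nodes_over() {
    awk -v limit="$1" '
        /^Name:/      { node = $2 }
        /^DFS Used%:/ { pct = $3; sub(/%/, "", pct)
                        if (node != "" && pct + 0 > limit) print node, pct "%" }
    '
}

# Usage (not run here):
#   hadoop dfsadmin -report | nodes_over 95
```

If only a handful of nodes show up, that is the set the balancer needs to drain.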

Sent from my iPad

On Jul 8, 2013, at 1:15 AM, Azuryy Yu <az...@gmail.com> wrote:

> Hi Dear all,
> 
> There are some unbalanced data nodes in my cluster, some nodes reached more
> than 95% disk usage.
> 
> so Can I move some block data from one node to another node directly?
> 
> such as: from n1 to n2:
> 
> 1) scp /data/xxxx/blk_*   n2:/data/subdir11/
> 2) rm -rf /data/xxxx/blk_*
> 3) hadoop-daemon.sh stop datanode (on n1)
> 4) hadoop-daemon.sh start datanode (on n1)
> 5) hadoop-daemon.sh stop datanode (on n2)
> 6) hadoop-daemon.sh start datanode (on n2)
> 
> Am I right? Thanks for any inputs.

Fwd: Can I move block data directly?

Posted by Azuryy Yu <az...@gmail.com>.
Hi Dear all,

There are some unbalanced data nodes in my cluster; some nodes have reached
more than 95% disk usage.

So can I move some block data from one node to another node directly?

For example, from n1 to n2:

1) scp /data/xxxx/blk_*   n2:/data/subdir11/
2) rm -rf /data/xxxx/blk_*
3) hadoop-daemon.sh stop datanode (on n1)
4) hadoop-daemon.sh start datanode (on n1)
5) hadoop-daemon.sh stop datanode (on n2)
6) hadoop-daemon.sh start datanode (on n2)

Am I right? Thanks for any input.

Re: Can I move block data directly?

Posted by Harsh J <ha...@cloudera.com>.
Yes, you could do this, but I'd place steps 3 and 5 as steps 1 and 2.
I'd also ensure the ownership of the block files is intact.

If the balancer isn't cutting it for you with stock defaults, you
should consider tuning that rather than resorting to these unsupported
scenarios.
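
A minimal sketch of that reordering, written as a dry run that only prints the steps in sequence (both datanodes stopped before any file is touched, ownership fixed on the target before restart). `move_plan` and `DN_USER` are hypothetical names, the `hdfs` account is an assumption about which user runs the datanode, and the paths are the placeholders from the original question.

```shell
#!/bin/sh
# Dry-run sketch: prints the manual move in the reordered sequence.
# Nothing is executed; review the output and run each step by hand.
# DN_USER is an assumed account name for the datanode daemon.
DN_USER=${DN_USER:-hdfs}

move_plan() {
    src=$1; dst=$2; src_dir=$3; dst_dir=$4
    echo "[$src] hadoop-daemon.sh stop datanode"
    echo "[$dst] hadoop-daemon.sh stop datanode"
    # blk_* covers the block files and their blk_*.meta checksum files.
    echo "[$src] scp -p $src_dir/blk_* $dst:$dst_dir/"
    echo "[$dst] chown $DN_USER $dst_dir/blk_*"
    echo "[$src] rm -f $src_dir/blk_*"
    echo "[$src] hadoop-daemon.sh start datanode"
    echo "[$dst] hadoop-daemon.sh start datanode"
}

move_plan n1 n2 /data/xxxx /data/subdir11
```

Deleting the source copies only after the copy has been checked on the target keeps a replica available if the transfer fails partway.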

On Mon, Jul 8, 2013 at 9:10 AM, Azuryy Yu <az...@gmail.com> wrote:
> Hi Dear all,
>
> There are some unbalanced data nodes in my cluster, some nodes reached more
> than 95% disk usage.
>
> so Can I move some block data from one node to another node directly?
>
> such as: from n1 to n2:
>
> 1) scp /data/xxxx/blk_*   n2:/data/subdir11/
> 2) rm -rf /data/xxxx/blk_*
> 3) hadoop-daemon.sh stop datanode (on n1)
> 4) hadoop-daemon.sh start datanode (on n1)
> 5) hadoop-daemon.sh stop datanode (on n2)
> 6) hadoop-daemon.sh start datanode (on n2)
>
> Am I right? Thanks for any input.
>
>



-- 
Harsh J
