Posted to dev@ambari.apache.org by Alma Bob <al...@gmail.com> on 2014/11/26 20:11:58 UTC

Decommission does not move data

Hi,

I've been trying to remove nodes from the cluster, and it seems to me that DataNode decommission does not move any data off the nodes. When I check with fsck, it reports missing blocks. I'm using Ambari 1.6.0. Is decommission supposed to move the data at all, or should I take care of it myself?

Best Regards,
Bob



Re: Decommission does not move data

Posted by Alma Bob <al...@gmail.com>.
That's really helpful. Thank you both.

On November 26, 2014 at 10:17:01 PM, Jaimin Jetly (jaimin@hortonworks.com) wrote:


After triggering a decommission request, the Ambari client checks the NameNode JMX metrics, channeled through the ambari-server API, to determine whether the DataNode has been successfully decommissioned.

The API that ambari-web client checks:
http://localhost:8080/api/v1/clusters/${clusterName}/services/HDFS/components/NAMENODE?fields=metrics/dfs/namenode/

Also, I found an HDFS bug that was fixed only recently; the fix would not have been part of the HDP stack in the Ambari 1.6.0 release:
https://issues.apache.org/jira/browse/HDFS-3087



Thanks

Jaimin Jetly






On Wed, Nov 26, 2014 at 1:02 PM, Alma Bob <al...@gmail.com> wrote:
Hi,

Probably that's the case, yes. Is it possible to check the state of the decommission via Ambari or in some other way?

Thanks for the help anyway.
On November 26, 2014 at 9:45:25 PM, Yusaku Sako (yusaku@hortonworks.com) wrote:

Could this just be confusion resulting from how Ambari responds to
DataNode decommission requests?
When Ambari receives a decommission request, it essentially updates the
"exclude" file to mark the host(s) for decommissioning and invokes the
NameNode "refreshNodes" command.
This happens quickly, in seconds, and Ambari then responds to the API
client saying the request was performed successfully.
At this point, however, decommission is not done; it has only just begun.

Yusaku

On Wed, Nov 26, 2014 at 12:36 PM, Alma Bob <al...@gmail.com> wrote:
> Hi,
>
> I'll start a new cluster to check it again and I'll report it back here.
> Maybe I can give access as well.
>
> @Yusaku, the decommission looks the same from both triggering from the UI
> and through the REST API as expected.
>
> Generally the decommission should take some time for obvious reasons, but it
> finished in a few seconds which made me wonder.
>
> On November 26, 2014 at 9:26:58 PM, Jaimin Jetly (jaimin@hortonworks.com)
> wrote:
>
> Hi Alma,
>
> Decommission in ideal scenario is expected to move data from the
> decommissioned node.
>
> Can you please provide information on what was the datanode status as per
> NameNode JMX metrics when you noticed this behavior? This information will
> help in debugging the issue further.
>
> For checking the decommission status of a node via NameNode jmx metrics,
> look for LiveNodes key at following url:
> http://c6401.ambari.apache.org:50070/jmx
>
> LiveNodes key keeps the status of each node under "adminState" attribute (In
> Service | Decommission In Progress | Decommissioned)
>
> Note:
> Decommission can take a while to finish for a moderate amount of data and so
> it is expected to take more time for 3TB data.
>
> If the datanode status was in "Decommission in Progress" at that time then
> this behavior is expected from HDFS as not all the block copies are moved to
> other nodes at that time.
>
>
> Thanks
>
> Jaimin Jetly
>
>
>
>
> On Wed, Nov 26, 2014 at 12:15 PM, Yusaku Sako <yu...@hortonworks.com>
> wrote:
>>
>> I see. Decommission of DataNodes via Ambari should automatically
>> start the process of moving off blocks to other remaining DataNodes to
>> ensure that the replication factor of 3 is reached.
>> How did you trigger decommission on the 5 DataNodes?
>> Once you trigger decommission, Ambari should show the DataNodes as
>> "Decommissioning" if you drill down to the Host Detail page of the
>> said hosts. Decommission process can take a long time, depending on
>> the number of blocks involved. You can also check the NameNode Web UI
>> (available from QuickLinks) to verify that the DataNodes are indeed
>> decommissioning.
>>
>> Yusaku
>>
>> On Wed, Nov 26, 2014 at 11:41 AM, Alma Bob <al...@gmail.com> wrote:
>> > In this test I tried with 20 nodes with replication 3. I generated 3TB
>> > data
>> > and started to decommission 5 nodes and the fsck reported as replication
>> > is
>> > 3 but found block 2 in many cases.
>> >
>> > On November 26, 2014 at 8:34:20 PM, Yusaku Sako (yusaku@hortonworks.com)
>> > wrote:
>> >
>> > How many DataNodes do you have in your cluster, and what is your
>> > replication factor (dfs.replication in hdfs-site.xml)?
>> >
>> > Yusaku
>> >
>> > On Wed, Nov 26, 2014 at 11:11 AM, Alma Bob <al...@gmail.com> wrote:
>> >> Hi,
>> >>
>> >> I've been trying to remove nodes from the cluster and as it seems to me
>> >> the Datanode decommission does not move any data from the nodes. If I
>> >> check
>> >> with fsck it reports missing blocks. I'm using Ambari 1.6.0. Is it
>> >> supposed
>> >> to move data at all or I should take care of it?
>> >>
>> >> Best Regards,
>> >> Bob
>> >>
>> >>
>> >
>> > --
>> > CONFIDENTIALITY NOTICE
>> > NOTICE: This message is intended for the use of the individual or entity
>> > to
>> > which it is addressed and may contain information that is confidential,
>> > privileged and exempt from disclosure under applicable law. If the
>> > reader
>> > of this message is not the intended recipient, you are hereby notified
>> > that
>> > any printing, copying, dissemination, distribution, disclosure or
>> > forwarding of this communication is strictly prohibited. If you have
>> > received this communication in error, please contact the sender
>> > immediately
>> > and delete it from your system. Thank You.
>>
>
>
>




Re: Decommission does not move data

Posted by Jaimin Jetly <ja...@hortonworks.com>.
After triggering a decommission request, the Ambari client checks the
NameNode JMX metrics, channeled through the ambari-server API, to determine
whether the DataNode has been successfully decommissioned.

The API that ambari-web client checks:
http://localhost:8080/api/v1/clusters/${clusterName}/services/HDFS/components/NAMENODE?fields=metrics/dfs/namenode/
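
For anyone who wants to query that endpoint outside the web client, here is a minimal Python sketch. It assumes HTTP basic authentication against ambari-server and that the response nests the requested fields as metrics -> dfs -> namenode, mirroring the "fields" path; the function names are hypothetical, and the exact keys returned under that subtree should be checked on your own cluster.

```python
import base64
import json
from urllib.request import Request, urlopen


def namenode_metrics_url(cluster_name, ambari="http://localhost:8080"):
    """Build the NAMENODE metrics URL quoted above for a given cluster."""
    return (f"{ambari}/api/v1/clusters/{cluster_name}"
            "/services/HDFS/components/NAMENODE?fields=metrics/dfs/namenode/")


def namenode_metrics(payload):
    """Return the metrics/dfs/namenode subtree of a parsed response, or {}."""
    node = payload
    for key in ("metrics", "dfs", "namenode"):
        # Assumption: the JSON nesting mirrors the requested "fields" path.
        node = node.get(key, {}) if isinstance(node, dict) else {}
    return node


def fetch_namenode_metrics(cluster_name, user, password,
                           ambari="http://localhost:8080"):
    """GET the endpoint with basic auth and return the namenode metrics."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = Request(namenode_metrics_url(cluster_name, ambari),
                      headers={"Authorization": f"Basic {token}"})
    with urlopen(request, timeout=10) as response:
        return namenode_metrics(json.load(response))
```

Calling fetch_namenode_metrics with your cluster name and Ambari credentials returns whatever ambari-server reports under that path; nothing here is specific to decommissioning beyond the URL itself.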

Also, I found an HDFS bug that was fixed only recently; the fix would not
have been part of the HDP stack in the Ambari 1.6.0 release:
https://issues.apache.org/jira/browse/HDFS-3087



Thanks

Jaimin Jetly




On Wed, Nov 26, 2014 at 1:02 PM, Alma Bob <al...@gmail.com> wrote:

> Hi,
>
> Probably that's the case yes. Is it possible to check the state of the
> decommission via Ambari or somehow?
>
> Thank for the help anyway.
>
> On November 26, 2014 at 9:45:25 PM, Yusaku Sako (yusaku@hortonworks.com)
> wrote:
>
> Could this be just confusion resulting from how Ambari responds to
> DataNode decommission requests?
> Ambari receives a decommission request, it essentially updates the
> "exclude" file to mark the host(s) for decommissioning and invokes
> NameNode "refreshNodes" command.
> This happens quickly, in seconds, and responds to the API client
> saying it was performed successfully.
> At this point, however, decommission is not done but it has just begun.
>
> Yusaku
>
> On Wed, Nov 26, 2014 at 12:36 PM, Alma Bob <al...@gmail.com> wrote:
> > Hi,
> >
> > I'll start a new cluster to check it again and I'll report it back here.
> > Maybe I can give access as well.
> >
> > @Yusaku, the decommission looks the same from both triggering from the
> UI
> > and through the REST API as expected.
> >
> > Generally the decommission should take some time for obvious reasons,
> but it
> > finished in a few seconds which made me wonder.
> >
> > On November 26, 2014 at 9:26:58 PM, Jaimin Jetly (jaimin@hortonworks.com)
>
> > wrote:
> >
> > Hi Alma,
> >
> > Decommission in ideal scenario is expected to move data from the
> > decommissioned node.
> >
> > Can you please provide information on what was the datanode status as
> per
> > NameNode JMX metrics when you noticed this behavior? This information
> will
> > help in debugging the issue further.
> >
> > For checking the decommission status of a node via NameNode jmx metrics,
> > look for LiveNodes key at following url:
> > http://c6401.ambari.apache.org:50070/jmx
> >
> > LiveNodes key keeps the status of each node under "adminState" attribute
> (In
> > Service | Decommission In Progress | Decommissioned)
> >
> > Note:
> > Decommission can take a while to finish for a moderate amount of data
> and so
> > it is expected to take more time for 3TB data.
> >
> > If the datanode status was in "Decommission in Progress" at that time
> then
> > this behavior is expected from HDFS as not all the block copies are
> moved to
> > other nodes at that time.
> >
> >
> > Thanks
> >
> > Jaimin Jetly
> >
> >
> >
> >
> > On Wed, Nov 26, 2014 at 12:15 PM, Yusaku Sako <yu...@hortonworks.com>
> > wrote:
> >>
> >> I see. Decommission of DataNodes via Ambari should automatically
> >> start the process of moving off blocks to other remaining DataNodes to
> >> ensure that the replication factor of 3 is reached.
> >> How did you trigger decommission on the 5 DataNodes?
> >> Once you trigger decommission, Ambari should show the DataNodes as
> >> "Decommissioning" if you drill down to the Host Detail page of the
> >> said hosts. Decommission process can take a long time, depending on
> >> the number of blocks involved. You can also check the NameNode Web UI
> >> (available from QuickLinks) to verify that the DataNodes are indeed
> >> decommissioning.
> >>
> >> Yusaku
> >>
> >> On Wed, Nov 26, 2014 at 11:41 AM, Alma Bob <al...@gmail.com> wrote:
> >> > In this test I tried with 20 nodes with replication 3. I generated
> 3TB
> >> > data
> >> > and started to decommission 5 nodes and the fsck reported as
> replication
> >> > is
> >> > 3 but found block 2 in many cases.
> >> >
> >> > On November 26, 2014 at 8:34:20 PM, Yusaku Sako (
> yusaku@hortonworks.com)
> >> > wrote:
> >> >
> >> > How many DataNodes do you have in your cluster, and what is your
> >> > replication factor (dfs.replication in hdfs-site.xml)?
> >> >
> >> > Yusaku
> >> >
> >> > On Wed, Nov 26, 2014 at 11:11 AM, Alma Bob <al...@gmail.com>
> wrote:
> >> >> Hi,
> >> >>
> >> >> I've been trying to remove nodes from the cluster and as it seems to
> me
> >> >> the Datanode decommission does not move any data from the nodes. If
> I
> >> >> check
> >> >> with fsck it reports missing blocks. I'm using Ambari 1.6.0. Is it
> >> >> supposed
> >> >> to move data at all or I should take care of it?
> >> >>
> >> >> Best Regards,
> >> >> Bob
> >> >>
> >> >>
> >> >
> >>
> >
> >
> >
>
>
>


Re: Decommission does not move data

Posted by Alma Bob <al...@gmail.com>.
Hi,

Probably that's the case, yes. Is it possible to check the state of the decommission via Ambari or in some other way?

Thanks for the help anyway.
On November 26, 2014 at 9:45:25 PM, Yusaku Sako (yusaku@hortonworks.com) wrote:

Could this just be confusion resulting from how Ambari responds to
DataNode decommission requests?
When Ambari receives a decommission request, it essentially updates the
"exclude" file to mark the host(s) for decommissioning and invokes the
NameNode "refreshNodes" command.
This happens quickly, in seconds, and Ambari then responds to the API
client saying the request was performed successfully.
At this point, however, decommission is not done; it has only just begun.

Yusaku 

On Wed, Nov 26, 2014 at 12:36 PM, Alma Bob <al...@gmail.com> wrote: 
> Hi, 
> 
> I'll start a new cluster to check it again and I'll report it back here. 
> Maybe I can give access as well. 
> 
> @Yusaku, the decommission looks the same from both triggering from the UI 
> and through the REST API as expected. 
> 
> Generally the decommission should take some time for obvious reasons, but it 
> finished in a few seconds which made me wonder. 
> 
> On November 26, 2014 at 9:26:58 PM, Jaimin Jetly (jaimin@hortonworks.com) 
> wrote: 
> 
> Hi Alma, 
> 
> Decommission in ideal scenario is expected to move data from the 
> decommissioned node. 
> 
> Can you please provide information on what was the datanode status as per 
> NameNode JMX metrics when you noticed this behavior? This information will 
> help in debugging the issue further. 
> 
> For checking the decommission status of a node via NameNode jmx metrics, 
> look for LiveNodes key at following url: 
> http://c6401.ambari.apache.org:50070/jmx 
> 
> LiveNodes key keeps the status of each node under "adminState" attribute (In 
> Service | Decommission In Progress | Decommissioned) 
> 
> Note: 
> Decommission can take a while to finish for a moderate amount of data and so 
> it is expected to take more time for 3TB data. 
> 
> If the datanode status was in "Decommission in Progress" at that time then 
> this behavior is expected from HDFS as not all the block copies are moved to 
> other nodes at that time. 
> 
> 
> Thanks 
> 
> Jaimin Jetly 
> 
> 
> 
> 
> On Wed, Nov 26, 2014 at 12:15 PM, Yusaku Sako <yu...@hortonworks.com> 
> wrote: 
>> 
>> I see. Decommission of DataNodes via Ambari should automatically 
>> start the process of moving off blocks to other remaining DataNodes to 
>> ensure that the replication factor of 3 is reached. 
>> How did you trigger decommission on the 5 DataNodes? 
>> Once you trigger decommission, Ambari should show the DataNodes as 
>> "Decommissioning" if you drill down to the Host Detail page of the 
>> said hosts. Decommission process can take a long time, depending on 
>> the number of blocks involved. You can also check the NameNode Web UI 
>> (available from QuickLinks) to verify that the DataNodes are indeed 
>> decommissioning. 
>> 
>> Yusaku 
>> 
>> On Wed, Nov 26, 2014 at 11:41 AM, Alma Bob <al...@gmail.com> wrote: 
>> > In this test I tried with 20 nodes with replication 3. I generated 3TB 
>> > data 
>> > and started to decommission 5 nodes and the fsck reported as replication 
>> > is 
>> > 3 but found block 2 in many cases. 
>> > 
>> > On November 26, 2014 at 8:34:20 PM, Yusaku Sako (yusaku@hortonworks.com) 
>> > wrote: 
>> > 
>> > How many DataNodes do you have in your cluster, and what is your 
>> > replication factor (dfs.replication in hdfs-site.xml)? 
>> > 
>> > Yusaku 
>> > 
>> > On Wed, Nov 26, 2014 at 11:11 AM, Alma Bob <al...@gmail.com> wrote: 
>> >> Hi, 
>> >> 
>> >> I've been trying to remove nodes from the cluster and as it seems to me 
>> >> the Datanode decommission does not move any data from the nodes. If I 
>> >> check 
>> >> with fsck it reports missing blocks. I'm using Ambari 1.6.0. Is it 
>> >> supposed 
>> >> to move data at all or I should take care of it? 
>> >> 
>> >> Best Regards, 
>> >> Bob 
>> >> 
>> >> 
>> > 
>> 
> 
> 
> 


Re: Decommission does not move data

Posted by Yusaku Sako <yu...@hortonworks.com>.
Could this just be confusion resulting from how Ambari responds to
DataNode decommission requests?
When Ambari receives a decommission request, it essentially updates the
"exclude" file to mark the host(s) for decommissioning and invokes the
NameNode "refreshNodes" command.
This happens quickly, in seconds, and Ambari then responds to the API
client saying the request was performed successfully.
At this point, however, decommission is not done; it has only just begun.
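
To make that two-step sequence concrete, here is a rough Python sketch of the manual equivalent. It is not Ambari's actual implementation: the function names are hypothetical, the exclude file path is whatever dfs.hosts.exclude points to on your NameNode, and it assumes the "hdfs" CLI is on the PATH and is run as a user allowed to issue dfsadmin commands.

```python
import subprocess


def with_excluded_hosts(exclude_text, hosts):
    """Return exclude-file content with each host listed exactly once."""
    existing = [line.strip() for line in exclude_text.splitlines() if line.strip()]
    for host in hosts:
        if host not in existing:
            existing.append(host)
    return "\n".join(existing) + "\n" if existing else ""


def start_decommission(exclude_path, hosts):
    """Mark hosts in the exclude file and ask the NameNode to re-read it."""
    with open(exclude_path) as handle:
        current = handle.read()
    with open(exclude_path, "w") as handle:
        handle.write(with_excluded_hosts(current, hosts))
    # This command returns once the NameNode has re-read the file.
    # Re-replication of blocks continues in the background afterwards,
    # so a quick, successful return does not mean decommission is finished.
    subprocess.run(["hdfs", "dfsadmin", "-refreshNodes"], check=True)
```

The point of the sketch is the timing: both steps are cheap, which is why the request reports success within seconds while the block copies are still pending.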

Yusaku

On Wed, Nov 26, 2014 at 12:36 PM, Alma Bob <al...@gmail.com> wrote:
> Hi,
>
> I'll start a new cluster to check it again and I'll report it back here.
> Maybe I can give access as well.
>
> @Yusaku, the decommission looks the same from both triggering from the UI
> and through the REST API as expected.
>
> Generally the decommission should take some time for obvious reasons, but it
> finished in a few seconds which made me wonder.
>
> On November 26, 2014 at 9:26:58 PM, Jaimin Jetly (jaimin@hortonworks.com)
> wrote:
>
> Hi Alma,
>
> Decommission in ideal scenario is expected to move data from the
> decommissioned node.
>
> Can you please provide information on what was the datanode status as per
> NameNode JMX metrics when you noticed this behavior? This information will
> help in debugging the issue further.
>
> For checking the decommission status of a node via NameNode jmx metrics,
> look for LiveNodes key at following url:
> http://c6401.ambari.apache.org:50070/jmx
>
> LiveNodes key keeps the status of each node under "adminState" attribute (In
> Service | Decommission In Progress | Decommissioned)
>
> Note:
> Decommission can take a while to finish for a moderate amount of data and so
> it is expected to take more time for 3TB data.
>
> If the datanode status was in "Decommission in Progress" at that time then
> this behavior is expected from HDFS as not all the block copies are moved to
> other nodes at that time.
>
>
> Thanks
>
> Jaimin Jetly
>
>
>
>
> On Wed, Nov 26, 2014 at 12:15 PM, Yusaku Sako <yu...@hortonworks.com>
> wrote:
>>
>> I see.  Decommission of DataNodes via Ambari should automatically
>> start the process of moving off blocks to other remaining DataNodes to
>> ensure that the replication factor of 3 is reached.
>> How did you trigger decommission on the 5 DataNodes?
>> Once you trigger decommission, Ambari should show the DataNodes as
>> "Decommissioning" if you drill down to the Host Detail page of the
>> said hosts.  Decommission process can take a long time, depending on
>> the number of blocks involved.  You can also check the NameNode Web UI
>> (available from QuickLinks) to verify that the DataNodes are indeed
>> decommissioning.
>>
>> Yusaku
>>
>> On Wed, Nov 26, 2014 at 11:41 AM, Alma Bob <al...@gmail.com> wrote:
>> > In this test I tried with 20 nodes with replication 3. I generated 3TB
>> > data
>> > and started to decommission 5 nodes and the fsck reported as replication
>> > is
>> > 3 but found block 2 in many cases.
>> >
>> > On November 26, 2014 at 8:34:20 PM, Yusaku Sako (yusaku@hortonworks.com)
>> > wrote:
>> >
>> > How many DataNodes do you have in your cluster, and what is your
>> > replication factor (dfs.replication in hdfs-site.xml)?
>> >
>> > Yusaku
>> >
>> > On Wed, Nov 26, 2014 at 11:11 AM, Alma Bob <al...@gmail.com> wrote:
>> >> Hi,
>> >>
>> >> I've been trying to remove nodes from the cluster and as it seems to me
>> >> the Datanode decommission does not move any data from the nodes. If I
>> >> check
>> >> with fsck it reports missing blocks. I'm using Ambari 1.6.0. Is it
>> >> supposed
>> >> to move data at all or I should take care of it?
>> >>
>> >> Best Regards,
>> >> Bob
>> >>
>> >>
>> >
>>
>
>
>


Re: Decommission does not move data

Posted by Alma Bob <al...@gmail.com>.
Hi,

I'll start a new cluster to check it again and report back here. Maybe I can give you access as well.

@Yusaku, the decommission looks the same whether it is triggered from the UI or through the REST API, as expected.

Generally the decommission should take some time for obvious reasons, but it finished in a few seconds, which made me wonder.

On November 26, 2014 at 9:26:58 PM, Jaimin Jetly (jaimin@hortonworks.com) wrote:
Hi Alma,

In the ideal scenario, decommission is expected to move data off the decommissioned node.

Can you please tell us what the DataNode status was according to the NameNode JMX metrics when you noticed this behavior? This information will help in debugging the issue further.

To check the decommission status of a node via the NameNode JMX metrics, look for the LiveNodes key at the following URL:
http://c6401.ambari.apache.org:50070/jmx

The LiveNodes key keeps the status of each node under the "adminState" attribute (In Service | Decommission In Progress | Decommissioned).

Note:
Decommission can take a while to finish even for a moderate amount of data, so it is expected to take longer for 3TB of data.

If the DataNode status was "Decommission In Progress" at that time, then this behavior is expected from HDFS, because not all block copies have been moved to other nodes at that point.
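
The JMX check described above can be scripted. The following is a minimal Python sketch, assuming the /jmx response has the usual {"beans": [...]} shape and that LiveNodes is a JSON document embedded as a string that maps each host to an object containing "adminState". The bean name used in the query and the function names are assumptions to verify against your own NameNode.

```python
import json
from urllib.request import urlopen

# Assumed bean name for the NameNode info bean; confirm it in your /jmx output.
NAMENODE_INFO_QUERY = "Hadoop:service=NameNode,name=NameNodeInfo"


def admin_states(jmx_payload):
    """Map each live DataNode to its adminState from a parsed /jmx response."""
    states = {}
    for bean in jmx_payload.get("beans", []):
        live_nodes = bean.get("LiveNodes")
        if live_nodes is None:
            continue
        # LiveNodes is typically a JSON string nested inside the JSON response.
        if isinstance(live_nodes, str):
            live_nodes = json.loads(live_nodes)
        for host, info in live_nodes.items():
            states[host] = info.get("adminState", "unknown")
    return states


def fetch_admin_states(namenode_http="http://c6401.ambari.apache.org:50070"):
    """Fetch /jmx from the NameNode web port and return {host: adminState}."""
    url = f"{namenode_http}/jmx?qry={NAMENODE_INFO_QUERY}"
    with urlopen(url, timeout=10) as response:
        return admin_states(json.load(response))
```

Running fetch_admin_states repeatedly while a decommission is under way should show the affected hosts move from "Decommission In Progress" to "Decommissioned"; fsck results taken before that transition are expected to show fewer replicas than the target.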


Thanks

Jaimin Jetly






On Wed, Nov 26, 2014 at 12:15 PM, Yusaku Sako <yu...@hortonworks.com> wrote:
I see.  Decommission of DataNodes via Ambari should automatically
start the process of moving blocks off to the remaining DataNodes to
ensure that the replication factor of 3 is reached.
How did you trigger decommission on the 5 DataNodes?
Once you trigger decommission, Ambari should show the DataNodes as
"Decommissioning" if you drill down to the Host Detail page of the
said hosts.  The decommission process can take a long time, depending on
the number of blocks involved.  You can also check the NameNode Web UI
(available from QuickLinks) to verify that the DataNodes are indeed
decommissioning.

Yusaku

On Wed, Nov 26, 2014 at 11:41 AM, Alma Bob <al...@gmail.com> wrote:
> In this test I tried with 20 nodes with replication 3. I generated 3TB data
> and started to decommission 5 nodes and the fsck reported as replication is
> 3 but found block 2 in many cases.
>
> On November 26, 2014 at 8:34:20 PM, Yusaku Sako (yusaku@hortonworks.com)
> wrote:
>
> How many DataNodes do you have in your cluster, and what is your
> replication factor (dfs.replication in hdfs-site.xml)?
>
> Yusaku
>
> On Wed, Nov 26, 2014 at 11:11 AM, Alma Bob <al...@gmail.com> wrote:
>> Hi,
>>
>> I've been trying to remove nodes from the cluster and as it seems to me
>> the Datanode decommission does not move any data from the nodes. If I check
>> with fsck it reports missing blocks. I'm using Ambari 1.6.0. Is it supposed
>> to move data at all or I should take care of it?
>>
>> Best Regards,
>> Bob
>>
>>
>


Re: Decommission does not move data

Posted by Jaimin Jetly <ja...@hortonworks.com>.
Hi Alma,

In the ideal scenario, decommissioning is expected to move data off the
node being decommissioned.

Can you please tell us what the DataNode status was, according to the
NameNode JMX metrics, when you noticed this behavior? This information will
help in debugging the issue further.

To check the decommission status of a node via the NameNode JMX metrics,
look for the LiveNodes key at the following URL:
http://c6401.ambari.apache.org:50070/jmx

The LiveNodes key holds the status of each node in its "adminState"
attribute (In Service | Decommission In Progress | Decommissioned).
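
The LiveNodes check described above could be scripted roughly as follows.
This is only a sketch under assumptions: it assumes the /jmx page returns
JSON with a top-level "beans" list and that LiveNodes is a JSON-encoded
string mapping each DataNode host to its attributes, which is how Hadoop 2.x
NameNodes typically expose it. The function names, the bean name in the
sample, and the dn*.example.com hosts are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_jmx(url):
    """Download and parse a NameNode /jmx page (defined but not called here)."""
    with urlopen(url) as response:
        return json.load(response)


def datanode_admin_states(jmx_payload):
    """Return {hostname: adminState} from a parsed NameNode /jmx response."""
    for bean in jmx_payload.get("beans", []):
        live_nodes = bean.get("LiveNodes")
        if live_nodes is None:
            continue
        # Assumption: LiveNodes is a JSON string nested inside the JSON page.
        if isinstance(live_nodes, str):
            live_nodes = json.loads(live_nodes)
        return {host: info.get("adminState") for host, info in live_nodes.items()}
    return {}


# Hypothetical sample standing in for a real /jmx response.
sample_payload = {
    "beans": [
        {"name": "java.lang:type=Runtime"},
        {
            "name": "Hadoop:service=NameNode,name=NameNodeInfo",
            "LiveNodes": json.dumps({
                "dn1.example.com": {"adminState": "In Service"},
                "dn2.example.com": {"adminState": "Decommission In Progress"},
                "dn3.example.com": {"adminState": "Decommissioned"},
            }),
        },
    ]
}

states = datanode_admin_states(sample_payload)
# Hosts that are still copying blocks off; fsck may report under-replication.
pending = sorted(h for h, s in states.items() if s == "Decommission In Progress")
print(pending)
```

Against a live cluster, one would pass the result of
fetch_jmx("http://c6401.ambari.apache.org:50070/jmx") instead of the sample
payload and wait until no host is left in "Decommission In Progress" before
removing the nodes.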

Note:
Decommissioning can take a while even for a moderate amount of data, so it
is expected to take longer for 3TB of data.

If the DataNode status was "Decommission In Progress" at that time, then
this behavior is expected from HDFS, because not all block replicas have
been copied to other nodes yet.


Thanks

Jaimin Jetly




On Wed, Nov 26, 2014 at 12:15 PM, Yusaku Sako <yu...@hortonworks.com>
wrote:

> I see.  Decommission of DataNodes via Ambari should automatically
> start the process of moving off blocks to other remaining DataNodes to
> ensure that the replication factor of 3 is reached.
> How did you trigger decommission on the 5 DataNodes?
> Once you trigger decommission, Ambari should show the DataNodes as
> "Decommissioning" if you drill down to the Host Detail page of the
> said hosts.  Decommission process can take a long time, depending on
> the number of blocks involved.  You can also check the NameNode Web UI
> (available from QuickLinks) to verify that the DataNodes are indeed
> decommissioning.
>
> Yusaku
>
> On Wed, Nov 26, 2014 at 11:41 AM, Alma Bob <al...@gmail.com> wrote:
> > In this test I tried with 20 nodes with replication 3. I generated 3TB data
> > and started to decommission 5 nodes and the fsck reported as replication is
> > 3 but found block 2 in many cases.
> >
> > On November 26, 2014 at 8:34:20 PM, Yusaku Sako (yusaku@hortonworks.com)
> > wrote:
> >
> > How many DataNodes do you have in your cluster, and what is your
> > replication factor (dfs.replication in hdfs-site.xml)?
> >
> > Yusaku
> >
> > On Wed, Nov 26, 2014 at 11:11 AM, Alma Bob <al...@gmail.com> wrote:
> >> Hi,
> >>
> >> I've been trying to remove nodes from the cluster and as it seems to me
> >> the Datanode decommission does not move any data from the nodes. If I check
> >> with fsck it reports missing blocks. I'm using Ambari 1.6.0. Is it supposed
> >> to move data at all or I should take care of it?
> >>
> >> Best Regards,
> >> Bob
> >>
> >>
> >
>


Re: Decommission does not move data

Posted by Yusaku Sako <yu...@hortonworks.com>.
I see.  Decommissioning DataNodes via Ambari should automatically
start re-replicating their blocks to the remaining DataNodes so that
the replication factor of 3 is maintained.
How did you trigger decommission on the 5 DataNodes?
Once you trigger decommission, Ambari should show the DataNodes as
"Decommissioning" if you drill down to the Host Detail page of those
hosts.  The decommission process can take a long time, depending on
the number of blocks involved.  You can also check the NameNode Web UI
(available from QuickLinks) to verify that the DataNodes are indeed
decommissioning.

Yusaku

On Wed, Nov 26, 2014 at 11:41 AM, Alma Bob <al...@gmail.com> wrote:
> In this test I tried with 20 nodes with replication 3. I generated 3TB data
> and started to decommission 5 nodes and the fsck reported as replication is
> 3 but found block 2 in many cases.
>
> On November 26, 2014 at 8:34:20 PM, Yusaku Sako (yusaku@hortonworks.com)
> wrote:
>
> How many DataNodes do you have in your cluster, and what is your
> replication factor (dfs.replication in hdfs-site.xml)?
>
> Yusaku
>
> On Wed, Nov 26, 2014 at 11:11 AM, Alma Bob <al...@gmail.com> wrote:
>> Hi,
>>
>> I've been trying to remove nodes from the cluster and as it seems to me
>> the Datanode decommission does not move any data from the nodes. If I check
>> with fsck it reports missing blocks. I'm using Ambari 1.6.0. Is it supposed
>> to move data at all or I should take care of it?
>>
>> Best Regards,
>> Bob
>>
>>
>

Re: Decommission does not move data

Posted by Alma Bob <al...@gmail.com>.
In this test I used 20 nodes with a replication factor of 3. I generated 3TB of data and started to decommission 5 nodes; fsck reported that the target replication is 3 but that only 2 replicas were found for many blocks.

On November 26, 2014 at 8:34:20 PM, Yusaku Sako (yusaku@hortonworks.com) wrote:

How many DataNodes do you have in your cluster, and what is your 
replication factor (dfs.replication in hdfs-site.xml)? 

Yusaku 

On Wed, Nov 26, 2014 at 11:11 AM, Alma Bob <al...@gmail.com> wrote: 
> Hi, 
> 
> I've been trying to remove nodes from the cluster and as it seems to me the Datanode decommission does not move any data from the nodes. If I check with fsck it reports missing blocks. I'm using Ambari 1.6.0. Is it supposed to move data at all or I should take care of it? 
> 
> Best Regards, 
> Bob 
> 
> 


Re: Decommission does not move data

Posted by Yusaku Sako <yu...@hortonworks.com>.
How many DataNodes do you have in your cluster, and what is your
replication factor (dfs.replication in hdfs-site.xml)?

Yusaku

On Wed, Nov 26, 2014 at 11:11 AM, Alma Bob <al...@gmail.com> wrote:
> Hi,
>
> I've been trying to remove nodes from the cluster and as it seems to me the Datanode decommission does not move any data from the nodes. If I check with fsck it reports missing blocks. I'm using Ambari 1.6.0. Is it supposed to move data at all or I should take care of it?
>
> Best Regards,
> Bob
>
>
