Posted to user@hadoop.apache.org by Arinto Murdopo <ar...@gmail.com> on 2013/10/10 09:32:20 UTC

Intermittent DataStreamer Exception while appending to file inside HDFS

Hi there,

I get the following exception while appending to an existing file in my
HDFS. The error appears intermittently: when it does not show up, the
append succeeds; when it does, the append fails.

Here is the error: https://gist.github.com/arinto/d37a56f449c61c9d1d9c
For your convenience, here it is:

13/10/10 14:17:30 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Failed to add a datanode.  User may turn off this
feature by setting
dfs.client.block.write.replace-datanode-on-failure.policy in
configuration, where the current policy is DEFAULT.  (Nodes:
current=[10.0.106.82:50010, 10.0.106.81:50010],
original=[10.0.106.82:50010, 10.0.106.81:50010])
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:838)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:934)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
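
For reference, the append itself goes through the standard FileSystem API;
below is a minimal sketch of the kind of client code involved (the namenode
URI and file path are placeholders, not our real ones):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendExample {
    public static void main(String[] args) throws Exception {
        // hdfs-site.xml / core-site.xml on the classpath supply the cluster settings.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        // Reopening an existing file for append makes the client rebuild the
        // write pipeline for the last block; that is where the DataStreamer
        // warning above is logged.
        FSDataOutputStream out = fs.append(new Path("/data/example.log"));
        out.write("appended line\n".getBytes("UTF-8"));
        out.hsync(); // push the data through the datanode pipeline
        out.close();
        fs.close();
    }
}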

Some configuration files:

1. hdfs-site.xml:
https://gist.github.com/arinto/f5f1522a6f6994ddfc17#file-hdfs-append-datastream-exception-hdfs-site-xml

2. core-site.xml:
https://gist.github.com/arinto/0c6f40872181fe26f8b1#file-hdfs-append-datastream-exception-core-site-xml

So, any idea how to solve this issue?

Some links that I've found (but unfortunately they do not help):
1. StackOverflow (http://stackoverflow.com/questions/15347799/java-io-ioexception-failed-to-add-a-datanode-hdfs-hadoop):
our replication factor is 3 and we've never changed it since we set up
the cluster.
2. Impala-User mailing list
(https://groups.google.com/a/cloudera.org/forum/#!searchin/impala-user/DataStreamer$20exception/impala-user/u2CN163Cyfc/_OcRqBYL2B4J):
the error there is due to a replication factor of 1; in our case we're
using a replication factor of 3.

Best regards,

Arinto
www.otnira.com

RE: Intermittent DataStreamer Exception while appending to file inside HDFS

Posted by Uma Maheswara Rao G <ma...@huawei.com>.
Hi Arinto,

You can check the 3rd DN's logs for any issues (lack of space, for example) that could have kept that node from being selected for the write.

> Does it mean that one of the datanodes was unreachable when we try to append into the files?
That node was not selected for the write in the first place. If it had failed after being selected for the write, you would have hit this error during pipeline recovery instead.
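
One way to cross-check, from the client side, which datanodes the NameNode
reports and how much space they have left is to ask for the datanode report.
A minimal sketch (the namenode URI is a placeholder for your cluster):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class ListDatanodes {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), new Configuration());
        DistributedFileSystem dfs = (DistributedFileSystem) fs;
        // One entry per datanode known to the NameNode, with capacity,
        // remaining space and last-contact time.
        for (DatanodeInfo dn : dfs.getDataNodeStats()) {
            System.out.println(dn.getDatanodeReport());
            System.out.println();
        }
        fs.close();
    }
}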

Regards,
Uma

From: Arinto Murdopo [mailto:arinto@gmail.com]
Sent: 11 October 2013 08:48
To: user@hadoop.apache.org
Subject: Re: Intermittent DataStreamer Exception while appending to file inside HDFS

Thank you for the comprehensive answer,
When I inspect our NameNode UI, I see that 3 datanodes are up.
However, as you mentioned, the log only shows 2 datanodes. Does that mean one of the datanodes was unreachable when we tried to append to the files?
Best regards,


Arinto
www.otnira.com

On Thu, Oct 10, 2013 at 4:57 PM, Uma Maheswara Rao G <ma...@huawei.com> wrote:
Hi Arinto,

Please disable this feature with smaller clusters: dfs.client.block.write.replace-datanode-on-failure.policy
The reason for this exception is that you have replication set to 3, but from the logs it looks like you have only 2 nodes in the cluster. When the pipeline is first created we do not verify whether the pipeline DNs satisfy the replication factor; the property above only controls replacing a DN on failure. However, we additionally take advantage of that check when we reopen the pipeline for append. So here, unfortunately, the existing DNs cannot meet the replication factor and the client tries to add another node. Since the cluster has no extra nodes beyond the ones already selected, this fails. With the current configuration you cannot append.

Also please take a look at the default configuration description:
<name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
  <description>
    If there is a datanode/network failure in the write pipeline,
    DFSClient will try to remove the failed datanode from the pipeline
    and then continue writing with the remaining datanodes. As a result,
    the number of datanodes in the pipeline is decreased.  The feature is
    to add new datanodes to the pipeline.

    This is a site-wide property to enable/disable the feature.

    When the cluster size is extremely small, e.g. 3 nodes or less, cluster
    administrators may want to set the policy to NEVER in the default
    configuration file or disable this feature.  Otherwise, users may
    experience an unusually high rate of pipeline failures since it is
    impossible to find new datanodes for replacement.

    See also dfs.client.block.write.replace-datanode-on-failure.policy
  </description>


Set this configuration to false on your client side.

Regards,
Uma


From: Arinto Murdopo [mailto:arinto@gmail.com]
Sent: 10 October 2013 13:02
To: user@hadoop.apache.org
Subject: Intermittent DataStreamer Exception while appending to file inside HDFS

Hi there,
I get the following exception while appending to an existing file in my HDFS. The error appears intermittently: when it does not show up, the append succeeds; when it does, the append fails.
Here is the error: https://gist.github.com/arinto/d37a56f449c61c9d1d9c
For your convenience, here it is:

13/10/10 14:17:30 WARN hdfs.DFSClient: DataStreamer Exception

java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[10.0.106.82:50010, 10.0.106.81:50010], original=[10.0.106.82:50010, 10.0.106.81:50010])

               at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)

               at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:838)

               at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:934)

               at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)

Some configuration files:

1. hdfs-site.xml: https://gist.github.com/arinto/f5f1522a6f6994ddfc17#file-hdfs-append-datastream-exception-hdfs-site-xml

2. core-site.xml: https://gist.github.com/arinto/0c6f40872181fe26f8b1#file-hdfs-append-datastream-exception-core-site-xml

So, any idea how to solve this issue?
Some links that I've found (but unfortunately they do not help):
1. StackOverflow (http://stackoverflow.com/questions/15347799/java-io-ioexception-failed-to-add-a-datanode-hdfs-hadoop): our replication factor is 3 and we've never changed it since we set up the cluster.
2. Impala-User mailing list (https://groups.google.com/a/cloudera.org/forum/#!searchin/impala-user/DataStreamer$20exception/impala-user/u2CN163Cyfc/_OcRqBYL2B4J): the error there is due to a replication factor of 1; in our case we're using a replication factor of 3.

Best regards,

Arinto
www.otnira.com


Re: Intermittent DataStreamer Exception while appending to file inside HDFS

Posted by Arinto Murdopo <ar...@gmail.com>.
Thank you for the comprehensive answer,

When I inspect our NameNode UI, I see that 3 datanodes are up.
However, as you mentioned, the log only shows 2 datanodes. Does that
mean one of the datanodes was unreachable when we tried to append to
the files?

Best regards,


Arinto
www.otnira.com


On Thu, Oct 10, 2013 at 4:57 PM, Uma Maheswara Rao G
<ma...@huawei.com> wrote:

> Hi Arinto,
>
> Please disable this feature with smaller clusters:
> dfs.client.block.write.replace-datanode-on-failure.policy
>
> The reason for this exception is that you have replication set to 3, but
> from the logs it looks like you have only 2 nodes in the cluster. When the
> pipeline is first created we do not verify whether the pipeline DNs satisfy
> the replication factor; the property above only controls replacing a DN on
> failure. However, we additionally take advantage of that check when we
> reopen the pipeline for append. So here, unfortunately, the existing DNs
> cannot meet the replication factor and the client tries to add another
> node. Since the cluster has no extra nodes beyond the ones already
> selected, this fails. With the current configuration you cannot append.
>
> Also please take a look at the default configuration description:
>
> <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
>   <value>true</value>
>   <description>
>     If there is a datanode/network failure in the write pipeline,
>     DFSClient will try to remove the failed datanode from the pipeline
>     and then continue writing with the remaining datanodes. As a result,
>     the number of datanodes in the pipeline is decreased.  The feature is
>     to add new datanodes to the pipeline.
>
>     This is a site-wide property to enable/disable the feature.
>
>     When the cluster size is extremely small, e.g. 3 nodes or less, cluster
>     administrators may want to set the policy to NEVER in the default
>     configuration file or disable this feature.  Otherwise, users may
>     experience an unusually high rate of pipeline failures since it is
>     impossible to find new datanodes for replacement.
>
>     See also dfs.client.block.write.replace-datanode-on-failure.policy
>   </description>
>
> Set this configuration to false on your client side.
>
> Regards,
> Uma
>
> From: Arinto Murdopo [mailto:arinto@gmail.com]
> Sent: 10 October 2013 13:02
> To: user@hadoop.apache.org
> Subject: Intermittent DataStreamer Exception while appending to file inside HDFS
>
> Hi there,
>
> I get the following exception while appending to an existing file in my
> HDFS. The error appears intermittently: when it does not show up, the
> append succeeds; when it does, the append fails.
>
> Here is the error: https://gist.github.com/arinto/d37a56f449c61c9d1d9c
> For your convenience, here it is:
>
> 13/10/10 14:17:30 WARN hdfs.DFSClient: DataStreamer Exception
> java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[10.0.106.82:50010, 10.0.106.81:50010], original=[10.0.106.82:50010, 10.0.106.81:50010])
>                at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)
>                at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:838)
>                at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:934)
>                at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
>
> Some configuration files:
>
> 1. hdfs-site.xml: https://gist.github.com/arinto/f5f1522a6f6994ddfc17#file-hdfs-append-datastream-exception-hdfs-site-xml
>
> 2. core-site.xml: https://gist.github.com/arinto/0c6f40872181fe26f8b1#file-hdfs-append-datastream-exception-core-site-xml
>
> So, any idea how to solve this issue?
>
> Some links that I've found (but unfortunately they do not help):
> 1. StackOverflow (http://stackoverflow.com/questions/15347799/java-io-ioexception-failed-to-add-a-datanode-hdfs-hadoop):
> our replication factor is 3 and we've never changed it since we set up
> the cluster.
> 2. Impala-User mailing list
> (https://groups.google.com/a/cloudera.org/forum/#!searchin/impala-user/DataStreamer$20exception/impala-user/u2CN163Cyfc/_OcRqBYL2B4J):
> the error there is due to a replication factor of 1; in our case we're
> using a replication factor of 3.
>
> Best regards,
>
> Arinto
> www.otnira.com

RE: Intermittent DataStreamer Exception while appending to file inside HDFS

Posted by Uma Maheswara Rao G <ma...@huawei.com>.
Hi Arinto,

Please disable this feature with smaller clusters: dfs.client.block.write.replace-datanode-on-failure.policy
The reason for this exception is that you have replication set to 3, but from the logs it looks like you have only 2 nodes in the cluster. When the pipeline is first created we do not verify whether the pipeline DNs satisfy the replication factor; the property above only controls replacing a DN on failure. However, we additionally take advantage of that check when we reopen the pipeline for append. So here, unfortunately, the existing DNs cannot meet the replication factor and the client tries to add another node. Since the cluster has no extra nodes beyond the ones already selected, this fails. With the current configuration you cannot append.

Also please take a look at the default configuration description:
<name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
  <description>
    If there is a datanode/network failure in the write pipeline,
    DFSClient will try to remove the failed datanode from the pipeline
    and then continue writing with the remaining datanodes. As a result,
    the number of datanodes in the pipeline is decreased.  The feature is
    to add new datanodes to the pipeline.

    This is a site-wide property to enable/disable the feature.

    When the cluster size is extremely small, e.g. 3 nodes or less, cluster
    administrators may want to set the policy to NEVER in the default
    configuration file or disable this feature.  Otherwise, users may
    experience an unusually high rate of pipeline failures since it is
    impossible to find new datanodes for replacement.

    See also dfs.client.block.write.replace-datanode-on-failure.policy
  </description>


Set this configuration to false on your client side.
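
As a sketch, a client-side override could look like the following; the same
keys can equally be set in the client's hdfs-site.xml (the namenode URI and
file path are placeholders):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendWithoutDatanodeReplacement {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Turn the replace-datanode-on-failure feature off for this client.
        conf.setBoolean("dfs.client.block.write.replace-datanode-on-failure.enable", false);
        // Alternative from the description above: keep the feature enabled but
        // never try to replace a datanode.
        // conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        fs.append(new Path("/data/example.log")).close();
        fs.close();
    }
}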

Regards,
Uma


From: Arinto Murdopo [mailto:arinto@gmail.com]
Sent: 10 October 2013 13:02
To: user@hadoop.apache.org
Subject: Intermittent DataStreamer Exception while appending to file inside HDFS

Hi there,
I get the following exception while appending to an existing file in my HDFS. The error appears intermittently: when it does not show up, the append succeeds; when it does, the append fails.
Here is the error: https://gist.github.com/arinto/d37a56f449c61c9d1d9c
For your convenience, here it is:

13/10/10 14:17:30 WARN hdfs.DFSClient: DataStreamer Exception

java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[10.0.106.82:50010, 10.0.106.81:50010], original=[10.0.106.82:50010, 10.0.106.81:50010])

               at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)

               at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:838)

               at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:934)

               at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)

Some configuration files:

1. hdfs-site.xml: https://gist.github.com/arinto/f5f1522a6f6994ddfc17#file-hdfs-append-datastream-exception-hdfs-site-xml

2. core-site.xml: https://gist.github.com/arinto/0c6f40872181fe26f8b1#file-hdfs-append-datastream-exception-core-site-xml

So, any idea how to solve this issue?
Some links that I've found (but unfortunately they do not help):
1. StackOverflow (http://stackoverflow.com/questions/15347799/java-io-ioexception-failed-to-add-a-datanode-hdfs-hadoop): our replication factor is 3 and we've never changed it since we set up the cluster.
2. Impala-User mailing list (https://groups.google.com/a/cloudera.org/forum/#!searchin/impala-user/DataStreamer$20exception/impala-user/u2CN163Cyfc/_OcRqBYL2B4J): the error there is due to a replication factor of 1; in our case we're using a replication factor of 3.

Best regards,

Arinto
www.otnira.com
