Posted to hdfs-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2009/09/21 08:00:18 UTC

[jira] Created: (HDFS-637) DataNode sends a Success ack when block write fails

DataNode sends a Success ack when block write fails
---------------------------------------------------

                 Key: HDFS-637
                 URL: https://issues.apache.org/jira/browse/HDFS-637
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: data-node
            Reporter: Hairong Kuang
            Assignee: Hairong Kuang
            Priority: Blocker
             Fix For: 0.21.0


While working on HDFS-624, I saw TestFileAppend3#TC7 fail occasionally. After a lot of debugging, I found that the client unexpectedly received a response of "-2 SUCCESS SUCCESS", in which -2 is the packet sequence number. This happened in a pipeline of 2 datanodes, one of which had failed. It turned out that when the block receiver fails, it shuts itself down and interrupts the packet responder. The responder tries to detect the interruption with the condition "Thread.isInterrupted()", but unfortunately a thread's interrupt status is not set in some cases, as explained in the Thread#interrupt javadoc:

 If this thread is blocked in an invocation of the wait(), wait(long), or wait(long, int) methods of the Object class, or of the join(), join(long), join(long, int), sleep(long), or sleep(long, int), methods of this class, then its interrupt status will be cleared and it will receive an InterruptedException.

So the datanode does not detect the interruption and continues as if no error had occurred.
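
To make the javadoc behavior concrete, here is a minimal stand-alone sketch. It is not the DataNode code and not the attached patch; the names InterruptStatusDemo, Outcome and "responder" are made up for illustration. A thread blocked in wait() is interrupted, and the sketch records that isInterrupted() is false right after the InterruptedException is caught. Re-asserting the interrupt in the catch block is one common way to keep a later isInterrupted() check meaningful; whether the fix for this issue does that or something else is not stated here.

```java
public class InterruptStatusDemo {

    /** What the interrupted thread observed (hypothetical helper type). */
    static final class Outcome {
        volatile boolean sawException;       // wait() threw InterruptedException
        volatile boolean statusAfterCatch;   // isInterrupted() right after the catch
        volatile boolean statusAfterRestore; // isInterrupted() after re-asserting it
    }

    static Outcome interruptWaitingThread() throws InterruptedException {
        final Object lock = new Object();
        final Outcome outcome = new Outcome();

        Thread responder = new Thread(() -> {
            synchronized (lock) {
                try {
                    // Loop so a spurious wakeup cannot end the wait early.
                    while (true) {
                        lock.wait();
                    }
                } catch (InterruptedException e) {
                    outcome.sawException = true;
                    // The status was cleared when the exception was thrown.
                    outcome.statusAfterCatch =
                        Thread.currentThread().isInterrupted();
                    // Re-assert it so later isInterrupted() checks can see it.
                    Thread.currentThread().interrupt();
                    outcome.statusAfterRestore =
                        Thread.currentThread().isInterrupted();
                }
            }
        }, "responder");

        responder.start();
        // Wait until the thread is actually parked inside wait().
        while (responder.getState() != Thread.State.WAITING) {
            Thread.sleep(10);
        }
        responder.interrupt();
        responder.join();
        return outcome;
    }

    public static void main(String[] args) throws InterruptedException {
        Outcome o = interruptWaitingThread();
        System.out.println("InterruptedException thrown: " + o.sawException);
        System.out.println("isInterrupted() after catch: " + o.statusAfterCatch);
        System.out.println("isInterrupted() after re-interrupt: " + o.statusAfterRestore);
    }
}
```

A loop guarded only by isInterrupted() would keep running after the catch block in the first case, which matches the symptom described above: the interruption goes unnoticed unless the exception handler itself records it.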

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (HDFS-637) DataNode sends a Success ack when block write fails

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HDFS-637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang resolved HDFS-637.
--------------------------------

    Resolution: Fixed

I've just committed this.

> DataNode sends a Success ack when block write fails
> ---------------------------------------------------
>
>                 Key: HDFS-637
>                 URL: https://issues.apache.org/jira/browse/HDFS-637
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: interrupted.patch, interrupted1.patch
>
>
> While working on HDFS-624, I saw TestFileAppend3#TC7 fail occasionally. After a lot of debugging, I found that the client unexpectedly received a response of "-2 SUCCESS SUCCESS", in which -2 is the packet sequence number. This happened in a pipeline of 2 datanodes, one of which had failed. It turned out that when the block receiver fails, it shuts itself down and interrupts the packet responder. The responder tries to detect the interruption with the condition "Thread.isInterrupted()", but unfortunately a thread's interrupt status is not set in some cases, as explained in the Thread#interrupt javadoc:
>  If this thread is blocked in an invocation of the wait(), wait(long), or wait(long, int) methods of the Object class, or of the join(), join(long), join(long, int), sleep(long), or sleep(long, int), methods of this class, then its interrupt status will be cleared and it will receive an InterruptedException.
> So the datanode does not detect the interruption and continues as if no error had occurred.
