Posted to hdfs-dev@hadoop.apache.org by "He Tianyi (JIRA)" <ji...@apache.org> on 2016/06/06 03:56:59 UTC
[jira] [Created] (HDFS-10490) Client may never recovery replica after a timeout during sending packet

He Tianyi created HDFS-10490:
--------------------------------

             Summary: Client may never recovery replica after a timeout during sending packet
                 Key: HDFS-10490
                 URL: https://issues.apache.org/jira/browse/HDFS-10490
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
    Affects Versions: 2.6.0
            Reporter: He Tianyi


For a newly created replica, a meta file is created in the constructor of {{BlockReceiver}} (for the {{WRITE_BLOCK}} op). Its header is written lazily (buffered in memory first by {{BufferedOutputStream}}). 
If the following packets fail to be delivered (e.g. under extreme network conditions), the header may never be flushed until the stream is closed. 
However, {{BlockReceiver}} does not call close until block receiving has finished or an exception is encountered. Also, under extreme network conditions, both RST and FIN may fail to arrive in time. 
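
The buffering effect described above can be reproduced outside HDFS. The following is only a minimal sketch, not the actual {{BlockReceiver}} code: the class name {{LazyMetaHeaderDemo}}, the temp file, and the 7-byte header layout (2-byte version, 1-byte checksum type, 4-byte bytes-per-checksum) are assumptions made for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; it does not use any HDFS code.
class LazyMetaHeaderDemo {

    /** Returns {file size before flush, file size after flush}. */
    static long[] metaSizeBeforeAndAfterFlush() {
        try {
            Path meta = Files.createTempFile("blk_demo", ".meta");
            try (DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(Files.newOutputStream(meta)))) {
                // Assumed header layout, 7 bytes in total.
                out.writeShort(1);    // version
                out.writeByte(2);     // checksum type id (illustrative value)
                out.writeInt(4096);   // bytes per checksum

                // The header still sits in the in-memory buffer here,
                // so any other reader of the file sees it as empty.
                long before = Files.size(meta);

                out.flush();          // pushes the buffered header to the file
                long after = Files.size(meta);
                return new long[] {before, after};
            } finally {
                Files.deleteIfExists(meta);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = metaSizeBeforeAndAfterFlush();
        System.out.println("before flush: " + sizes[0] + " bytes, after flush: " + sizes[1] + " bytes");
    }
}
```

If the writer stalls between the header write and the next flush, the file stays in the "before" state for as long as the stream remains open.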

In this case, if the client tries to initiate a {{transferBlock}} to a new datanode (in {{addDatanode2ExistingPipeline}}), the existing datanode will see an empty meta file if its {{BlockReceiver}} did not close in time. 
Then, after HDFS-3429, a default {{DataChecksum}} (NULL, 512) will be used during the transfer. So when the client then tries to recover the pipeline after the transfer has completed, it may encounter the following exception:
{noformat}
java.io.IOException: Client requested checksum DataChecksum(type=CRC32C, chunkSize=4096) when appending to an existing block with different chunk size: DataChecksum(type=NULL, chunkSize=512)
        at org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.createStreams(ReplicaInPipeline.java:230)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:226)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:798)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:166)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:76)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:243)
        at java.lang.Thread.run(Thread.java:745)
{noformat}
This repeats until the datanode replacement policy is exhausted.

Also note that, with bad luck (as in my case), 20k clients are all doing this at the same time. To some extent this amounts to a DDoS attack on the NameNode (because of the getAdditionalDataNode calls).

I suggest we flush immediately after the header is written, so that nobody ever sees an empty meta file, which avoids the issue.
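
A rough sketch of the suggested change follows. It is not a patch against {{BlockReceiver}}: the class {{FlushHeaderDemo}}, the methods {{writeHeader}}, {{checksumSeenByReader}} and {{observe}}, and the 7-byte header layout are all hypothetical. The reader falls back to (NULL, 512) on an empty meta file only to mimic the behaviour described above for HDFS-3429.

```java
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; it does not use any HDFS code.
class FlushHeaderDemo {
    static final int HEADER_LEN = 7; // assumed: 2-byte version + 1-byte type + 4-byte chunk size

    /** Writes the assumed header; flushes right away only when flushNow is true. */
    static void writeHeader(DataOutputStream out, boolean flushNow) throws IOException {
        out.writeShort(1);    // version
        out.writeByte(2);     // checksum type id, treated as CRC32C in this sketch
        out.writeInt(4096);   // bytes per checksum
        if (flushNow) {
            out.flush();      // the suggested fix: make the header visible immediately
        }
    }

    /** What a second reader concludes from the on-disk meta file. */
    static String checksumSeenByReader(Path meta) throws IOException {
        if (Files.size(meta) < HEADER_LEN) {
            return "NULL/512"; // mimics the default used for an empty meta file
        }
        try (DataInputStream in = new DataInputStream(Files.newInputStream(meta))) {
            in.readShort();                  // skip version
            int type = in.readByte();
            int bytesPerChecksum = in.readInt();
            return (type == 2 ? "CRC32C" : "type" + type) + "/" + bytesPerChecksum;
        }
    }

    /** Writes the header, then reads the file while the writer is still open. */
    static String observe(boolean flushNow) {
        try {
            Path meta = Files.createTempFile("blk_demo", ".meta");
            try (DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(Files.newOutputStream(meta)))) {
                writeHeader(out, flushNow);
                return checksumSeenByReader(meta);
            } finally {
                Files.deleteIfExists(meta);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without flush: " + observe(false));
        System.out.println("with flush:    " + observe(true));
    }
}
```

In this sketch the extra flush costs one small write per new replica, and in exchange a concurrent reader can never observe a header-less meta file while the writer is stalled.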
