Posted to hdfs-dev@hadoop.apache.org by xiaohe lan <zo...@gmail.com> on 2015/03/13 12:55:53 UTC

How is the ack sent back upstream in a pipeline when writing data to HDFS

Hi experts,

When the HDFS client sends a packet of data to a DN in the pipeline, the
packet is then forwarded to the next DN in the pipeline. What confuses me is
when the ack from a DN in the pipeline is sent back. In which order? Is it
sent from the last DN to the first, or in some other way?

Thanks,
Xiaohe

Re: How is the ack sent back upstream in a pipeline when writing data to HDFS

Posted by Charles Lamb <cl...@cloudera.com>.
On 3/17/2015 10:53 AM, xiaohe lan wrote:
> Hi Charles,
>
> Thanks for pointing me to the doc; it really helps me a lot.
>
> I am confused by another problem while reading DFSOutputStream.java.
> When the packets of a block are being sent through the pipeline, why
> does DataStreamer wait until all acks of the packets have been received
> before the last packet is sent? I see that in DataStreamer's run, it
> waits for ackQueue's size == 0, then adds the last packet to ackQueue,
> then waits for ackQueue.size == 0 again, and finally closes the
> responseProcessor and the blockStream.
>
Xiaohe,

I assume you are referring to this code:

           if (one.isLastPacketInBlock()) {
             // wait for all data packets have been successfully acked
             synchronized (dataQueue) {
               while (!streamerClosed && !hasError &&
                   ackQueue.size() != 0 && dfsClient.clientRunning) {
                 try {
                   // wait for acks to arrive from datanodes
                   dataQueue.wait(1000);
                 } catch (InterruptedException  e) {
                   DFSClient.LOG.warn("Caught exception ", e);
                 }
               }
             }
             if (streamerClosed || hasError || !dfsClient.clientRunning) {
               continue;
             }
             stage = BlockConstructionStage.PIPELINE_CLOSE;
           }

This is just making sure that a block is completely written. There may 
be many packets in a block.
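
To make the sequence concrete, here is a minimal sketch of the end-of-block
handshake described above. This is not the real DFSOutputStream code; the
class and helper names are made up for illustration, and only the ordering
(drain the ack queue, send the last packet, wait for its ack, then close the
per-block resources) mirrors what the snippet above does.

    import java.util.ArrayDeque;
    import java.util.Queue;

    class StreamerSketch {
        // Hypothetical stand-ins for DataStreamer's queues and flags.
        private final Queue<Object> ackQueue = new ArrayDeque<>();
        private final Object dataQueue = new Object(); // monitor, as in the snippet above
        private volatile boolean streamerClosed = false;

        void finishBlock(Object lastPacket) throws InterruptedException {
            waitForEmptyAckQueue();          // 1. every earlier packet of the block is acked
            sendPacket(lastPacket);          // 2. send the last packet and enqueue it for acking
            waitForEmptyAckQueue();          // 3. the last packet itself is acked
            closeResponderAndBlockStream();  // 4. block is complete; tear down per-block resources
        }

        private void waitForEmptyAckQueue() throws InterruptedException {
            synchronized (dataQueue) {
                while (!streamerClosed && !ackQueue.isEmpty()) {
                    // acks arriving from the pipeline remove entries and notify dataQueue
                    dataQueue.wait(1000);
                }
            }
        }

        private void sendPacket(Object p) {
            synchronized (dataQueue) {
                ackQueue.add(p);
            }
            // the actual write to the pipeline would happen here
        }

        private void closeResponderAndBlockStream() {
            // in the real code: close the ResponseProcessor and the blockStream
        }
    }

The wait is on dataQueue even though the condition checks ackQueue, which
matches the snippet above: the thread that removes an acked packet notifies
the dataQueue monitor.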

Charles


Re: How is the ack sent back upstream in a pipeline when writing data to HDFS

Posted by xiaohe lan <zo...@gmail.com>.
Hi Charles,

Thanks for pointing me to the doc; it really helps me a lot.

I am confused by another problem while reading DFSOutputStream.java. When
the packets of a block are being sent through the pipeline, why does
DataStreamer wait until all acks of the packets have been received before
the last packet is sent? I see that in DataStreamer's run, it waits for
ackQueue's size == 0, then adds the last packet to ackQueue, then waits for
ackQueue.size == 0 again, and finally closes the responseProcessor and the
blockStream.

Thanks,
Xiaohe

On Fri, Mar 13, 2015 at 8:27 PM, Charles Lamb <cl...@cloudera.com> wrote:

> On 3/13/2015 7:55 AM, xiaohe lan wrote:
>
>> Hi experts,
>>
>> When the HDFS client sends a packet of data to a DN in the pipeline, the
>> packet is then forwarded to the next DN in the pipeline. What confuses me
>> is when the ack from a DN in the pipeline is sent back. In which order?
>> Is it sent from the last DN to the first, or in some other way?
>>
>> Thanks,
>> Xiaohe
>>
> Hi Xiaohe,
>
> Take a look at figure 3.2 in
> https://issues.apache.org/jira/secure/attachment/12445209/appendDesign3.pdf.
>
> IHTH.
>
> Charles
>
>

Re: How is the ack sent back upstream in a pipeline when writing data to HDFS

Posted by Charles Lamb <cl...@cloudera.com>.
On 3/13/2015 7:55 AM, xiaohe lan wrote:
> Hi experts,
>
> When the HDFS client sends a packet of data to a DN in the pipeline, the
> packet is then forwarded to the next DN in the pipeline. What confuses me
> is when the ack from a DN in the pipeline is sent back. In which order?
> Is it sent from the last DN to the first, or in some other way?
>
> Thanks,
> Xiaohe
Hi Xiaohe,

Take a look at figure 3.2 in 
https://issues.apache.org/jira/secure/attachment/12445209/appendDesign3.pdf.
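
The ordering is the key point in that figure: each DN forwards the packet
downstream and only acks to its own upstream after it has the ack from the
node below it, so the ack for a packet originates at the last DN and travels
last -> ... -> first -> client. Below is a toy simulation of that relay. It
is not the actual BlockReceiver/PacketResponder code, and it ignores the fact
that data transfer and ack handling run in separate threads; only the ack
direction is the point.

    import java.util.List;

    class AckOrderSketch {
        // Each "DN" forwards the packet downstream, waits for the downstream
        // ack, and only then acks to its own upstream.
        static boolean sendPacket(List<String> pipeline, int index, int seq) {
            String node = pipeline.get(index);
            System.out.println(node + " received packet " + seq);
            boolean downstreamOk = true;
            if (index + 1 < pipeline.size()) {
                downstreamOk = sendPacket(pipeline, index + 1, seq);
            }
            System.out.println(node + " acks packet " + seq + " upstream");
            return downstreamOk;
        }

        public static void main(String[] args) {
            boolean acked = sendPacket(List.of("DN1", "DN2", "DN3"), 0, 1);
            System.out.println("client sees ack for packet 1: " + acked);
            // Prints DN3's ack first, then DN2's, then DN1's, then the client's view.
        }
    }

In the real pipeline the packets are streamed rather than sent one at a time,
and each DN aggregates its own status with the downstream replies into a
single ack, but the upstream direction of the ack (last DN first) is the same.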

IHTH.

Charles