Posted to hdfs-dev@hadoop.apache.org by xiaohe lan <zo...@gmail.com> on 2015/03/13 12:55:53 UTC
How the ack sent back to upstream of a pipeline when write data to HDFS
Hi experts,
When the HDFS client sends a packet of data to a DN in the pipeline, the
packet is then forwarded to the next DN in the pipeline. What confuses me is
when the ack from a DN in the pipeline is sent back, and in what order. Is it
sent from the last DN back to the first, or in some other way?
Thanks,
Xiaohe
Re: How the ack sent back to upstream of a pipeline when write data to HDFS
Posted by Charles Lamb <cl...@cloudera.com>.
On 3/17/2015 10:53 AM, xiaohe lan wrote:
> Hi Charles,
>
> Thanks for pointing me to the doc, it really helps me a lot.
>
> I am confused by another problem when I read DFSOutputStream.java.
> When packets of a block are being sent through the pipeline, why does
> DataStreamer wait for all acks of those packets to be received
> before the last packet is sent? I see in DataStreamer's run, it will
> wait for ackQueue's size == 0, then add the last packet to ackQueue,
> then wait for ackQueue.size == 0 again, and finally close the
> responseProcessor and the blockStream.
>
Xiaohe,
I assume you are referring to this code:
if (one.isLastPacketInBlock()) {
  // wait for all data packets have been successfully acked
  synchronized (dataQueue) {
    while (!streamerClosed && !hasError &&
        ackQueue.size() != 0 && dfsClient.clientRunning) {
      try {
        // wait for acks to arrive from datanodes
        dataQueue.wait(1000);
      } catch (InterruptedException e) {
        DFSClient.LOG.warn("Caught exception ", e);
      }
    }
  }
  if (streamerClosed || hasError || !dfsClient.clientRunning) {
    continue;
  }
  stage = BlockConstructionStage.PIPELINE_CLOSE;
}
This is just making sure that a block is completely written. There may
be many packets in a block.
Charles
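To make the wait-for-acks step concrete, here is a minimal, hypothetical sketch in plain Java (not Hadoop code; the class and method names are illustrative). A writer queues packets on an ackQueue, a separate acker thread drains it, and the writer blocks until the queue is empty before marking the block closed, mirroring the bounded dataQueue.wait(1000) loop in DataStreamer.run() above.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class AckQueueSketch {
    private final Queue<Integer> ackQueue = new ArrayDeque<>();
    private volatile boolean blockClosed = false;

    // Writer side: a sent packet stays on the ackQueue until acked.
    public synchronized void send(int seqno) {
        ackQueue.add(seqno);
    }

    // Acker side: remove one packet and wake any thread waiting on the queue.
    public synchronized void ack() {
        ackQueue.poll();
        notifyAll();
    }

    // Writer side of PIPELINE_CLOSE: block until every in-flight packet
    // has been acked, using a bounded wait like dataQueue.wait(1000).
    public synchronized void waitForAllAcks() throws InterruptedException {
        while (!ackQueue.isEmpty()) {
            wait(1000);
        }
        blockClosed = true;
    }

    public boolean isBlockClosed() {
        return blockClosed;
    }

    public static void main(String[] args) throws InterruptedException {
        AckQueueSketch s = new AckQueueSketch();
        for (int i = 0; i < 5; i++) s.send(i);
        Thread acker = new Thread(() -> {
            for (int i = 0; i < 5; i++) s.ack();
        });
        acker.start();
        s.waitForAllAcks();   // returns only once all 5 packets are acked
        acker.join();
        System.out.println("block closed: " + s.isBlockClosed());
        // prints: block closed: true
    }
}
```

The point of the sketch is only the ordering guarantee: the last packet (and the close of the block) cannot proceed while any earlier packet is still unacked.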
Re: How the ack sent back to upstream of a pipeline when write data to HDFS
Posted by xiaohe lan <zo...@gmail.com>.
Hi Charles,
Thanks for pointing me to the doc, it really helps me a lot.
I am confused by another problem when I read DFSOutputStream.java. When
packets of a block are being sent through the pipeline, why does DataStreamer
wait for all acks of those packets to be received
before the last packet is sent? I see in DataStreamer's run, it will wait
for ackQueue's size == 0, then add the last packet to ackQueue, then wait
for ackQueue.size == 0 again, and finally close the
responseProcessor and the blockStream.
Thanks,
Xiaohe
On Fri, Mar 13, 2015 at 8:27 PM, Charles Lamb <cl...@cloudera.com> wrote:
> On 3/13/2015 7:55 AM, xiaohe lan wrote:
>
>> Hi experts,
>>
>> When the HDFS client sends a packet of data to a DN in the pipeline, the
>> packet is then forwarded to the next DN in the pipeline. What confuses me
>> is when the ack from a DN in the pipeline is sent back, and in what order.
>> Is it sent from the last DN back to the first, or in some other way?
>>
>> Thanks,
>> Xiaohe
>>
> Hi Xiaohe,
>
> Take a look at figure 3.2 in
> https://issues.apache.org/jira/secure/attachment/12445209/appendDesign3.pdf.
>
> IHTH.
>
> Charles
>
>
Re: How the ack sent back to upstream of a pipeline when write data to HDFS
Posted by Charles Lamb <cl...@cloudera.com>.
On 3/13/2015 7:55 AM, xiaohe lan wrote:
> Hi experts,
>
> When the HDFS client sends a packet of data to a DN in the pipeline, the
> packet is then forwarded to the next DN in the pipeline. What confuses me
> is when the ack from a DN in the pipeline is sent back, and in what order.
> Is it sent from the last DN back to the first, or in some other way?
>
> Thanks,
> Xiaohe
Hi Xiaohe,
Take a look at figure 3.2 in
https://issues.apache.org/jira/secure/attachment/12445209/appendDesign3.pdf.
IHTH.
Charles
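For readers without the PDF handy: the design doc's figure shows data flowing client -> DN0 -> DN1 -> DN2, with the ack for each packet originating at the last DN and relaying back upstream through each node. The hedged sketch below (plain Java; names are illustrative and this is not Hadoop's actual wire format) models that round trip: the ack list records the order in which nodes contribute their status as the ack travels back, last DN first.

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineAckSketch {
    // Simulate sending one packet down the pipeline starting at node i.
    // The recursive call stands in for forwarding the packet downstream
    // and blocking until the downstream ack arrives; on the way back up,
    // each node appends its own status before relaying the ack upstream.
    static List<String> writePacket(String[] pipeline, int i) {
        List<String> ack;
        if (i == pipeline.length - 1) {
            ack = new ArrayList<>();            // last DN originates the ack
        } else {
            ack = writePacket(pipeline, i + 1); // forward, await downstream ack
        }
        ack.add(pipeline[i] + ":SUCCESS");      // add own status, relay upstream
        return ack;
    }

    public static void main(String[] args) {
        String[] pipeline = {"DN0", "DN1", "DN2"};
        // The ack travels back in last-to-first order:
        System.out.println(writePacket(pipeline, 0));
        // prints: [DN2:SUCCESS, DN1:SUCCESS, DN0:SUCCESS]
    }
}
```

So to the original question: the ack is triggered at the last DN in the pipeline and flows back through each upstream DN to the client, which is why the list above reads DN2 first.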