Posted to common-user@hadoop.apache.org by Pallavi Palleti <pa...@corp.aol.com> on 2010/03/30 13:31:22 UTC
Query over DFSClient
Hi,
Could someone kindly let me know whether the DFSClient takes care of
datanode failures and attempts to write to another datanode if the primary
datanode (and the replica datanodes) fail. I looked into the source code
of DFSClient and figured out that it attempts to write to one of the
datanodes in the pipeline and fails if it cannot write to at least one of
them. However, I am not sure, as I haven't explored it fully. If so, is
there a way of querying the namenode for different datanodes in the
case of failure? I am sure the Mapper does something similar
(attempting to fetch a different datanode from the namenode) if it fails
to write to the datanodes. Kindly let me know.
Thanks
Pallavi
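[Editor's note: the pipeline behavior asked about above can be sketched roughly as follows. This is a hypothetical illustration of the concept; the `Datanode` interface and `writeThroughPipeline` method are stand-ins, not the real HDFS DataXceiver protocol.]

```java
// Hypothetical sketch of an HDFS-style write pipeline: the client's
// packet must be stored by every datanode in the pipeline; a single
// failing node fails the write, which is what prompts recovery.
import java.util.List;

class PipelineSketch {
    interface Datanode {
        boolean receivePacket(byte[] packet); // true = stored and acked
    }

    // Push one packet through the pipeline; fail if any node fails.
    static boolean writeThroughPipeline(List<Datanode> pipeline, byte[] packet) {
        for (Datanode dn : pipeline) {
            if (!dn.receivePacket(packet)) {
                return false; // one bad node fails the whole pipeline write
            }
        }
        return true; // all replicas acknowledged the packet
    }
}
```

Whether the client then goes back to the namenode for a fresh pipeline is exactly the question discussed in the replies below.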
Re: Query over DFSClient
Posted by Hairong Kuang <ha...@yahoo-inc.com>.
> However, I couldn't figure out where this exception is thrown.
After lastException is set, the exception is thrown the next time
FSOutputStream.write or close is called.
Hairong
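[Editor's note: the deferred-exception pattern Hairong describes can be sketched as below. This is a hypothetical illustration, not the actual DFSClient source: the class and method names are made up, but the shape mirrors the described behavior, where the background streamer thread records a failure and the client thread only sees it on its next write() or close() via an isClosed()-style check.]

```java
// Sketch of deferring a background thread's failure to the caller:
// the streamer thread records the error; the client thread rethrows
// it on the next write() or close().
import java.io.IOException;

class SketchOutputStream {
    private volatile IOException lastException; // set by the streamer thread
    private volatile boolean closed;

    // Called by the (hypothetical) streamer thread when recovery fails.
    void streamerFailed(IOException e) {
        lastException = e;
        closed = true;
    }

    // Mirrors an isClosed()-style check: rethrow the recorded error.
    private void checkOpen() throws IOException {
        if (closed) {
            IOException e = lastException;
            throw (e != null) ? e : new IOException("stream closed");
        }
    }

    public void write(byte[] b) throws IOException {
        checkOpen(); // a failure recorded earlier surfaces here...
        // ... otherwise enqueue the packet for the streamer thread ...
    }

    public void close() throws IOException {
        checkOpen(); // ... or here, if no write happened in between
        closed = true;
    }
}
```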
On 3/31/10 6:01 AM, "Ankur C. Goel" <ga...@yahoo-inc.com> wrote:
> Pallavi,
> If a DFSClient is not able to write a block to any of the
> datanodes given by the namenode, it retries N times before aborting.
> See - https://issues.apache.org/jira/browse/HDFS-167
> This should be handled by the application, as it indicates that something is
> seriously wrong with your cluster.
>
> Hope this helps
> -@nkur
>
> On 3/31/10 4:59 PM, "Pallavi Palleti" <pa...@corp.aol.com> wrote:
>
> Hi,
>
> I am looking into the hadoop-20 source code for the issue below. From DFSClient,
> I could see that once the datanodes given by the namenode are not reachable,
> it sets the "lastException" variable to an error message saying "recovery
> from primary datanode failed N times, aborting.." (line 2546 in
> processDataNodeError). However, I couldn't figure out where this
> exception is thrown. I can see the throw statement in isClosed(), but
> I cannot trace the exact sequence from the Streamer exiting with
> lastException set to the isClosed() method call. It would be great if
> someone could shed some light on this. I am essentially trying to find
> out whether the DFSClient goes back to the namenode when all the
> datanodes the namenode previously gave for a data block have failed.
>
> Thanks
> Pallavi
>
>
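[Editor's note: the N-retry behavior Ankur describes can be sketched as below. All names here — the `Namenode` and `BlockWriter` interfaces, `locateDatanodes`, `writeWithRetries` — are hypothetical stand-ins for illustration, not the real ClientProtocol API.]

```java
// Sketch of retry-then-abort: on a pipeline failure, ask the namenode
// for a fresh set of datanodes and retry; after maxRetries failures,
// abort with an exception the application must handle.
import java.io.IOException;
import java.util.List;

class RetrySketch {
    interface Namenode {
        List<String> locateDatanodes(); // fresh pipeline for the block
    }

    interface BlockWriter {
        boolean write(List<String> datanodes, byte[] block); // true = success
    }

    static void writeWithRetries(Namenode nn, BlockWriter writer,
                                 byte[] block, int maxRetries) throws IOException {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            List<String> pipeline = nn.locateDatanodes();
            if (writer.write(pipeline, block)) {
                return; // success on this attempt
            }
        }
        // Mirrors "aborting" after N failed recoveries; as noted above,
        // this usually means something is seriously wrong with the cluster.
        throw new IOException("recovery failed " + maxRetries + " times, aborting");
    }
}
```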