Posted to common-dev@hadoop.apache.org by Sangmin Lee <sa...@gmail.com> on 2009/05/20 06:59:52 UTC

Question regarding HDFS Recovery

Hi all,

I have a question regarding the HDFS recovery mechanism.

I see that INodeFileUnderConstruction has a "targets" field that stores the
list of datanodes holding its last block.
However, I don't see it being used anywhere, except that the
"internalReleaseLease" function checks the length of the datanode list.
Is there any use of the "targets" field other than checking its
length?

Could anyone shed some light on this?

Thanks,
Sangmin

Re: Question regarding HDFS Recovery

Posted by Dhruba Borthakur <dh...@gmail.com>.
If the client and all the datanode(s) for a block are dead, then the file is
corrupt. It cannot be recovered and the lease cannot be reclaimed. Is that a
problem?

>  your replication factor is 3 and the NN's minReplication is also 3.

In the current trunk, the NN guarantees to keep blocks replicated only after
a file is closed. So, if you think that replicas could get lost while the
block is being written (i.e. the file is not yet closed by the writer), then
you should set minReplication accordingly. This behaviour could improve in
the future if the client could replace lost replicas, but the current code
does not do that.

thanks,
dhruba



On Wed, May 20, 2009 at 11:09 PM, Sangmin Lee <sa...@gmail.com> wrote:

> On Wed, May 20, 2009 at 5:43 PM, Dhruba Borthakur <dh...@gmail.com> wrote:
>
> > > What if all datanodes in INodeFileUnderConstruction targets are dead ?
> >
> > If all datanodes in a pipeline are dead, than that file cannot be recovered
> > at all. This is expected and most file-systems behave this way when the
> > underlying storage goes bad.
>
>
> Yeah, I understand that. But I don't see how the lease will be removed.
> That is, when the client and all datanodes are dead, I don't see any code to
> handle this.
>
> Apart from this, I have another question regarding append.
> Suppose that you are trying to append to a file.
> And your replication factor is 3 and the NN's minReplication is also 3.
> As a part of appending, client asks datanodes (which store the last block)
> to sync but one of them fails.
> The primary DN will do commitBlockSynchronisation with only two DNs.
> (I believe the NN should do something at this point since it will never
> receive enough blockreceived msgs)
> And Client also proceeds with two DNs.
> Then later, when client wants to allocate another block, it will get
> NotReplicatedYetException.
>
> Thanks,
> Sangmin

Re: Question regarding HDFS Recovery

Posted by Sangmin Lee <sa...@gmail.com>.
On Wed, May 20, 2009 at 5:43 PM, Dhruba Borthakur <dh...@gmail.com> wrote:

> > What if all datanodes in INodeFileUnderConstruction targets are dead ?
>
> If all datanodes in a pipeline are dead, than that file cannot be recovered
> at all. This is expected and most file-systems behave this way when the
> underlying storage goes bad.


Yeah, I understand that. But I don't see how the lease will be removed.
That is, when the client and all datanodes are dead, I don't see any code to
handle this.

Apart from this, I have another question regarding append.
Suppose that you are appending to a file.
Your replication factor is 3 and the NN's minReplication is also 3.
As part of the append, the client asks the datanodes that store the last
block to sync, but one of them fails.
The primary DN will then call commitBlockSynchronization with only two DNs.
(I believe the NN should do something at this point, since it will never
receive enough blockReceived messages.)
The client also proceeds with two DNs.
Later, when the client wants to allocate another block, it will get a
NotReplicatedYetException.
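
The scenario above can be modelled with a toy check. This is only a sketch
under stated assumptions, not the NameNode's actual code: the class
MinReplicationSketch, the method checkCanAllocate, and the local
NotReplicatedYetException stand-in are all hypothetical names. It assumes the
NN refuses to allocate a new block until the previous block has been reported
by at least minReplication datanodes.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of a minimum-replication check on block allocation (hypothetical names). */
public class MinReplicationSketch {
    /** Local stand-in for the exception named in the thread, not the Hadoop class. */
    static class NotReplicatedYetException extends Exception {
        NotReplicatedYetException(String msg) { super(msg); }
    }

    /**
     * Allows a new block only if the previous block has been reported by at
     * least minReplication datanodes (an assumption of this sketch).
     */
    static void checkCanAllocate(List<String> reportedBy, int minReplication)
            throws NotReplicatedYetException {
        if (reportedBy.size() < minReplication) {
            throw new NotReplicatedYetException(
                "previous block has " + reportedBy.size()
                + " replicas, need " + minReplication);
        }
    }

    public static void main(String[] args) {
        // Replication factor 3, but one datanode failed during the sync.
        List<String> survivors = Arrays.asList("dn1", "dn2");
        try {
            checkCanAllocate(survivors, 3);
            System.out.println("allocated");
        } catch (NotReplicatedYetException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With minReplication lowered to 2, the same call would succeed, which is why
the choice of minReplication matters while a block is still being written.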

Thanks,
Sangmin





>
>
> >I thought generationStamp should be checked when the NN process
> blockreports from DN,
>
> The generation stamp is used to compute the hashCode for a Block object.
>
> thanks,
> dhruba

Re: Question regarding HDFS Recovery

Posted by Dhruba Borthakur <dh...@gmail.com>.
> What if all datanodes in INodeFileUnderConstruction targets are dead ?

If all datanodes in a pipeline are dead, then that file cannot be recovered
at all. This is expected and most file-systems behave this way when the
underlying storage goes bad.

> I thought generationStamp should be checked when the NN process
> blockreports from DN,

The generation stamp is used to compute the hashCode for a Block object.

thanks,
dhruba


On Wed, May 20, 2009 at 11:58 AM, Sangmin Lee <sa...@gmail.com> wrote:

> Dhruba,
>
> Thanks for the response.
> What if all datanodes in INodeFileUnderConstruction.targets are dead ?
> I don't see any code to handle this case.
>
> One other thing I wonder is that when is the generationStamp used by the
> NN ?
> I thought generationStamp should be checked when the NN process block
> reports from DN, but I can only see it checks blocks length. Am I missing
> something here?
>
> Thanks,
> Sangmin

Re: Question regarding HDFS Recovery

Posted by Sangmin Lee <sa...@gmail.com>.
Dhruba,

Thanks for the response.
What if all datanodes in INodeFileUnderConstruction.targets are dead ?
I don't see any code to handle this case.

One other thing I wonder is when the generationStamp is used by the NN.
I thought generationStamp should be checked when the NN processes block
reports from DNs, but I can only see it checking the block length. Am I
missing something here?

Thanks,
Sangmin


On Wed, May 20, 2009 at 12:24 PM, Dhruba Borthakur <dh...@gmail.com> wrote:

> The NN has a timer for dead-clients. When the HARD_LIMIT (1 hour) expires,
> the NN extracts the primary datanode from the
> INodeFileUnderConstruction.targets and asks the primary datanode to recover
> the lease. At the end of the lease recovery, the primary datanode invokes
> NameNode.commitBlockSynchronisation method, and the lease recovery is
> complete.
>
> hope this helps,
> thanks,
> dhruba

Re: Question regarding HDFS Recovery

Posted by Dhruba Borthakur <dh...@gmail.com>.
The NN has a timer for dead clients. When the HARD_LIMIT (1 hour) expires,
the NN extracts the primary datanode from
INodeFileUnderConstruction.targets and asks that datanode to recover
the lease. At the end of lease recovery, the primary datanode invokes the
NameNode.commitBlockSynchronization method, and the lease recovery is
complete.
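
The flow described above can be sketched roughly as follows. This is a
simplified model, not the HDFS source: LeaseRecoverySketch,
FileUnderConstruction, hardLimitExpired and choosePrimary are hypothetical
names, and picking the first entry of targets as the primary is an assumption
made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model of hard-limit lease recovery; all names here are hypothetical. */
public class LeaseRecoverySketch {
    /** Hard limit after which the NN starts recovery itself (1 hour, per the text above). */
    static final long HARD_LIMIT_MS = 60L * 60L * 1000L;

    /** Stand-in for INodeFileUnderConstruction: tracks the last block's datanodes. */
    static class FileUnderConstruction {
        final List<String> targets;   // datanodes holding the last block
        final long lastRenewalMs;     // last time the client renewed its lease
        boolean closed = false;

        FileUnderConstruction(List<String> targets, long lastRenewalMs) {
            this.targets = new ArrayList<>(targets);
            this.lastRenewalMs = lastRenewalMs;
        }
    }

    /** True once the client has been silent for longer than the hard limit. */
    static boolean hardLimitExpired(FileUnderConstruction f, long nowMs) {
        return nowMs - f.lastRenewalMs > HARD_LIMIT_MS;
    }

    /** Returns the primary datanode, or null when there is no target to ask. */
    static String choosePrimary(FileUnderConstruction f) {
        // Assumption of this sketch: the first target acts as primary.
        return f.targets.isEmpty() ? null : f.targets.get(0);
    }

    /** Models the primary datanode reporting back at the end of recovery. */
    static void commitBlockSynchronization(FileUnderConstruction f, List<String> survivors) {
        f.targets.clear();
        f.targets.addAll(survivors);
        f.closed = true;              // lease recovery is complete
    }

    public static void main(String[] args) {
        FileUnderConstruction f =
            new FileUnderConstruction(Arrays.asList("dn1", "dn2", "dn3"), 0L);
        long now = HARD_LIMIT_MS + 1;
        if (hardLimitExpired(f, now)) {
            String primary = choosePrimary(f);
            System.out.println("asking " + primary + " to recover the lease");
            commitBlockSynchronization(f, Arrays.asList("dn1", "dn2"));
        }
        System.out.println("closed=" + f.closed + " targets=" + f.targets);
    }
}
```

Note that choosePrimary returns null when targets is empty; what should
happen to the lease in that case is not covered by this sketch.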

hope this helps,
thanks,
dhruba



On Wed, May 20, 2009 at 9:14 AM, Sangmin Lee <sa...@gmail.com> wrote:

> I am looking at 0.19.0(or maybe 0.19.1) and 0.20.0.
> In fact, I am still curious about the case (maybe too much extream case)
> where a client open a file, request a block and prematurely dies.
> Also all datanodes go dead.
> I don't see how the lease will be recovered or reaped in this case.
> Don't we need some mechanism that discards the block and removes the lease
> after several attempts for lease recovery ?
>
> Thanks,
> Sangmin

Re: Question regarding HDFS Recovery

Posted by Sangmin Lee <sa...@gmail.com>.
I am looking at 0.19.0 (or maybe 0.19.1) and 0.20.0.
In fact, I am still curious about the (perhaps too extreme) case where
a client opens a file, requests a block, and dies prematurely,
and all the datanodes also die.
I don't see how the lease will be recovered or reaped in this case.
Don't we need some mechanism that discards the block and removes the lease
after several attempts at lease recovery?

Thanks,
Sangmin

On Wed, May 20, 2009 at 10:40 AM, Edward J. Yoon <ed...@apache.org> wrote:

> Can I ask what version do you read? You looks reach so deeply into the
> architecture of a system...

Re: Question regarding HDFS Recovery

Posted by "Edward J. Yoon" <ed...@apache.org>.
Can I ask which version you are reading? You seem to be digging very deeply
into the architecture of the system...

On Thu, May 21, 2009 at 12:28 AM, Sangmin Lee <sa...@gmail.com> wrote:
> Okay.. I was going dumb by misreading some source code.
> Please ignore my question regarding this.
> Sorry about this.
>
> Sangmin



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Question regarding HDFS Recovery

Posted by Sangmin Lee <sa...@gmail.com>.
Okay.. I was being dumb and misread some source code.
Please ignore my question regarding this.
Sorry about this.

Sangmin

On Tue, May 19, 2009 at 11:59 PM, Sangmin Lee <sa...@gmail.com> wrote:

> Hi all,
>
> I have some question regarding the hdfs recovery mechanism.
>
> I see that INodeFileUnderConstruction has a "targets" field that stores
> list of datanodes which store its last block.
> However, I don't see them being used at all except that
> "internalReleaseLease" function uses the length of the datanode list.
> Is there any other use of the "target" fields rather than checking its
> length?
>
> Could anyone shed some light on this?
>
> Thanks,
> Sangmin
>