Posted to hdfs-dev@hadoop.apache.org by andreina j <an...@huawei.com> on 2015/05/11 13:05:11 UTC

Is implicit Retry for create() required on AlreadyBeingCreatedException in NonHA ?

Hi,

In a non-HA setup, if a client tries to create a file that already exists and is open, the NameNode used to throw AlreadyBeingCreatedException immediately. For the non-HA case, a retry policy was later added to retry 5 times, each time after 60 sec (SOFT_LEASE_TIMEOUT).

However, this retry was not working until HDFS-6478 <https://issues.apache.org/jira/browse/HDFS-6478> fixed it.

This led to a behavior change: the call now retries for up to 5 min before failing.
Because of this, downstream projects are facing issues. HBase reported it in HDFS-8270 <https://issues.apache.org/jira/browse/HDFS-8270>.
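
To make the 5 min figure concrete, here is a minimal sketch of a fixed-sleep retry loop with the numbers above (5 retries, 60 sec each). It is not the actual HDFS client code: the class, the exception type, and the method names are hypothetical stand-ins, and the sleep is injected so the example does not really block.

```java
import java.util.concurrent.Callable;
import java.util.function.LongConsumer;

// Hypothetical sketch; none of these names come from Hadoop itself.
final class CreateRetrySketch {
    // Stand-in for HDFS's AlreadyBeingCreatedException.
    static final class AlreadyBeingCreated extends Exception {
        AlreadyBeingCreated(String msg) { super(msg); }
    }

    static final int MAX_RETRIES = 5;                   // from the mail above
    static final long SOFT_LEASE_TIMEOUT_MS = 60_000L;  // 60 sec, from the mail above

    // Worst-case time spent sleeping before the failure reaches the caller.
    static long worstCaseWaitMillis(int maxRetries, long sleepMillis) {
        return maxRetries * sleepMillis;
    }

    // Retries op on AlreadyBeingCreated up to maxRetries times, calling
    // sleeper with the sleep duration between attempts.
    static <T> T callWithRetry(Callable<T> op, int maxRetries, long sleepMillis,
                               LongConsumer sleeper) throws Exception {
        int retries = 0;
        while (true) {
            try {
                return op.call();
            } catch (AlreadyBeingCreated e) {
                if (retries >= maxRetries) {
                    throw e; // retries exhausted, surface the original error
                }
                retries++;
                sleeper.accept(sleepMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        long[] slept = {0}; // accumulates simulated sleep time
        try {
            callWithRetry(() -> {
                throw new AlreadyBeingCreated("file is already open for write");
            }, MAX_RETRIES, SOFT_LEASE_TIMEOUT_MS, ms -> slept[0] += ms);
        } catch (AlreadyBeingCreated e) {
            System.out.println("failed after " + slept[0] / 1000 + " s of waiting");
        }
    }
}
```

In this sketch, a create() that keeps hitting the exception sleeps 5 x 60 sec = 300 sec before the caller sees the failure, which matches the delay described above.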

Now my question is: is a retry of up to 5 min (or a configurable retry time) on AlreadyBeingCreatedException really necessary in the non-HA case?
Is it correct to wait that long for a create operation to fail?

Note: In the HA case, there is no retry on AlreadyBeingCreatedException.

Please share your suggestions.

Thanks in advance
Andreina J

Re: Is implicit Retry for create() required on AlreadyBeingCreatedException in NonHA ?

Posted by Vinayakumar B <vi...@apache.org>.
Good find, andreina.

I am not aware of why AlreadyBeingCreatedException was considered
for retry before throwing the exception back.

But I find it strange that it is retried only in the non-HA case. And
retrying for up to 5 min is really unacceptable.

IMO, we should keep the behaviour in sync between the HA and non-HA
cases by removing this retry.

Any thoughts?

Regards,
Vinay
