Posted to common-user@hadoop.apache.org by Ken Huang <dn...@gmail.com> on 2016/02/26 02:49:50 UTC

libhdfs force close hdfsFile

Hi,

Does anyone know how to close an hdfsFile when the connection between the
HDFS client and the NameNode has been lost?

Thanks
Ken Huang

Re: libhdfs force close hdfsFile

Posted by Chris Nauroth <cn...@hortonworks.com>.
Hello Ken,

The closest thing to what you're requesting is in the Java API: the slightly dodgy, semi-private, we-hope-only-HBase-calls-it method DistributedFileSystem#recoverLease.  It can tell the NameNode to recover the lease (and ultimately close the file if necessary) for any specified path.  This method is not exposed through libhdfs, though, and just so it's clear, I wouldn't recommend using it even if it were.

When I hear questions like this, it's often because an application is writing to a file at a certain path and there is a desire for recoverability if the application terminates prematurely, such as in a server crash.  Users would like another process to be able to take over right away and start writing to the file again, but the NameNode won't allow this until after the old client's lease expires.  Is this the use case you had in mind?

If so, then a pattern that can work well is for the application to create and write to a uniquely named temporary file instead of the final destination path.  Then, after writing all data, the application renames the temporary file to the desired final destination.  Since leases are tracked on the file paths being written, the old client's lease on its temporary file won't block a new client from writing to a different temporary file.
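A minimal sketch of that write-to-temp-then-rename pattern using the libhdfs C API might look like the following.  The host/port, paths, and file contents are all hypothetical, error handling is minimal, and it assumes a reachable cluster and the hdfs.h header shipped with Hadoop:

```c
/* Sketch: write to a unique temporary path, then rename to the final
   destination.  Paths and connection details below are hypothetical. */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>   /* O_WRONLY, O_CREAT */
#include "hdfs.h"    /* libhdfs header shipped with Hadoop */

int main(void) {
    /* "default" picks up fs.defaultFS from the loaded Hadoop config. */
    hdfsFS fs = hdfsConnect("default", 0);
    if (!fs) { fprintf(stderr, "connect failed\n"); return 1; }

    const char *tmp = "/data/output/_tmp.part-0001";  /* hypothetical */
    const char *dst = "/data/output/part-0001";       /* hypothetical */

    /* Only this writer holds a lease, and only on the temporary path. */
    hdfsFile f = hdfsOpenFile(fs, tmp, O_WRONLY | O_CREAT, 0, 0, 0);
    if (!f) { fprintf(stderr, "open failed\n"); hdfsDisconnect(fs); return 1; }

    const char *buf = "record data\n";
    hdfsWrite(fs, f, buf, (tSize)strlen(buf));
    hdfsCloseFile(fs, f);  /* releases the lease on the temporary path */

    /* Publish the finished file at its final destination. */
    if (hdfsRename(fs, tmp, dst) != 0)
        fprintf(stderr, "rename failed\n");

    hdfsDisconnect(fs);
    return 0;
}
```

If this writer dies before the rename, its lingering lease covers only the temporary path, so a replacement writer can start immediately on a fresh temporary name without waiting for lease expiration.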

--Chris Nauroth

From: Ken Huang <dn...@gmail.com>
Date: Thursday, February 25, 2016 at 5:49 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: libhdfs force close hdfsFile

Hi,

Does anyone know how to close an hdfsFile when the connection between the HDFS client and the NameNode has been lost?

Thanks
Ken Huang
