Posted to common-user@hadoop.apache.org by Taeho Kang <tk...@gmail.com> on 2008/03/19 11:12:59 UTC

Trash option in hadoop-site.xml configuration.

Hello,

I have two machines that act as clients to HDFS.

Node #1 has Trash option enabled (e.g. fs.trash.interval set to 60)
and Node #2 has Trash option off (e.g. fs.trash.interval set to 0)

When I delete a file from Node #2, the file gets deleted right away,
while the file gets moved to trash when I do the same from Node #1.

This is a bit of a surprise to me,
because I thought the Trash option that I set in the master node's config file
applies to everyone who connects to / uses the HDFS.

Was there any reason why the Trash option was implemented in this way?

Thank you in advance,

/Taeho
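
For concreteness, the two client settings described in this message can be written out as hadoop-site.xml text and read back. This is only an illustrative sketch: the XML strings are hypothetical stand-ins for each node's file, trash_interval is a made-up helper (not a Hadoop API), and treating a missing property as 0 is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical hadoop-site.xml contents for Node #1 (trash on, 60 minutes).
NODE1_SITE = """<configuration>
  <property>
    <name>fs.trash.interval</name>
    <value>60</value>
  </property>
</configuration>"""

# Node #2 is identical except that the interval is 0 (trash off).
NODE2_SITE = NODE1_SITE.replace("<value>60</value>", "<value>0</value>")


def trash_interval(site_xml: str) -> int:
    """Return fs.trash.interval from the config text (assumed 0 if unset)."""
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == "fs.trash.interval":
            return int(prop.findtext("value"))
    return 0


for node, site in (("Node #1", NODE1_SITE), ("Node #2", NODE2_SITE)):
    print(node, "fs.trash.interval =", trash_interval(site))
```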

RE: Trash option in hadoop-site.xml configuration.

Posted by dhruba Borthakur <dh...@yahoo-inc.com>.
Actually, the specific value of fs.trash.interval has no significance on the client; the client only checks whether it is non-zero. If it is non-zero, the client does a rename instead of a delete. The value of fs.trash.interval is used only by the namenode, which periodically removes files from Trash at the interval given by its own fs.trash.interval setting.

hope this helps,
dhruba
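
The behaviour described in this reply can be modelled with a small toy simulation. It is a sketch under stated assumptions, not Hadoop code: ToyNamenode and client_delete are hypothetical names, time is counted in whole minutes, and the namenode's cleanup is simplified to a plain age cutoff, which may not match the real emptier's exact timing.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNamenode:
    """Toy stand-in for the namenode: live files plus a trash area."""
    trash_interval: int  # minutes; the namenode's own fs.trash.interval
    files: set = field(default_factory=set)
    trash: dict = field(default_factory=dict)  # path -> minute it was trashed

    def expunge(self, now: int) -> None:
        """Remove trashed files at least trash_interval minutes old (simplified)."""
        if self.trash_interval <= 0:
            return  # trash cleanup disabled on the namenode
        expired = [p for p, t in self.trash.items()
                   if now - t >= self.trash_interval]
        for p in expired:
            del self.trash[p]


def client_delete(nn: ToyNamenode, path: str, client_interval: int, now: int) -> str:
    """Client-side choice: only zero vs. non-zero matters, not the value."""
    nn.files.discard(path)
    if client_interval != 0:
        nn.trash[path] = now  # "rename to Trash" instead of a delete
        return "moved to trash"
    return "deleted"


# Clients with intervals 60, 120 and 0 talk to a namenode configured with 30.
nn = ToyNamenode(trash_interval=30, files={"/a", "/b", "/c"})
print(client_delete(nn, "/a", client_interval=60, now=0))
print(client_delete(nn, "/b", client_interval=120, now=0))
print(client_delete(nn, "/c", client_interval=0, now=0))
nn.expunge(now=30)
print(sorted(nn.trash))
```

In this model the clients with intervals of 60 and 120 behave identically: both files land in the trash, and both are removed on the namenode's schedule, since the namenode never sees the clients' values.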


-----Original Message-----
From: Taeho Kang [mailto:tkang1@gmail.com]
Sent: Thu 3/20/2008 1:53 AM
To: core-user@hadoop.apache.org
Subject: Re: Trash option in hadoop-site.xml configuration.
 
Thank you for the clarification.

Here is another question.
If two different clients ordered a "move to trash" with different intervals
(e.g. client #1 with fs.trash.interval = 60; client #2 with
fs.trash.interval = 120),
what would happen?

Does the namenode keep track of all this info?

/Taeho




Re: Trash option in hadoop-site.xml configuration.

Posted by Taeho Kang <tk...@gmail.com>.
Thank you for the clarification.

Here is another question.
If two different clients ordered a "move to trash" with different intervals
(e.g. client #1 with fs.trash.interval = 60; client #2 with
fs.trash.interval = 120),
what would happen?

Does the namenode keep track of all this info?

/Taeho


On 3/20/08, dhruba Borthakur <dh...@yahoo-inc.com> wrote:
>
> The "trash" feature is a client side option and depends on the client
> configuration file. If the client's configuration specifies that "Trash"
> is enabled, then the HDFS client invokes a "rename to Trash" instead of
> a "delete". Now, if "Trash" is enabled on the Namenode, then the
> Namenode periodically removes contents from the Trash directory.
>
> This design might be confusing to some users. But it provides the
> flexibility that different clients in the cluster can have either Trash
> enabled or disabled.
>
> Thanks,
> dhruba

RE: Trash option in hadoop-site.xml configuration.

Posted by dhruba Borthakur <dh...@yahoo-inc.com>.
The "trash" feature is a client side option and depends on the client
configuration file. If the client's configuration specifies that "Trash"
is enabled, then the HDFS client invokes a "rename to Trash" instead of
a "delete". Now, if "Trash" is enabled on the Namenode, then the
Namenode periodically removes contents from the Trash directory.

This design might be confusing to some users. But it provides the
flexibility that different clients in the cluster can have either Trash
enabled or disabled.

Thanks,
dhruba

-----Original Message-----
From: Taeho Kang [mailto:tkang1@gmail.com] 
Sent: Wednesday, March 19, 2008 3:13 AM
To: hadoop-user@lucene.apache.org; core-user@hadoop.apache.org;
tkang1@gmail.com
Subject: Trash option in hadoop-site.xml configuration.

Hello,

I have two machines that act as clients to HDFS.

Node #1 has Trash option enabled (e.g. fs.trash.interval set to 60)
and Node #2 has Trash option off (e.g. fs.trash.interval set to 0)

When I delete a file from Node #2, the file gets deleted right away,
while the file gets moved to trash when I do the same from Node #1.

This is a bit of a surprise to me,
because I thought the Trash option that I set in the master node's config file
applies to everyone who connects to / uses the HDFS.

Was there any reason why the Trash option was implemented in this way?

Thank you in advance,

/Taeho