Posted to common-user@hadoop.apache.org by Qin Gao <qi...@cs.cmu.edu> on 2009/06/22 21:15:46 UTC

Making sure the tmp directory is cleaned?

Hi All,

Do you know if the tmp directory for each map/reduce task will be deleted
automatically after the task finishes, or do I have to delete it myself?

I mean the tmp directory that is automatically created in the current directory.

Thanks a lot
--Q

Re: Making sure the tmp directory is cleaned?

Posted by Qin Gao <qi...@cs.cmu.edu>.
Thanks. I will try to keep a log of the files and clean them out myself.
--Q
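
A minimal sketch of the "keep a log of the files" idea, assuming the task
process can write a manifest file somewhere that outlives it. The names
make_tracked_tmp, clean_from_manifest and the manifest layout are
hypothetical and are not part of Hadoop. Each path is logged before the
directory is created, so a task killed in between leaves at worst a log
entry for a directory that does not exist.

```python
import os
import shutil
import tempfile
import uuid


def make_tracked_tmp(manifest_path, parent_dir):
    """Log a new temp directory path in the manifest, then create it."""
    tmp_dir = os.path.join(parent_dir, "task_tmp_" + uuid.uuid4().hex)
    # Record the path first and force it to disk, so the entry survives
    # even if the task is killed right after this point.
    with open(manifest_path, "a") as manifest:
        manifest.write(tmp_dir + "\n")
        manifest.flush()
        os.fsync(manifest.fileno())
    os.mkdir(tmp_dir)
    return tmp_dir


def clean_from_manifest(manifest_path):
    """Delete every logged directory that still exists; return those paths."""
    removed = []
    if not os.path.exists(manifest_path):
        return removed
    with open(manifest_path) as manifest:
        paths = [line.strip() for line in manifest if line.strip()]
    for path in paths:
        if os.path.isdir(path):
            shutil.rmtree(path, ignore_errors=True)
            removed.append(path)
    # Empty the manifest once its entries have been processed.
    open(manifest_path, "w").close()
    return removed
```

A separate job (or the next run on the same node) would call
clean_from_manifest, since a killed task cannot be relied on to run its own
cleanup code.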


On Mon, Jun 22, 2009 at 4:34 PM, Pankil Doshi <fo...@gmail.com> wrote:

> No. If your job is killed or fails, the temp directory won't be cleaned up,
> and in that case you will have to clean it up carefully on your own. If you
> don't clean it up yourself, it will eat up your disk space.
>
> Pankil

Re: Making sure the tmp directory is cleaned?

Posted by Pankil Doshi <fo...@gmail.com>.
No. If your job is killed or fails, the temp directory won't be cleaned up,
and in that case you will have to clean it up carefully on your own. If you
don't clean it up yourself, it will eat up your disk space.

Pankil

On Mon, Jun 22, 2009 at 4:24 PM, Qin Gao <qi...@cs.cmu.edu> wrote:

> Thanks!
>
> But what if the job is killed or fails? Does Hadoop try to clean up then? We
> are considering bad situations: if a job gets killed, will the tmp dirs sit
> on local disks forever and eat up all the disk space?
>
> I guess this should be considered in the distributed cache, but those files
> are read-only, and our program will generate new temporary files.
>
>
> --Q

Re: Making sure the tmp directory is cleaned?

Posted by Qin Gao <qi...@cs.cmu.edu>.
Thanks!

But what if the job is killed or fails? Does Hadoop try to clean up then? We
are considering bad situations: if a job gets killed, will the tmp dirs sit
on local disks forever and eat up all the disk space?

I guess this should be considered in the distributed cache, but those files
are read-only, and our program will generate new temporary files.


--Q


On Mon, Jun 22, 2009 at 4:19 PM, Pankil Doshi <fo...@gmail.com> wrote:

> Yes, if your job completes successfully. The directory is probably removed
> after both the map and reduce tasks have completed.
>
> Pankil

Re: Making sure the tmp directory is cleaned?

Posted by Pankil Doshi <fo...@gmail.com>.
Yes, if your job completes successfully. The directory is probably removed
after both the map and reduce tasks have completed.

Pankil

On Mon, Jun 22, 2009 at 3:15 PM, Qin Gao <qi...@cs.cmu.edu> wrote:

> Hi All,
>
> Do you know if the tmp directory for each map/reduce task will be deleted
> automatically after the task finishes, or do I have to delete it myself?
>
> I mean the tmp directory that is automatically created in the current
> directory.
>
> Thanks a lot
> --Q
>

Re: Making sure the tmp directory is cleaned?

Posted by Allen Wittenauer <aw...@yahoo-inc.com>.


On 6/22/09 12:15 PM, "Qin Gao" <qi...@cs.cmu.edu> wrote:
> Do you know if the tmp directory for each map/reduce task will be deleted
> automatically after the task finishes, or do I have to delete it myself?
>
> I mean the tmp directory that is automatically created in the current
> directory.

Past experience says that users will find writable space on nodes and fill
it, regardless of what Hadoop may do to try and keep it clean.  It is a good
idea to just wipe those spaces clean during hadoop upgrades and other
planned downtimes.
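
As a rough sketch of wiping such a space during planned downtime, the
function below removes every top-level entry in a scratch directory that has
not been modified within a given age. The function name sweep_scratch and
its parameters are assumptions made for illustration, and so is the choice
of an age cutoff; nothing here is a Hadoop facility. It should only be
pointed at a directory that holds nothing but disposable task output.

```python
import os
import shutil
import time


def sweep_scratch(scratch_dir, max_age_seconds, now=None):
    """Remove top-level entries not modified within max_age_seconds.

    Returns the sorted names of the entries that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for name in os.listdir(scratch_dir):
        path = os.path.join(scratch_dir, name)
        # lstat judges a symlink by its own age, not its target's.
        age = now - os.lstat(path).st_mtime
        if age <= max_age_seconds:
            continue
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path, ignore_errors=True)
        else:
            os.remove(path)
        removed.append(name)
    return sorted(removed)
```

Passing max_age_seconds=0 while the cluster is down removes everything in
the directory; a larger value gives a more cautious sweep that spares
recently written files.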