Posted to common-user@hadoop.apache.org by zhang jianfeng <zj...@gmail.com> on 2009/08/25 02:49:04 UTC

Does hadoop delete the intermediate data

Hi all,

I found that my cluster's disk usage keeps increasing over time even though
I have not uploaded any new data, and there are a lot of files under the /tmp folder.

So I guess Hadoop does not delete the intermediate data (the output of the mappers).

Am I right?


Thank you.

Jeff zhang

Re: Does hadoop delete the intermediate data

Posted by Jim Twensky <ji...@gmail.com>.
Hi Jeff,

The problem may also be caused by large log files if you run a lot of
jobs on the cluster. Check your Hadoop log directory and see how big
it is. You can decrease the maximum size of a log file in one of the
Hadoop configuration files under conf (logging is set up in
conf/log4j.properties).
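
A rough way to see how big the log directory is, as a minimal Python
sketch. dir_size_bytes is a made-up helper name, not part of Hadoop, and
the directory you pass in is up to you (often the logs folder under the
Hadoop install, but that is an assumption about your setup).

```python
import os

def dir_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Do not follow symlinks; they could point outside the directory.
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file was rotated or deleted while we were scanning.
                pass
    return total
```

For example, dir_size_bytes("/path/to/hadoop/logs") / (1024.0 * 1024.0)
gives the size in MB; the path here is only a placeholder.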

Jim

On Mon, Aug 31, 2009 at 2:19 AM, Chandraprakash
Bhagtani<cp...@gmail.com> wrote:
> Hadoop does delete the intermediate data: when a job completes, the
> JobTracker signals the TaskTrackers to delete that job's intermediate
> data.
>
> The problem in your case might be that some of your running jobs were
> not killed gracefully, or that the JobTracker failed for some reason.
>
> --
> Thanks & Regards,
> Chandra Prakash Bhagtani,

Re: Does hadoop delete the intermediate data

Posted by Chandraprakash Bhagtani <cp...@gmail.com>.
Hadoop does delete the intermediate data: when a job completes, the
JobTracker signals the TaskTrackers to delete that job's intermediate
data.

The problem in your case might be that some of your running jobs were
not killed gracefully, or that the JobTracker failed for some reason.
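
If you suspect leftovers from jobs that were not cleaned up, a small
Python sketch like this can list files that have not been modified for
a while. stale_files is a hypothetical helper, not a Hadoop tool, and
which directory to scan is an assumption about your configuration: as
far as I know, mapred.local.dir defaults to ${hadoop.tmp.dir}/mapred/local
and hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}. It only lists
files and deletes nothing.

```python
import os
import time

def stale_files(root, max_age_days, now=None):
    """Return sorted paths under root last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are judged by their own mtime.
                if os.lstat(path).st_mtime < cutoff:
                    stale.append(path)
            except OSError:
                # The file disappeared while we were scanning.
                pass
    return sorted(stale)
```

Check that no jobs are running before removing anything it reports,
since a long-running job may still need old files.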

-- 
Thanks & Regards,
Chandra Prakash Bhagtani,
