Posted to user@pig.apache.org by Bill Graham <bi...@gmail.com> on 2010/01/26 19:31:09 UTC

How to cleanup old Job jars

Hi,

Every time I run a Pig script I get a number of Job jars left in the /tmp
directory of my client, 1 per MR job it seems. The file names look like
/tmp/Job875278192.jar.

I have scripts that run every five minutes and fire 10 MR jobs each, so the
amount of space used by these jars grows rapidly. Is there a way to tell Pig
to clean up after itself and remove these jars, or do I need to just write
my own clean-up script?

thanks,
Bill

Re: How to cleanup old Job jars

Posted by Bill Graham <bi...@gmail.com>.
Thanks Rekha.

These issues seem to be related to cleaning up Pig/Hadoop files upon shutdown
of the VM. I just checked, and when I shut down the VM all files are cleaned
up as expected.

My issue is that I have Pig jobs that run in an app server and are
triggered by Quartz. It might be days or weeks between app server bounces.
If anyone knows a way to configure or kick off some sort of cleanup process
without shutting down the VM, please let me know.

Otherwise, I need to deploy a hacky crontab script like this:

find /tmp/Job[0-9]*.jar -type f -mmin +50 -exec rm {} \;
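
A slightly more defensive variant of the command above, as a minimal sketch.
It assumes a find that supports -maxdepth and -mmin (GNU and BSD find do);
the function name cleanup_job_jars and its two arguments are hypothetical,
chosen only for illustration. Quoting the pattern lets find do the matching
instead of the shell, so the command should not error out when no jars exist
or hit the argument-length limit when there are many.

```shell
#!/bin/sh
# Hypothetical helper: delete Pig job jars older than a given age.
#   $1  directory to scan (defaults to /tmp)
#   $2  minimum age in minutes before a jar is removed (defaults to 50)
cleanup_job_jars() {
    dir="${1:-/tmp}"
    max_age_min="${2:-50}"

    # -maxdepth 1 keeps find from descending into subdirectories.
    # The quoted -name pattern is expanded by find, not by the shell.
    # "{} +" batches file names into as few rm invocations as possible.
    find "$dir" -maxdepth 1 -type f -name 'Job[0-9]*.jar' \
        -mmin +"$max_age_min" -exec rm -f {} +
}
```

A script run from cron could source this and call, for example,
cleanup_job_jars /tmp 50. The age threshold should stay longer than the
longest-running job so that jars still in use are left alone.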


On Tue, Jan 26, 2010 at 8:40 PM, Rekha Joshi <re...@yahoo-inc.com> wrote:

>  You might like to check PIG-116 and HADOOP-5175. I also think there is a
> JobCleanup task that takes care of cleaning, AFAIK, unless it's a failed
> job.
> Cheers,
> /R

Re: How to cleanup old Job jars

Posted by Rekha Joshi <re...@yahoo-inc.com>.
You might like to check PIG-116 and HADOOP-5175. I also think there is a JobCleanup task that takes care of cleaning, AFAIK, unless it's a failed job.
Cheers,
/R

