Posted to common-user@hadoop.apache.org by Deepika Khera <de...@collarity.com> on 2009/08/09 23:37:28 UTC

OutputCommitter for rollbacks?

Hi,

I am trying to use OutputCommitter.cleanupJob() to commit or roll back
my job. The cleanupJob() method is called whether the job was
successful, killed or failed.

In cleanupJob(jobContext) I need to know the status of the job so far,
especially whether it failed or was killed.

The issue is that whether the job succeeded, failed or was killed, the
job status seen in the committer is "running" (which makes sense, but is
not what I need). It seems that I can only know the job status once the
entire job (including the cleanup task) has finished.

I see an open JIRA that is related, but until it is resolved, is there
any other way to achieve this?

http://issues.apache.org/jira/browse/HADOOP-6005

So what I need is to decide in cleanupJob() whether I should commit
(successful) or roll back (killed/failed).

I would appreciate any help on this.

Thanks,
Deepika


Re: OutputCommitter for rollbacks?

Posted by Amareshwari Sriramadasu <am...@yahoo-inc.com>.
Deepika Khera wrote:
> Thanks Amareshwari for your response.
>
> It seems like a good idea to use the map progress and reduce progress. My
> only concern is that in the web interface (jobdetails.jsp) we see some
> of our jobs show 100% map and 100% reduce while the reduce still seems to
> be running (not sure, but maybe it's just a UI thing).
>
> But I guess if the job has reached its cleanup, we should be able to
> trust these numbers (map progress and reduce progress) and make the call
> on commit or rollback?
>
Yes. You can assume that when cleanup is running and reduce progress is
100%, the job is successful.
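
The rule above can be sketched as a small decision helper. This is only an
illustration, not code from Hadoop: CommitDecision, Action, JobProgress and
fixed() are hypothetical names, and JobProgress stands in for whatever
supplies the progress values. In Hadoop the RunningJob interface exposes
mapProgress() and reduceProgress(); obtaining a RunningJob inside
cleanupJob() (for example through JobClient) is assumed and not shown here.

```java
// Sketch only: models the commit-or-rollback decision without Hadoop on the classpath.
final class CommitDecision {

    /** What the committer should do once cleanup is running. */
    enum Action { COMMIT, ROLLBACK }

    /**
     * Hypothetical stand-in for the progress source. The method names mirror
     * RunningJob.mapProgress() and RunningJob.reduceProgress(), which report
     * values between 0.0 and 1.0.
     */
    interface JobProgress {
        float mapProgress();
        float reduceProgress();
    }

    /** Commit only if both phases report full progress; otherwise roll back. */
    static Action decide(JobProgress progress) {
        boolean mapsDone = progress.mapProgress() >= 1.0f;
        boolean reducesDone = progress.reduceProgress() >= 1.0f;
        return (mapsDone && reducesDone) ? Action.COMMIT : Action.ROLLBACK;
    }

    /** Helper for the demo: a progress source with fixed values. */
    static JobProgress fixed(final float map, final float reduce) {
        return new JobProgress() {
            @Override public float mapProgress() { return map; }
            @Override public float reduceProgress() { return reduce; }
        };
    }

    public static void main(String[] args) {
        // A job that finished both phases, then one whose reduces did not finish.
        System.out.println(decide(fixed(1.0f, 1.0f)));
        System.out.println(decide(fixed(1.0f, 0.4f)));
    }
}
```

In a real committer, decide() would be called from cleanupJob() and the
result used to pick the commit or rollback path. How a job with zero
reducers reports reduce progress is not covered in this thread and would
need to be checked before relying on this test.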


RE: OutputCommitter for rollbacks?

Posted by Deepika Khera <de...@collarity.com>.
Thanks Amareshwari for your response.

It seems like a good idea to use the map progress and reduce progress. My
only concern is that in the web interface (jobdetails.jsp) we see some
of our jobs show 100% map and 100% reduce while the reduce still seems to
be running (not sure, but maybe it's just a UI thing).

But I guess if the job has reached its cleanup, we should be able to
trust these numbers (map progress and reduce progress) and make the call
on commit or rollback?

Thanks again,
Deepika



Re: OutputCommitter for rollbacks?

Posted by Amareshwari Sriramadasu <am...@yahoo-inc.com>.
Hi Deepika,

You can use the fact that map progress and reduce progress are 1.0 for
succeeded jobs and < 1.0 for failed or killed jobs.
Hope this helps.

Thanks
Amareshwari
