Posted to dev@oozie.apache.org by "Mona Chitnis (JIRA)" <ji...@apache.org> on 2014/07/02 21:50:25 UTC

[jira] [Commented] (OOZIE-1913) Devise a way to turn off SLA alerts when bundle/coordinator suspended

    [ https://issues.apache.org/jira/browse/OOZIE-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050625#comment-14050625 ] 

Mona Chitnis commented on OOZIE-1913:
-------------------------------------

Some discussion points:

h5. Approach 1:
Change SLA behavior for all jobs on suspend, i.e. do not track SLA for suspended jobs. However, SLA tracking of suspended jobs was originally put in place because users need to be notified about their job SLAs when the suspension is caused by the system (Oozie server restart or transient errors from the Hadoop cluster). So making this change across all suspended jobs would not be ideal.

h5. Approach 2:
Add a command-line option like {{-ignoresla}} to the suspend command, which flags the job accordingly in the in-memory map of the SLA calculator. This entails two sub-approaches:

h6. 2A]
On seeing {{-ignoresla}}, set the eventProcessed byte of the SLA entry to {{1000}} (binary, i.e. 8) so that the job is no longer tracked for SLA. The resume command will then also need an option like {{-resumesla}} to add the job back into the SLA map for tracking, along with further options for the revised expected end time and expected duration of the job.
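
To make 2A concrete, here is a rough sketch of the suspend/resume flow, not a proposal for the actual patch. All class, field, and method names ({{SlaSuspendSketch}}, {{SlaEntry}}, {{stored}}, {{tracked}}) are hypothetical stand-ins rather than Oozie's real classes, and resetting eventProcessed to 0 on resume is an assumption made only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of approach 2A; these are not Oozie's actual classes. */
final class SlaSuspendSketch {
    /** From the discussion: binary 1000 (decimal 8) marks an entry as no longer tracked. */
    static final byte EVENT_PROCESSED_DONE = 0b1000;

    /** Minimal SLA entry; field names are illustrative. */
    static final class SlaEntry {
        byte eventProcessed;
        long expectedEndMillis;
        long expectedDurationMillis;

        SlaEntry(long expectedEndMillis, long expectedDurationMillis) {
            this.expectedEndMillis = expectedEndMillis;
            this.expectedDurationMillis = expectedDurationMillis;
        }
    }

    // Stand-in for the persisted SLA entries.
    private final Map<String, SlaEntry> stored = new HashMap<>();
    // Stand-in for the SLA calculator's in-memory map of tracked jobs.
    private final Map<String, SlaEntry> tracked = new HashMap<>();

    void register(String jobId, SlaEntry entry) {
        stored.put(jobId, entry);
        tracked.put(jobId, entry);
    }

    /** Suspend; applies the proposed -ignoresla behavior only when ignoreSla is true. */
    void suspend(String jobId, boolean ignoreSla) {
        SlaEntry entry = stored.get(jobId);
        if (entry != null && ignoreSla) {
            entry.eventProcessed = EVENT_PROCESSED_DONE;
            tracked.remove(jobId);
        }
    }

    /** Resume; applies the proposed -resumesla behavior only when resumeSla is true. */
    void resume(String jobId, boolean resumeSla,
                long newExpectedEndMillis, long newExpectedDurationMillis) {
        SlaEntry entry = stored.get(jobId);
        if (entry != null && resumeSla) {
            entry.eventProcessed = 0; // assumption: event processing starts over
            entry.expectedEndMillis = newExpectedEndMillis;
            entry.expectedDurationMillis = newExpectedDurationMillis;
            tracked.put(jobId, entry);
        }
    }

    boolean isTracked(String jobId) {
        return tracked.containsKey(jobId);
    }

    SlaEntry get(String jobId) {
        return stored.get(jobId);
    }
}
```

A plain suspend without the option leaves the entry in the map, so system-caused suspensions would keep alerting as they do today.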

h6. 2B]
If we don't wish to change the eventProcessed byte, so that we don't have to recalculate it, we can instead add a flag to the job indicating that SLA should be ignored for it until the flag is unset. However, this requires adding a column to the Sla_Summary table schema so that the information is retained across Oozie server restarts and in HA mode.
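
For comparison, 2B could look roughly like the following sketch. Again, the names ({{SlaIgnoreFlagSketch}}, {{SlaRow}}, {{ignoreSla}}) are hypothetical, and the map merely stands in for rows of the summary table with the additional column; the point is only that eventProcessed is left untouched while flagged jobs are skipped during evaluation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of approach 2B; these are not Oozie's actual classes. */
final class SlaIgnoreFlagSketch {
    /** Stand-in for one summary row, including the hypothetical new column. */
    static final class SlaRow {
        final byte eventProcessed; // never modified by suspend/resume in this approach
        boolean ignoreSla;         // hypothetical new column

        SlaRow(byte eventProcessed) {
            this.eventProcessed = eventProcessed;
        }
    }

    // Keeps insertion order so the evaluation order is predictable.
    private final Map<String, SlaRow> rows = new LinkedHashMap<>();

    void add(String jobId, byte eventProcessed) {
        rows.put(jobId, new SlaRow(eventProcessed));
    }

    /** Sets or clears the ignore flag; unknown job ids are silently skipped. */
    void setIgnoreSla(String jobId, boolean ignore) {
        SlaRow row = rows.get(jobId);
        if (row != null) {
            row.ignoreSla = ignore;
        }
    }

    /** Returns the job ids the SLA calculator would still evaluate. */
    List<String> jobsToEvaluate() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, SlaRow> e : rows.entrySet()) {
            if (!e.getValue().ignoreSla) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    byte eventProcessedOf(String jobId) {
        return rows.get(jobId).eventProcessed;
    }
}
```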

2A seems to be preferable to me. Thoughts?


> Devise a way to turn off SLA alerts when bundle/coordinator suspended
> ---------------------------------------------------------------------
>
>                 Key: OOZIE-1913
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1913
>             Project: Oozie
>          Issue Type: Improvement
>    Affects Versions: trunk
>            Reporter: Mona Chitnis
>            Assignee: Mona Chitnis
>             Fix For: trunk
>
>
> From user:
> Need to turn off the SLA miss alerts in jobs when the bundle is suspended for
> grid upgrades and similar work so that when it's resumed we aren't flooded with a bunch of alerts.


