Posted to dev@oozie.apache.org by "Mona Chitnis (JIRA)" <ji...@apache.org> on 2014/07/02 00:15:25 UTC

[jira] [Created] (OOZIE-1911) SLA calculation in HA mode does wrong bit comparison for 'start' and 'duration'

Mona Chitnis created OOZIE-1911:
-----------------------------------

             Summary: SLA calculation in HA mode does wrong bit comparison for 'start' and 'duration'
                 Key: OOZIE-1911
                 URL: https://issues.apache.org/jira/browse/OOZIE-1911
             Project: Oozie
          Issue Type: Bug
    Affects Versions: trunk
            Reporter: Mona Chitnis
            Assignee: Mona Chitnis
             Fix For: trunk


In chronological order:

Server 1:
The job's SLA eventProcessed is set to 0101 => 'start' and 'end' SLA processed.

Server 2:
Receives the above job's status event and processes the remaining 'duration' SLA. eventProcessed is now 0111, but it is then set to 1000 by:
{code}
SLACalculatorMemory.addJobStatus() : 762
if (slaCalc.getEventProcessed() == 7) {
    slaInfo.setEventProcessed(8);
    slaMap.remove(jobId);
}
{code}

Back to Server 1: (doing periodic SLA checks)
{code}
SLACalculatorMemory.updateJobSla() : 483
if ((eventProc & 1) == 0) { // first bit (start-processed) unset
    if (reg.getExpectedStart() != null) {
        if (reg.getExpectedStart().getTime() + jobEventLatency < System.currentTimeMillis()) {
            // goes ahead and enqueues another START_MISS event and DURATION_MET event
{code}

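Conclusion: once eventProcessed is set to 8 (1000), the two lowest bits are unset again, so Server 1's checks treat 'start' and 'duration' as not yet processed. The checks on the least significant bit ('start') and the bit next to it ('duration') need to be fixed to avoid duplicate events.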
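
To illustrate the sequence above, here is a minimal standalone sketch. It is not the Oozie code and not the proposed patch: the class SlaBitsDemo, its constants, and the "guarded" methods are hypothetical, and the bit layout (bit 0 = start, bit 1 = duration, bit 2 = end) is inferred from the values quoted in this report.

```java
public class SlaBitsDemo {
    // Assumed bit layout, inferred from the report (0101 = start and end processed).
    static final int START = 1;     // 0001
    static final int DURATION = 2;  // 0010
    static final int END = 4;       // 0100
    static final int ALL_DONE = 8;  // 1000, the value written once 0111 is reached

    // Mirrors the quoted check in updateJobSla(): looks only at bit 0.
    static boolean startPendingNaive(int eventProc) {
        return (eventProc & START) == 0;
    }

    // Same style of check for the 'duration' bit.
    static boolean durationPendingNaive(int eventProc) {
        return (eventProc & DURATION) == 0;
    }

    // Hypothetical guard: treat the "all done" marker as fully processed.
    static boolean startPendingGuarded(int eventProc) {
        return eventProc != ALL_DONE && (eventProc & START) == 0;
    }

    static boolean durationPendingGuarded(int eventProc) {
        return eventProc != ALL_DONE && (eventProc & DURATION) == 0;
    }

    public static void main(String[] args) {
        int eventProc = START | END;      // Server 1: 0101
        eventProc |= DURATION;            // Server 2: 0111
        if (eventProc == 7) {
            eventProc = ALL_DONE;         // Server 2: bumped to 1000
        }
        // Back on Server 1: the naive checks see bits 0 and 1 unset.
        System.out.println("naive start pending:      " + startPendingNaive(eventProc));
        System.out.println("naive duration pending:   " + durationPendingNaive(eventProc));
        System.out.println("guarded start pending:    " + startPendingGuarded(eventProc));
        System.out.println("guarded duration pending: " + durationPendingGuarded(eventProc));
    }
}
```

With eventProcessed at 8, the naive checks report both 'start' and 'duration' as pending, which corresponds to the duplicate START_MISS and DURATION_MET events described above; the guarded variants do not. The actual fix in SLACalculatorMemory may take a different form.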



--
This message was sent by Atlassian JIRA
(v6.2#6252)