Posted to dev@oozie.apache.org by "Mona Chitnis (JIRA)" <ji...@apache.org> on 2014/08/28 23:43:08 UTC
[jira] [Created] (OOZIE-1984) SLACalculator in HA mode performs
duplicate operations on records with completed jobs
Mona Chitnis created OOZIE-1984:
-----------------------------------
Summary: SLACalculator in HA mode performs duplicate operations on records with completed jobs
Key: OOZIE-1984
URL: https://issues.apache.org/jira/browse/OOZIE-1984
Project: Oozie
Issue Type: Bug
Affects Versions: trunk
Reporter: Mona Chitnis
Fix For: trunk, 4.1.0
Scenario:
The periodic SLA run has already processed the start, duration and end events for a job's SLA entry. The job notification for that job arrives after this and triggers the SLA listener.
Buggy part:
{code}
// SLACalculatorMemory.java
    else if (Services.get().get(JobsConcurrencyService.class).isHighlyAvailableMode()) {
        // jobid might not exist in slaMap in HA Setting
        SLARegistrationBean slaRegBean = SLARegistrationQueryExecutor.getInstance().get(
                SLARegQuery.GET_SLA_REG_ALL, jobId);
        if (slaRegBean != null) { // filter out jobs picked by SLA job event listener
                                  // but not actually configured for SLA
            SLASummaryBean slaSummaryBean = SLASummaryQueryExecutor.getInstance().get(
                    SLASummaryQuery.GET_SLA_SUMMARY, jobId);
            slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
            if (slaCalc.getEventProcessed() < 7) {
                slaMap.put(jobId, slaCalc);
            }
        }
    }
}
if (slaCalc != null) {
    ..
    Object eventProcObj = ((SLASummaryQueryExecutor) SLASummaryQueryExecutor.getInstance())
            .getSingleValue(SLASummaryQuery.GET_SLA_SUMMARY_EVENTPROCESSED, jobId);
    byte eventProc = ((Byte) eventProcObj).byteValue();
    ..
    processJobEndSuccessSLA(slaCalc, startTime, endTime);
{code}
The method processJobEndSuccessSLA then checks the second least-significant bit of eventProc and sends the duration event _again_. So the bug here is two-fold:
* the function is invoked even when all events have already been processed
* eventProc is 8 (binary 1000), so the second LSB is unset and the duration event is treated as not yet processed.
Fix: do not invoke the function when eventProc = 8 (binary 1000).
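To make the bit arithmetic concrete, here is a minimal standalone sketch. It is not the Oozie code or the eventual patch: the class SlaEventBits, its constants and both method names are hypothetical, and the bit layout is only what the description above implies (the duration flag in the second LSB, 1000 once everything is processed).

```java
// Illustrative sketch only; names and constants are assumptions, not Oozie APIs.
final class SlaEventBits {
    // Assumed from the description: the duration flag is the second least-significant bit.
    static final int DURATION_BIT = 0b0010;
    // Assumed from the description: eventProc is 8 (binary 1000) once all events are processed.
    static final int ALL_DONE = 0b1000;

    // Mirrors the check described above: only the second LSB is examined,
    // so a value of 1000 looks like "duration not yet processed".
    static boolean durationLooksUnprocessed(byte eventProc) {
        return (eventProc & DURATION_BIT) == 0;
    }

    // Hypothetical guard following the proposed fix: skip job-end processing
    // when eventProc is 1000.
    static boolean shouldProcessJobEnd(byte eventProc) {
        return eventProc != ALL_DONE;
    }

    public static void main(String[] args) {
        byte eventProc = 8;
        System.out.println(durationLooksUnprocessed(eventProc)); // prints "true"
        System.out.println(shouldProcessJobEnd(eventProc));      // prints "false"
    }
}
```

With eventProc = 8, the bit test alone reports the duration event as unprocessed, which is why it is sent again; the guard returns false for the same value, so the job-end handler would not be called at all.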