Posted to dev@oozie.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2014/05/21 01:53:38 UTC

[jira] [Commented] (OOZIE-1844) HA - Lock mechanism for CoordMaterializeTriggerService ( may be for other services as well)

    [ https://issues.apache.org/jira/browse/OOZIE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004134#comment-14004134 ] 

Hadoop QA commented on OOZIE-1844:
----------------------------------

Testing JIRA OOZIE-1844

Cleaning local git workspace

----------------------------

{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.    {color:green}+1{color} the patch does not introduce any @author tags
.    {color:green}+1{color} the patch does not introduce any tabs
.    {color:green}+1{color} the patch does not introduce any trailing spaces
.    {color:green}+1{color} the patch does not introduce any line longer than 132
.    {color:red}-1{color} the patch does not add/modify any testcase
{color:green}+1 RAT{color}
.    {color:green}+1{color} the patch does not seem to introduce new RAT warnings
{color:green}+1 JAVADOC{color}
.    {color:green}+1{color} the patch does not seem to introduce new Javadoc warnings
{color:green}+1 COMPILE{color}
.    {color:green}+1{color} HEAD compiles
.    {color:green}+1{color} patch compiles
.    {color:green}+1{color} the patch does not seem to introduce new javac warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.    {color:green}+1{color} the patch does not change any JPA Entity/Column/Basic/Lob/Transient annotations
.    {color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.    Tests run: 1447
.    Tests failed: 5
.    Tests errors: 3

.    The patch failed the following testcases:

.      testConcurrencyReachedAndChooseNextEligible(org.apache.oozie.service.TestCallableQueueService)
.      testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration)
.      testBundleEngineKill(org.apache.oozie.servlet.TestV1JobServletBundleEngine)
.      testBundleEngineResume(org.apache.oozie.servlet.TestV1JobServletBundleEngine)
.      testBundleEngineSuspend(org.apache.oozie.servlet.TestV1JobServletBundleEngine)

{color:green}+1 DISTRO{color}
.    {color:green}+1{color} distro tarball builds with the patch 

----------------------------
{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1251/

> HA -  Lock mechanism for CoordMaterializeTriggerService ( may be for other services as well)
> --------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-1844
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1844
>             Project: Oozie
>          Issue Type: Bug
>          Components: HA
>            Reporter: Purshotam Shah
>            Assignee: Purshotam Shah
>         Attachments: OOZIE-1844-V2.patch
>
>
> Currently we check whether a job id belongs to this server by using a modulus operation.
> This may not be the optimal way to do it.
> 1. We do not process the full MATERIALIZATION_SYSTEM_LIMIT; with two servers, each server does only half of the processing. We can always double the limit, but whenever we add a new system we need to restart the whole cluster to increase the limit.
> 2. The job sequence id is shared among wf, coord, and bundle jobs, so there can be more coords with odd ids than with even ids (or vice versa). In that case the load is not distributed evenly, and one server will always do more processing.
> 3. Different coord jobs also have different frequencies. A job with a 1 min or 5 min frequency puts more load on the system. With this approach a particular job always runs on the same system, eventually putting more load on one server.
> A simple way to optimize this may be a lock mechanism: each CoordMaterializeTriggerService obtains a lock and materializes coords. If the lock is held by another system, it waits for that system to release the lock. In this way coord jobs get distributed among servers.
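
The contrast described in the issue can be sketched roughly as follows. This is only an illustration, not the code from the attached patch: the class name MaterializationSketch, both method names, and the sample ids are hypothetical, and an in-process ReentrantLock stands in for whatever distributed lock the servers would actually share in an HA setup.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; names and data are assumptions, not Oozie APIs.
class MaterializationSketch {

    // Modulus scheme: server i owns a coord job when (sequence id % serverCount) == i.
    // Returns how many of the given coord jobs each server would own.
    static int[] modulusLoad(long[] coordSeqIds, int serverCount) {
        int[] load = new int[serverCount];
        for (long id : coordSeqIds) {
            load[(int) Math.floorMod(id, (long) serverCount)]++;
        }
        return load;
    }

    // Lock scheme: a server waits for the lock, then takes up to `limit`
    // pending coord jobs regardless of their ids, and releases the lock.
    // ReentrantLock is a stand-in (assumption) for a cross-server lock.
    static List<Long> materializeWithLock(ReentrantLock lock, Deque<Long> pending, int limit) {
        List<Long> taken = new ArrayList<>();
        lock.lock(); // blocks while another holder has the lock
        try {
            while (taken.size() < limit && !pending.isEmpty()) {
                taken.add(pending.poll());
            }
        } finally {
            lock.unlock();
        }
        return taken;
    }

    public static void main(String[] args) {
        // Made-up coord sequence ids: wf and bundle jobs consumed most even ids.
        long[] coordIds = {1, 3, 5, 7, 9, 10};

        int[] load = modulusLoad(coordIds, 2);
        // prints "modulus: server0=1 server1=5"
        System.out.println("modulus: server0=" + load[0] + " server1=" + load[1]);

        ReentrantLock lock = new ReentrantLock();
        Deque<Long> pending = new ArrayDeque<>();
        for (long id : coordIds) {
            pending.add(id);
        }
        // Two servers take turns; each may take up to the (assumed) limit of 3.
        List<Long> serverA = materializeWithLock(lock, pending, 3);
        List<Long> serverB = materializeWithLock(lock, pending, 3);
        // prints "lock: serverA=[1, 3, 5] serverB=[7, 9, 10]"
        System.out.println("lock: serverA=" + serverA + " serverB=" + serverB);
    }
}
```

With these made-up ids the modulus scheme gives one server five of the six coords, while the lock scheme splits them by whoever holds the lock next, independent of id parity.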



--
This message was sent by Atlassian JIRA
(v6.2#6252)