Posted to dev@oozie.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2015/07/24 16:10:05 UTC

[jira] [Commented] (OOZIE-2314) Unable to kill old instance child job by workflow or coord rerun by Launcher

    [ https://issues.apache.org/jira/browse/OOZIE-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640485#comment-14640485 ] 

Hadoop QA commented on OOZIE-2314:
----------------------------------

Testing JIRA OOZIE-2314

Cleaning local git workspace

----------------------------

{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:green}+1 RAW_PATCH_ANALYSIS{color}
.    {color:green}+1{color} the patch does not introduce any @author tags
.    {color:green}+1{color} the patch does not introduce any tabs
.    {color:green}+1{color} the patch does not introduce any trailing spaces
.    {color:green}+1{color} the patch does not introduce any line longer than 132
.    {color:green}+1{color} the patch adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.    {color:green}+1{color} the patch does not seem to introduce new RAT warnings
{color:green}+1 JAVADOC{color}
.    {color:green}+1{color} the patch does not seem to introduce new Javadoc warnings
{color:green}+1 COMPILE{color}
.    {color:green}+1{color} HEAD compiles
.    {color:green}+1{color} patch compiles
.    {color:green}+1{color} the patch does not seem to introduce new javac warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.    {color:green}+1{color} the patch does not change any JPA Entity/Column/Basic/Lob/Transient annotations
.    {color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.    Tests run: 1682
.    Tests failed: 2
.    Tests errors: 0

.    The patch failed the following testcases:

.      testAdminInstrumentation(org.apache.oozie.client.TestOozieCLI)
.      testStreamingWithMultipleOozieServers_coordActionList(org.apache.oozie.service.TestZKXLogStreamingService)

{color:green}+1 DISTRO{color}
.    {color:green}+1{color} distro tarball builds with the patch 

----------------------------
{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/2457/

> Unable to kill old instance child job by workflow or coord rerun by Launcher
> ----------------------------------------------------------------------------
>
>                 Key: OOZIE-2314
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2314
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Jaydeep Vishwakarma
>            Assignee: Jaydeep Vishwakarma
>            Priority: Critical
>         Attachments: OOZIE-2314.patch
>
>
> The Oozie launcher kills all child jobs that were launched by an old instance of the same launcher, workflow, or coord action, to avoid duplicate child jobs running at the same time. To do this it searches for application ids by tag and time, and kills all the matching AMs. You can find more detail in OOZIE-2129.
> This works fine when a Launcher attempt gets killed and retries. However, when the Yarn container that contains the AM gets killed for some reason and we rerun the workflow/coord action, this mechanism does not work.
>    This happens because of the time filter applied while finding the app ids, which always takes the current time from the server.
>    {{LauncherMapperHelper.java}}
>    {code}
>        public static void setupYarnRestartHandling(JobConf launcherJobConf, Configuration actionConf, String launcherTag)
>                throws NoSuchAlgorithmException {
>            launcherJobConf.setLong(LauncherMainHadoopUtils.OOZIE_JOB_LAUNCH_TIME, System.currentTimeMillis());
>            // Tags are limited to 100 chars so we need to hash them to make sure (the actionId otherwise doesn't have a max length)
>            String tag = getTag(launcherTag);
>            // keeping the oozie.child.mapreduce.job.tags instead of mapreduce.job.tags to avoid killing launcher itself.
>            // mapreduce.job.tags should only go to child job launch by launcher.
>            actionConf.set(LauncherMainHadoopUtils.CHILD_MAPREDUCE_JOB_TAGS, tag);
>        }
>    {code}
> When a user reruns the workflow or coord action, the Launcher picks up the current system time along with the tags, then searches for running application ids to kill. It ends up finding no app ids, because the previous instance of the same workflow/coord was started before the new system time.
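
The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not Oozie's code: the class LaunchTimeFilterSketch, the App type, and findAppsToKill are hypothetical names, and the sketch assumes the lookup keeps only applications that carry the tag and started at or after the recorded launch time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the tag + time lookup; none of these names exist in Oozie.
final class LaunchTimeFilterSketch {

    // A running YARN application, reduced to the fields the filter looks at.
    static final class App {
        final String id;
        final String tag;
        final long startTime;

        App(String id, String tag, long startTime) {
            this.id = id;
            this.tag = tag;
            this.startTime = startTime;
        }
    }

    // Assumed filter: same tag, and started at or after the recorded launch time.
    static List<String> findAppsToKill(List<App> running, String tag, long launchTime) {
        List<String> ids = new ArrayList<String>();
        for (App app : running) {
            if (app.tag.equals(tag) && app.startTime >= launchTime) {
                ids.add(app.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Child job left behind by the first run of the action.
        List<App> running = Arrays.asList(new App("application_1", "tagA", 1000L));

        // Launcher retry: the launch time recorded for the first run (900) is reused,
        // so the old child started after it and is selected for killing.
        System.out.println(findAppsToKill(running, "tagA", 900L));

        // Rerun: the launch time is reset to "now" (5000), which is later than the
        // old child's start time, so the filter selects nothing.
        System.out.println(findAppsToKill(running, "tagA", 5000L));
    }
}
```

Under that assumption, the old child is only found when the launch time used in the filter predates it, which is why resetting the launch time on every rerun leaves the old child running.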


