Posted to dev@reef.apache.org by "Sergiy Matusevych (JIRA)" <ji...@apache.org> on 2016/10/18 22:46:58 UTC

[jira] [Comment Edited] (REEF-1482) IMRU driver does not exit even if all the task exit normally

    [ https://issues.apache.org/jira/browse/REEF-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586869#comment-15586869 ] 

Sergiy Matusevych edited comment on REEF-1482 at 10/18/16 10:46 PM:
--------------------------------------------------------------------

I think we have to build the simplest reproducible test for this issue. One thing I would suggest is to try a Java-only application with minimal functionality, e.g. the {{org.apache.reef.examples.pool.Launch}} application from {{reef-examples}}. All that application does is launch the specified number of tasks, each of which just sleeps for a given number of seconds and then quits. For example, the following command

{code}
.\bin\runreef.ps1 -Jars C:\Users\sergiym.REDMOND\devel\reef\lang\java\reef-tests\target\reef-tests-0.16.0-SNAPSHOT-test-jar-with-dependencies.jar -Class org.apache.reef.examples.pool.Launch -evaluators 4 -tasks 16 -delay 5 -id 100 -local true
{code}

will request 4 evaluators and submit 16 tasks to them (the driver submits a new task to an evaluator as soon as that evaluator is done with the previous one). Each task here simply sleeps for 5 seconds and quits. {{-local true}} runs the job locally; use {{-local false}} to submit it to YARN. {{-id 100}} is simply a numeric ID that helps to distinguish between different experiments.
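
To make the expected behavior concrete, here is a rough plain-Java sketch of the same scheduling pattern. It is not REEF code and does not use any REEF API: {{PoolSketch}} and {{runPool}} are hypothetical names, worker threads stand in for evaluators, and a sleeping {{Callable}} stands in for a task. It only shows the shape of the test: a fixed number of workers, each picking up the next task as soon as it finishes the previous one, and a coordinator that should terminate once every task has completed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in, NOT REEF code: worker threads play the role of evaluators.
final class PoolSketch {

  /** Runs numTasks sleep-only tasks on numEvaluators workers; returns how many completed. */
  static int runPool(final int numEvaluators, final int numTasks, final long delayMillis) {
    final ExecutorService evaluators = Executors.newFixedThreadPool(numEvaluators);
    try {
      final List<Future<Integer>> pending = new ArrayList<>();
      for (int i = 0; i < numTasks; ++i) {
        final int taskId = i;
        // Queued tasks are handed to a worker as soon as that worker is free.
        pending.add(evaluators.submit(() -> {
          Thread.sleep(delayMillis); // the "task" only sleeps, then quits
          return taskId;
        }));
      }
      int completed = 0;
      for (final Future<Integer> task : pending) {
        task.get(); // block until this task has finished
        ++completed;
      }
      return completed;
    } catch (final InterruptedException | ExecutionException e) {
      throw new IllegalStateException("Task pool did not finish cleanly", e);
    } finally {
      // Release the workers; idle non-daemon pool threads would otherwise keep the JVM alive.
      evaluators.shutdown();
    }
  }

  public static void main(final String[] args) {
    final long start = System.nanoTime();
    // Same shape as the command above, but with a 100 ms delay to keep the run short.
    final int completed = runPool(4, 16, 100L);
    final long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
    System.out.println(completed + " tasks completed in about " + elapsedMillis + " ms");
  }
}
```

In this sketch, 16 tasks on 4 workers run in 4 rounds, so the total run time should be roughly 4 times the per-task delay. If the real {{pool.Launch}} job follows the same pattern, the command above should take on the order of 20 seconds plus startup overhead, and a driver that is still alive long after that would be a sign of the hang.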


> IMRU driver does not exit even if all the task exit normally
> ------------------------------------------------------------
>
>                 Key: REEF-1482
>                 URL: https://issues.apache.org/jira/browse/REEF-1482
>             Project: REEF
>          Issue Type: Bug
>          Components: REEF.NET
>         Environment: C#
>            Reporter: Dhruv Mahajan
>
> Recently, when running IMRU with a large number of mappers, we have intermittently observed that the IMRU driver does not exit even though all the tasks (map and update) exit normally without any issues.
> The aim of this JIRA is to fix it.


