Posted to dev@reef.apache.org by "Julia (JIRA)" <ji...@apache.org> on 2017/05/12 02:30:04 UTC

[jira] [Updated] (REEF-1797) Driver not shut down when running 1000 nodes on cluster for IMRU Example

     [ https://issues.apache.org/jira/browse/REEF-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julia updated REEF-1797:
------------------------
    Description: 
When I run the IMRU example with 1000 nodes on a cluster using the latest master bits, the driver cannot be shut down. According to the detailed logs, not all CompletedEvaluator events are received (1 or 2 are missing, varying between test runs), even though all CompletedTask events are received and our code has called Dispose() on all active contexts.

The same test with 1000 nodes on the REEF bits from last December shuts down successfully.

  was:
When I run the IMRU example with 1000 nodes on a cluster using the latest master bits, the driver cannot be shut down. According to the detailed logs, not all CompletedEvaluator events are received (1 or 2 are missing, varying between test runs), even though all CompletedTask events are received and our code has called Dispose() on all active contexts.

The same test with 100 nodes on the REEF bits from last December shuts down successfully.


> Driver not shut down when running 1000 nodes on cluster for IMRU Example
> ------------------------------------------------------------------------
>
>                 Key: REEF-1797
>                 URL: https://issues.apache.org/jira/browse/REEF-1797
>             Project: REEF
>          Issue Type: Bug
>            Reporter: Julia
>
> When I run the IMRU example with 1000 nodes on a cluster using the latest master bits, the driver cannot be shut down. According to the detailed logs, not all CompletedEvaluator events are received (1 or 2 are missing, varying between test runs), even though all CompletedTask events are received and our code has called Dispose() on all active contexts.
> The same test with 1000 nodes on the REEF bits from last December shuts down successfully.


