Posted to dev@reef.apache.org by "Sergiy Matusevych (JIRA)" <ji...@apache.org> on 2017/05/15 22:12:04 UTC

[jira] [Issue Comment Deleted] (REEF-1796) All REEF YARN jobs end with FORCE_CLOSED status on the client side

     [ https://issues.apache.org/jira/browse/REEF-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergiy Matusevych updated REEF-1796:
------------------------------------
    Comment: was deleted

(was: Looks like we have a Driver setup issue. I've just noticed the following message in HelloREEF Driver log on YARN:
{code}
2017-05-12 14:14:30,346 FINE reef.runtime.common.driver.client.RemoteClientJobStatusHandler.<init> main | Instantiated 'RemoteClientJobStatusHandler' without an actual connection to the client.
{code}
So it's not that we close the connection prematurely - it is likely we don't have it at all :))

> All REEF YARN jobs end with FORCE_CLOSED status on the client side
> ------------------------------------------------------------------
>
>                 Key: REEF-1796
>                 URL: https://issues.apache.org/jira/browse/REEF-1796
>             Project: REEF
>          Issue Type: Bug
>            Reporter: Sergiy Matusevych
>            Priority: Critical
>              Labels: bug, yarn
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> It looks like the connection between the REEF driver and the REEF client is closed prematurely, on either the client or the driver side, when using the YARN runtime. As a result, the REEF driver fails to communicate its final status to the client, and the client times out with {{FORCE_CLOSED}} status. That causes *all* unit tests to fail on YARN. The logs show that REEF jobs complete normally on YARN; only the final status message fails to reach the client.
> To reproduce, run e.g. the HelloREEF application on YARN:
> {code}
> .\bin\runreef.ps1 -VerboseLog -Jars .\lang\java\reef-examples\target\reef-examples-0.16.0-SNAPSHOT-shaded.jar -Class org.apache.reef.examples.hello.HelloREEFYarn
> {code}
> or run the unit tests:
> {code}
> .\bin\runtests.ps1 -Yarn -Jars ".\lang\java\reef-examples\target\reef-examples-0.16.0-SNAPSHOT-shaded.jar;.\lang\java\reef-tests\target\reef-tests-0.16.0-SNAPSHOT-test-jar-with-dependencies.jar"
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)