Posted to issues@tez.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2023/02/13 13:06:00 UTC

[jira] [Updated] (TEZ-4475) VertexStatus is missing in TestLocalMode if DAG finishes too quickly - causing NPE in unit test

     [ https://issues.apache.org/jira/browse/TEZ-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

László Bodor updated TEZ-4475:
------------------------------
    Fix Version/s: 0.10.3

> VertexStatus is missing in TestLocalMode if DAG finishes too quickly - causing NPE in unit test
> -----------------------------------------------------------------------------------------------
>
>                 Key: TEZ-4475
>                 URL: https://issues.apache.org/jira/browse/TEZ-4475
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>             Fix For: 0.10.3
>
>
> This problem is very hard to reproduce, but when I managed to reproduce it, the logs looked like this:
> {code}
> 2023-02-13 11:32:16,302 INFO  [DAGAppMaster Thread] app.DAGAppMaster (DAGAppMaster.java:startDAG(2545)) - Running DAG: testMultipleClientsWithoutSession2_useDfs
> ...
> 2023-02-13 11:32:16,406 INFO  [Thread-675] client.DAGClientImpl (DAGClientImpl.java:getVertexStatusInternal(280)) - getVertexStatusInternal for Sleep, dagCompleted: true, in cache: false
> {code}
> The latter log message was added [here|https://github.com/apache/tez/blob/e3e91a150dad44a9daa3102da04542e2e365203d/tez-api/src/main/java/org/apache/tez/dag/api/client/DAGClientImpl.java#L305] as:
> {code}
>     LOG.info("getVertexStatusInternal for {}, dagCompleted: {}, in cache: {}", vertexName, dagCompleted,
>         cachedVertexStatus.containsKey(vertexName));
> {code}
> So the DAG had already completed, but no vertex status updates had arrived yet (the cache was empty), and the unit test failed with an unhelpful NPE.
> This bug has always been there, but it was only exposed by TEZ-4447.
> The easiest fix is to wait for DAG completion with a Tez client call that also collects vertex status, i.e. waitForCompletionWithStatusUpdates (instead of waitForCompletion).
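
The race described above can be illustrated with a toy model. This is only a sketch, not Tez code: ToyDagClient, its fields, and the status strings are hypothetical stand-ins, and the two wait methods only borrow the names from the description. The assumption is that once the DAG has completed, vertex status can only be answered from the client-side cache, so a cache that was never filled yields null.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class VertexStatusRaceSketch {

    /** Hypothetical stand-in for a DAG client; NOT the real DAGClientImpl. */
    static class ToyDagClient {
        // Client-side cache of the last known status per vertex (assumed design).
        private final Map<String, String> cachedVertexStatus = new ConcurrentHashMap<>();
        private final String vertexName;
        private volatile boolean dagCompleted = false;

        ToyDagClient(String vertexName) {
            this.vertexName = vertexName;
        }

        /** Waits for completion only; vertex status is never fetched, so the cache stays empty. */
        void waitForCompletion() {
            dagCompleted = true;
        }

        /** Waits for completion and records the vertex status before the DAG is marked done. */
        void waitForCompletionWithStatusUpdates() {
            cachedVertexStatus.put(vertexName, "SUCCEEDED");
            dagCompleted = true;
        }

        /** After completion only the cache can answer; a cache miss returns null. */
        String getVertexStatus(String name) {
            if (dagCompleted) {
                return cachedVertexStatus.get(name);
            }
            return "RUNNING";
        }
    }

    public static void main(String[] args) {
        ToyDagClient racy = new ToyDagClient("Sleep");
        racy.waitForCompletion();
        // A test that dereferences this result without a null check would throw an NPE.
        System.out.println("after waitForCompletion: " + racy.getVertexStatus("Sleep"));

        ToyDagClient safe = new ToyDagClient("Sleep");
        safe.waitForCompletionWithStatusUpdates();
        System.out.println("after waitForCompletionWithStatusUpdates: " + safe.getVertexStatus("Sleep"));
    }
}
```

In this model the first client returns null for the "Sleep" vertex, while the second returns a cached status, which is the behaviour the proposed fix relies on.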


