Posted to issues@tez.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2022/08/29 11:52:00 UTC
[jira] [Resolved] (TEZ-4440) When tez app run in yarn fed cluster, may throw NPE
[ https://issues.apache.org/jira/browse/TEZ-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor resolved TEZ-4440.
-------------------------------
Resolution: Fixed
> When tez app run in yarn fed cluster, may throw NPE
> ---------------------------------------------------
>
> Key: TEZ-4440
> URL: https://issues.apache.org/jira/browse/TEZ-4440
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: zhengchenyu
> Assignee: zhengchenyu
> Priority: Major
> Fix For: 0.9.3, 0.10.3
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> For Hadoop versions before YARN-8933, when a Tez app runs in a YARN federation cluster, getAvailableResources may return null, which then causes an NPE.
> {code:java}
> 2022-08-03 01:40:12,069 [ERROR] [AMRM Callback Handler Thread] |rm.YarnTaskSchedulerService|: Got Error from RMClient
> java.lang.NullPointerException
> at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.fitsIn(YarnTaskSchedulerService.java:1445)
> at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.preemptIfNeeded(YarnTaskSchedulerService.java:1218)
> at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.getProgress(YarnTaskSchedulerService.java:916)
> at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:428)
> 2022-08-03 01:40:12,075 [ERROR] [AMRM Callback Handler Thread] |yarn.YarnUncaughtExceptionHandler|: Thread Thread[AMRM Callback Handler Thread,5,main] threw an Exception.
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.NullPointerException
> at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:432)
> Caused by: java.lang.NullPointerException
> at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.fitsIn(YarnTaskSchedulerService.java:1445)
> at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.preemptIfNeeded(YarnTaskSchedulerService.java:1218)
> at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.getProgress(YarnTaskSchedulerService.java:916)
> at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:428){code}
> In YARN federation, AMRMProxy connects to multiple RMs asynchronously, so AllocateResponse::getAvailableResources may return null, which then causes an NPE.
> In my PR, I replace a null result with Resource.newInstance(0, 0). Because null may mean that YARN is busy, returning 0 is reasonable.
>
>
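The null-to-zero substitution described above can be illustrated with a minimal sketch. It is not the actual patch: `SimpleResource` is a hypothetical stand-in for YARN's `Resource` record so the example compiles without Hadoop on the classpath, and `HeadroomGuard`, `headroomOrZero`, and this simplified `fitsIn` are illustrative names only.

```java
// Hypothetical stand-in for org.apache.hadoop.yarn.api.records.Resource,
// so the sketch compiles without Hadoop on the classpath.
final class SimpleResource {
    final long memoryMb;
    final int vCores;

    SimpleResource(long memoryMb, int vCores) {
        this.memoryMb = memoryMb;
        this.vCores = vCores;
    }
}

final class HeadroomGuard {
    private static final SimpleResource ZERO = new SimpleResource(0, 0);

    // Treat a missing headroom report as "nothing available right now".
    static SimpleResource headroomOrZero(SimpleResource reported) {
        return reported != null ? reported : ZERO;
    }

    // True when the request fits into the available resources.
    // Assumed simplification of the comparison done by the scheduler.
    static boolean fitsIn(SimpleResource request, SimpleResource available) {
        return request.memoryMb <= available.memoryMb
            && request.vCores <= available.vCores;
    }

    public static void main(String[] args) {
        SimpleResource request = new SimpleResource(1024, 1);
        // Simulates an allocate response that carries no headroom.
        SimpleResource reported = null;
        SimpleResource headroom = headroomOrZero(reported);
        // No NPE: the request simply does not fit into zero headroom.
        System.out.println(fitsIn(request, headroom)); // prints "false"
    }
}
```

With the guard in place, a missing headroom value is handled like an empty one, so the comparison returns false instead of dereferencing null on the callback thread.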
--
This message was sent by Atlassian Jira
(v8.20.10#820010)