Posted to commits@samza.apache.org by "Jake Maes (JIRA)" <ji...@apache.org> on 2016/03/30 02:07:25 UTC

[jira] [Created] (SAMZA-922) Host Affinity - Bug in SamzaContainerRequest causes (recoverable) exceptions in YARN

Jake Maes created SAMZA-922:
-------------------------------

             Summary: Host Affinity - Bug in SamzaContainerRequest causes (recoverable) exceptions in YARN
                 Key: SAMZA-922
                 URL: https://issues.apache.org/jira/browse/SAMZA-922
             Project: Samza
          Issue Type: Bug
            Reporter: Jake Maes
            Assignee: Jake Maes
             Fix For: 0.10.1


The constructor for SamzaContainerRequest creates the Yarn container request differently depending on whether there is a preferred host. Unfortunately, it checks only for preferredHost == null and not for preferredHost.equals(ANY_HOST), even though ANY_HOST is the string passed when there is no preferred host.
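
The missing check can be illustrated with a minimal, self-contained sketch. This is not the actual SamzaContainerRequest code: the class name PreferredHostSketch and the helper nodesFor are hypothetical, and the value of the ANY_HOST constant is assumed from the host name that shows up in the log. The helper returns the node list that would be handed to the Yarn container request, with null meaning "no host constraint".

```java
import java.util.Arrays;

// Hypothetical sketch, not the real SamzaContainerRequest.
final class PreferredHostSketch {
    // Assumed sentinel value, taken from the host name seen in the log.
    static final String ANY_HOST = "ANY_HOST";

    // Returns the node list for the Yarn request, or null when there is
    // no preferred host (either a null reference or the ANY_HOST sentinel).
    static String[] nodesFor(String preferredHost) {
        // Constant-first equals() is null-safe, so the explicit null check
        // is kept only to make both "no preference" cases visible.
        if (preferredHost == null || ANY_HOST.equals(preferredHost)) {
            return null;
        }
        return new String[] {preferredHost};
    }

    public static void main(String[] args) {
        // "host-1.example.com" is a made-up host name for illustration.
        System.out.println(Arrays.toString(nodesFor(null)));
        System.out.println(Arrays.toString(nodesFor(ANY_HOST)));
        System.out.println(Arrays.toString(nodesFor("host-1.example.com")));
    }
}
```

With a check like this, the sentinel is translated before the request is built, so only real host names would reach Yarn.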

As a result, the Yarn container request actually asks for a container on the host name "ANY_HOST", which causes the following exception:

2016-03-29 21:25:53.892 [main] ScriptBasedMapping [WARN] Exception running /OMITTED/sbin/yarn-topology.py ANY_HOST 
java.io.IOException: Cannot run program "/OMITTED/application_1452292535523_0047/container_1452292535523_0047_02_000001"): error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1042)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:485)
	at org.apache.hadoop.util.Shell.run(Shell.java:455)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
	at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.runResolveCommand(ScriptBasedMapping.java:251)
	at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.resolve(ScriptBasedMapping.java:188)
	at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:119)
	at org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:101)
	at org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:95)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.resolveRacks(AMRMClientImpl.java:551)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.addContainerRequest(AMRMClientImpl.java:411)
	at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.addContainerRequest(AMRMClientAsyncImpl.java:166)
	at org.apache.samza.job.yarn.ContainerRequestState.updateRequestState(ContainerRequestState.java:82)
	at org.apache.samza.job.yarn.AbstractContainerAllocator.requestContainer(AbstractContainerAllocator.java:102)
	at org.apache.samza.job.yarn.AbstractContainerAllocator.requestContainers(AbstractContainerAllocator.java:85)
	at org.apache.samza.job.yarn.SamzaTaskManager.onInit(SamzaTaskManager.java:112)
	at org.apache.samza.job.yarn.SamzaAppMaster$$anonfun$run$1.apply(SamzaAppMaster.scala:117)
	at org.apache.samza.job.yarn.SamzaAppMaster$$anonfun$run$1.apply(SamzaAppMaster.scala:117)
	at scala.collection.immutable.List.foreach(List.scala:318)
	at org.apache.samza.job.yarn.SamzaAppMaster$.run(SamzaAppMaster.scala:117)
	at org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:104)
	at org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
	at java.lang.ProcessImpl.start(ProcessImpl.java:134)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1023)

The exception is recoverable when relaxed locality is true, because Yarn just defaults to a random host on the default rack, which was the desired result of the ANY_HOST request. However, the behavior is incorrect and the stack traces tend to fill the log.

The string "ANY_HOST" is internal to Samza and Yarn should never see it. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)