Posted to yarn-issues@hadoop.apache.org by "Sunil Govindan (JIRA)" <ji...@apache.org> on 2018/10/16 07:09:00 UTC

[jira] [Commented] (YARN-8879) Kerberos principal is needed when submitting a submarine job

    [ https://issues.apache.org/jira/browse/YARN-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651242#comment-16651242 ] 

Sunil Govindan commented on YARN-8879:
--------------------------------------

[~yuan_zac] With this change, we will double-validate that the principal name is not empty. However, what happens when the keytab is not set?

{{kerberosPrincipal.getKeytab()}}

Do we need to validate this as well?
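
To make the question concrete, here is a rough sketch of checking both fields together. It is not the code in YARN-8879.001.patch or in ServiceApiUtil: the {{Principal}} class is a hypothetical stand-in that only mirrors the {{getPrincipalName()}} and {{getKeytab()}} getters, and the rule that "neither value set" means Kerberos was not requested is an assumption made for illustration.

```java
// Sketch only: a self-contained stand-in, not the real Hadoop classes.
class KerberosCheckSketch {

    /** Hypothetical stand-in for the service's Kerberos record; only the two getters matter. */
    static final class Principal {
        private final String principalName;
        private final String keytab;

        Principal(String principalName, String keytab) {
            this.principalName = principalName;
            this.keytab = keytab;
        }

        String getPrincipalName() { return principalName; }
        String getKeytab() { return keytab; }
    }

    /** True for null, empty, or whitespace-only strings. */
    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    /**
     * Assumed policy: if neither value is set, Kerberos was not requested and
     * the submission proceeds. If only one is set, fail and say which is missing.
     */
    static void validate(Principal p) {
        if (p == null) {
            return; // no Kerberos settings supplied at all
        }
        boolean noName = isBlank(p.getPrincipalName());
        boolean noKeytab = isBlank(p.getKeytab());
        if (noName && noKeytab) {
            return; // assumption: treated as a non-Kerberos submission
        }
        if (noName) {
            throw new IllegalArgumentException("Kerberos principal name is missing.");
        }
        if (noKeytab) {
            throw new IllegalArgumentException("Kerberos keytab is missing.");
        }
    }

    public static void main(String[] args) {
        validate(new Principal(null, null)); // passes under the assumed policy
        try {
            // Principal name present, keytab absent: the case asked about above.
            validate(new Principal("user/_HOST@EXAMPLE.COM", null));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Reporting the two missing values separately would also make the error easier to act on than a single combined "principal or keytab is missing" message.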

> Kerberos principal is needed when submitting a submarine job
> ------------------------------------------------------------
>
>                 Key: YARN-8879
>                 URL: https://issues.apache.org/jira/browse/YARN-8879
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Zac Zhou
>            Assignee: Zac Zhou
>            Priority: Major
>         Attachments: YARN-8879.001.patch
>
>
> When I submitted a submarine job like this:
> {code:java}
>  ./yarn jar /home/hadoop/hadoop-current/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar job run \
>  --env DOCKER_JAVA_HOME=/opt/java \
>  --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.1.0 --name distributed-tf-gpu \
>  --env YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=calico-network \
>  --worker_docker_image 10.120.196.232:5000/gpu-cuda9.0-tf1.8.0-with-models-7 \
>  --input_path hdfs://mldev/tmp/cifar-10-data \
>  --checkpoint_path hdfs://mldev/user/hadoop/tf-distributed-checkpoint \
>  --num_ps 1 \
>  --ps_resources memory=4G,vcores=2,gpu=0 \
>  --ps_launch_cmd "python /test/cifar10_estimator/cifar10_main.py --data-dir=hdfs://mldev/tmp/cifar-10-data --job-dir=hdfs://mldev/tmp/cifar-10-jobdir --num-gpus=0" \
>  --ps_docker_image 10.120.196.232:5000/dockerfile-cpu-tf1.8.0-with-models \
>  --worker_resources memory=4G,vcores=2,gpu=1 --verbose \
>  --num_workers 2 \
>  --worker_launch_cmd "python /test/cifar10_estimator/cifar10_main.py --data-dir=hdfs://mldev/tmp/cifar-10-data --job-dir=hdfs://mldev/tmp/cifar-10-jobdir --train-steps=500 --eval-batch-size=16 --train-batch-size=16 --sync --num-gpus=1"  {code}
>  
> I got the following error:
> {code:java}
> Exception in thread "main" java.lang.IllegalArgumentException: Kerberos principal or keytab is missing.
> at org.apache.hadoop.yarn.service.utils.ServiceApiUtil.validateKerberosPrincipal(ServiceApiUtil.java:255)
> at org.apache.hadoop.yarn.service.utils.ServiceApiUtil.validateAndResolveService(ServiceApiUtil.java:134)
> at org.apache.hadoop.yarn.service.client.ServiceClient.actionCreate(ServiceClient.java:467)
> at org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter.submitJob(YarnServiceJobSubmitter.java:542)
> at org.apache.hadoop.yarn.submarine.client.cli.RunJobCli.run(RunJobCli.java:231)
> at org.apache.hadoop.yarn.submarine.client.cli.Cli.main(Cli.java:94)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:236){code}


