Posted to yarn-issues@hadoop.apache.org by "Zac Zhou (JIRA)" <ji...@apache.org> on 2018/08/24 01:48:00 UTC

[jira] [Comment Edited] (YARN-8698) [Submarine] Failed to add hadoop dependencies in docker container when submitting a submarine job

    [ https://issues.apache.org/jira/browse/YARN-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591040#comment-16591040 ] 

Zac Zhou edited comment on YARN-8698 at 8/24/18 1:47 AM:
---------------------------------------------------------

Hi [~tangzhankun],

yeah, I specified "DOCKER_HADOOP_HDFS_HOME" as /hadoop-3.1.0, which is the hadoop home directory in the docker image.

In docker, DOCKER_HADOOP_HDFS_HOME takes effect, but it is not enough.

I think you can reproduce this even without a docker env.

When you specify only HADOOP_HDFS_HOME, it works well, as follows:
{code:java}
hadoop@hostname:~/zq/submarine-lib/hadoop-3.2.0-SNAPSHOT$ export HADOOP_HDFS_HOME=/home/hadoop/zq/submarine-lib/hadoop-3.2.0-SNAPSHOT
hadoop@hostname:~/zq/submarine-lib/hadoop-3.2.0-SNAPSHOT$ echo $HADOOP_HDFS_HOME
/home/hadoop/zq/submarine-lib/hadoop-3.2.0-SNAPSHOT
hadoop@hostname:~/zq/submarine-lib/hadoop-3.2.0-SNAPSHOT$ ./bin/hadoop classpath --glob
HADOOP_SUBCMD_SUPPORTDAEMONIZATION: false
HADOOP_SUBCMD_SECURESERVICE: false
HADOOP_DAEMON_MODE: default
/home/hadoop/yarn-submarine/etc/hadoop:/home/hadoop/yarn-submarine/share/hadoop/common/lib/jaxb-api-2.2.11.jar:/home/hadoop/yarn-submarine/share/hadoop/common/lib/commons-lang3-3.7.jar:/home/hadoop/yarn-submarine/share/hadoop/common/lib/gson-2.2.4.jar:/home/hadoop/yarn-submarine/share/hadoop/common/lib/paranamer-2.3.jar:
{code}
But if you specify a wrong HADOOP_COMMON_HOME together with a correct HADOOP_HDFS_HOME, it will fail:
{code:java}
hadoop@hostname:~/zq/submarine-lib/hadoop-3.2.0-SNAPSHOT$ export HADOOP_COMMON_HOME=/home/hadoop
hadoop@hostname:~/zq/submarine-lib/hadoop-3.2.0-SNAPSHOT$ ./bin/hadoop classpath --glob
HADOOP_SUBCMD_SUPPORTDAEMONIZATION: false
HADOOP_SUBCMD_SECURESERVICE: false
HADOOP_DAEMON_MODE: default
Error: Could not find or load main class org.apache.hadoop.util.Classpath
{code}
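The failure above comes from how the hadoop launcher resolves its classpath: roughly speaking, the common jars are looked up under HADOOP_COMMON_HOME, so a wrong value there breaks classpath resolution even when HADOOP_HDFS_HOME is correct. As a minimal sketch (the check_home helper is hypothetical, not part of Hadoop's scripts), a launch script could sanity-check each home directory before invoking bin/hadoop:

```shell
# Hypothetical pre-flight check (illustrative, not actual Hadoop code):
# verify that a HADOOP_*_HOME variable really contains the expected jar
# directory, since a wrong HADOOP_COMMON_HOME produces
# "Error: Could not find or load main class org.apache.hadoop.util.Classpath".
check_home() {
  name="$1"; dir="$2"; subdir="$3"
  if [ ! -d "$dir/$subdir" ]; then
    echo "warning: $name=$dir has no $subdir; classpath will be incomplete"
    return 1
  fi
  return 0
}

# Example: /home/hadoop has no share/hadoop/common, so this prints a warning.
check_home HADOOP_COMMON_HOME /home/hadoop share/hadoop/common || true
```

This mirrors the failing case above: the common jars live under the real install tree, not under /home/hadoop, so pointing HADOOP_COMMON_HOME there leaves the Classpath utility class unreachable.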
 

 


> [Submarine] Failed to add hadoop dependencies in docker container when submitting a submarine job
> -------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8698
>                 URL: https://issues.apache.org/jira/browse/YARN-8698
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zac Zhou
>            Assignee: Zac Zhou
>            Priority: Major
>         Attachments: YARN-8698.001.patch
>
>
> When a standalone submarine tf job is submitted, the following error occurs:
> INFO:tensorflow:image after unit resnet/tower_0/fully_connected/: (?, 11)
>  INFO:tensorflow:Done calling model_fn.
>  INFO:tensorflow:Create CheckpointSaverHook.
>  hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, kerbTicketCachePath=(NULL), userNa
>  me=(NULL)) error:
>  (unable to get root cause for java.lang.NoClassDefFoundError)
>  (unable to get stack trace for java.lang.NoClassDefFoundError)
>  hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, kerbTicketCachePath=(NULL), userNa
>  me=(NULL)) error:
>  (unable to get root cause for java.lang.NoClassDefFoundError)
>  (unable to get stack trace for java.lang.NoClassDefFoundError)
>  
> This error may be related to the hadoop classpath.
> Hadoop env variables of launch_container.sh are as follows:
> export HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/home/hadoop/yarn-submarine"}
>  export HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-"/home/hadoop/yarn-submarine"}
>  export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/home/hadoop/yarn-submarine/conf"}
>  export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-"/home/hadoop/yarn-submarine"}
>  export HADOOP_HOME=${HADOOP_HOME:-"/home/hadoop/yarn-submarine"}
>  
> run-PRIMARY_WORKER.sh looks like:
> export HADOOP_YARN_HOME=
>  export HADOOP_HDFS_HOME=/hadoop-3.1.0
>  export HADOOP_CONF_DIR=$WORK_DIR
>  
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org