Posted to hdfs-user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2013/11/13 18:00:43 UTC

Issue with CLASSPATH environment wildcard expansion and application master

We are trying to get the appropriate jars into our AM's CLASSPATH, but are running into an issue. We are building on the distributed shell sample code. Feel free to direct me to "the right way to do this" if our approach is incorrect or the best practice has been revised. All we need are the default Hadoop jars plus our AM's jar.

I am running HDP 2.2.0.2.0.6.0-76 and developing a YARN application that builds on the distributed shell example.

The code for constructing the classpath is derived from the distributed shell example:
    Map<String, String> env = new HashMap<String, String>();
    // Add AppMaster.jar location to classpath
    // At some point we should not be required to add
    // the hadoop specific classpaths to the env.
    // It should be provided out of the box.
    // For now setting all required classpaths including
    // the classpath to "." for the application jar
    StringBuilder classPathEnv = new StringBuilder("${CLASSPATH}:./*");
    for (String c : conf.getStrings(
        YarnConfiguration.YARN_APPLICATION_CLASSPATH,
        YarnConfiguration.DEFAULT_YARN_APPLICATION_CLASSPATH)) {
      classPathEnv.append(':');
      classPathEnv.append(c.trim());
    }
    classPathEnv.append(":./log4j.properties");

    env.put("CLASSPATH", classPathEnv.toString());

    amContainer.setEnvironment(env);

It produces a string that looks something like this:
"$HADOOP_COMMON_HOME/share/hadoop/common/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/*: ..."

When I submit the application on a single-node cluster, the classpath in the Application Master, as reported by the system property "java.class.path", has all wildcards expanded and is very long. This classpath is correct and the Application Master runs properly.
When I submit the same application to a 4-node cluster running the same version of HDP, "java.class.path" still contains "*" characters that have not been expanded into the list of jar files in the named directory. As a result, I get "class not found" exceptions.
In other words, on the 4-node cluster the value of "yarn.application.classpath" appears as-is in "java.class.path" with no wildcard expansion, whereas on the single-node cluster the same value appears fully expanded.
Is there perhaps a problem in our 4-node cluster configuration? Or is there possibly a bug in the YARN implementation for this setup?
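
To make the difference between the two clusters easier to compare, a small check along these lines could be run inside the Application Master. This is only a sketch: ClasspathCheck and unexpandedEntries are hypothetical names, not part of Hadoop or the distributed shell sample. It lists the "java.class.path" entries that still end in a literal "*" and reports whether the named directory exists on the node. The directory check is just a guess at one possible cause, not a confirmed diagnosis.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Hadoop or the sample code.
final class ClasspathCheck {

    // Returns the classpath entries that still end in a literal "*",
    // i.e. wildcard entries that were not expanded into jar paths.
    static List<String> unexpandedEntries(String classpath, String separator) {
        List<String> result = new ArrayList<>();
        if (classpath == null || classpath.isEmpty()) {
            return result;
        }
        // Quote the separator so it is treated literally, not as a regex.
        for (String entry : classpath.split(Pattern.quote(separator))) {
            String trimmed = entry.trim();
            if (trimmed.endsWith("*")) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String cp = System.getProperty("java.class.path");
        for (String entry : unexpandedEntries(cp, File.pathSeparator)) {
            // Strip the trailing "*" to get the directory the wildcard names.
            File dir = new File(entry.substring(0, entry.length() - 1));
            System.out.println("unexpanded: " + entry
                + " (directory exists: " + dir.isDirectory() + ")");
        }
        // The raw environment value, for comparison with java.class.path.
        System.out.println("CLASSPATH env: " + System.getenv("CLASSPATH"));
    }
}
```

Calling ClasspathCheck.main (or just unexpandedEntries) early in the AM on both clusters and comparing the container logs should show which entries differ between the two setups.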

Thanks
John