Posted to dev@reef.apache.org by Andrey Meleshko <an...@microsoft.com> on 2016/11/02 16:02:24 UTC

YarnClasspathProvider initialization: cluster problem?

We are facing what looks like a cluster configuration problem,
and I am hoping someone on the list can help with the next investigation steps.

We see the following error in the driver logs: SEVERE: YarnConfiguration.YARN_APPLICATION_CLASSPATH is empty. This indicates a broken cluster configuration

This comes from the YarnClasspathProvider initialization code, when the property value is not present.
We've checked the HADOOP_HOME variable and the yarn-site.xml content: both seem to be correct.
Does anyone know how this property is initialized and what to check next in the cluster?

Related questions:

1)  DEFAULT_YARN_CROSS_PLATFORM_APPLICATION_CLASSPATH usage:
I can see that the default classpath property is set to some value, but it is used only if YARN_APPLICATION_CLASSPATH is present, and ignored otherwise.
Would it make sense to use the default classpath in any case, if it's present?

YarnConfiguration.DEFAULT_YARN_CROSS_PLATFORM_APPLICATION_CLASSPATH is [{{HADOOP_CONF_DIR}}|{{HADOOP_COMMON_HOME}}/share/hadoop/common/*|{{HADOOP_COMMON_HOME}}/share/hadoop/common/lib/*|{{HADOOP_HDFS_HOME}}/share/hadoop/hdfs/*|{{HADOOP_HDFS_HOME}}/share/hadoop/hdfs/lib/*|{{HADOOP_YARN_HOME}}/share/hadoop/yarn/*|{{HADOOP_YARN_HOME}}/share/hadoop/yarn/lib/*]


2)  DEFAULT_YARN_APPLICATION_CLASSPATH usage:
This property is not used at all in YarnClasspathProvider.
I am not sure if it's initialized in my case, but it sounds like something YarnClasspathProvider should fall back on if YARN_APPLICATION_CLASSPATH is not set.

Thank you,
Andrey

Re: YarnClasspathProvider initialization: cluster problem?

Posted by Markus Weimer <ma...@weimo.de>.
On Wed, Nov 2, 2016 at 9:02 AM, Andrey Meleshko <an...@microsoft.com> wrote:
> We see the following error in the driver logs: SEVERE: YarnConfiguration.YARN_APPLICATION_CLASSPATH is empty. This indicates a broken cluster configuration

YARN has changed the way the classpath is assembled. It used to be
based on environment variables (`HADOOP_HOME` and friends). Now, the
classpath to be used by applications is provided as part of the YARN
configuration.

REEF supports both mechanisms and chooses between them based on the
presence of the field `YarnConfiguration.YARN_APPLICATION_CLASSPATH`,
that is, on whether the YARN class `YarnConfiguration` has a
`public static final String YARN_APPLICATION_CLASSPATH`. If it is
present, we can assume we are on a current version of YARN and rely
on the configuration stored under this key for the Driver and
Evaluator classpath. If it is not present, we construct a legacy
classpath based on the environment variables.
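
To make the field-presence check concrete, here is a minimal Java
sketch of how such a test could be done with reflection. It is an
illustration only, not REEF's actual code: `NewStyleConfig` and
`LegacyConfig` are hypothetical stand-ins for two versions of
`YarnConfiguration`, and `hasPublicStaticStringField` is a made-up
helper name.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class FieldPresenceCheck {

  // Hypothetical stand-in for a newer YarnConfiguration that declares the constant.
  static final class NewStyleConfig {
    public static final String YARN_APPLICATION_CLASSPATH = "yarn.application.classpath";
  }

  // Hypothetical stand-in for an older YarnConfiguration without the constant.
  static final class LegacyConfig {
  }

  /** Returns true if clazz exposes a public static final String field with the given name. */
  static boolean hasPublicStaticStringField(final Class<?> clazz, final String name) {
    try {
      final Field field = clazz.getField(name); // looks up public fields only
      final int mods = field.getModifiers();
      return Modifier.isStatic(mods)
          && Modifier.isFinal(mods)
          && field.getType() == String.class;
    } catch (final NoSuchFieldException e) {
      return false; // field absent: treat as the legacy case
    }
  }

  public static void main(final String[] args) {
    final String name = "YARN_APPLICATION_CLASSPATH";
    System.out.println("new style: " + hasPublicStaticStringField(NewStyleConfig.class, name));
    System.out.println("legacy:    " + hasPublicStaticStringField(LegacyConfig.class, name));
  }
}
```

A lookup by name like this needs no compile-time reference to the
constant, so the same code can be built against either generation of
the class.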

You get the above warning in the case where the version of YARN used
is new enough to have `YarnConfiguration.YARN_APPLICATION_CLASSPATH`,
but it isn't set. That seems to indicate a broken cluster
configuration, as one would expect clusters that use the current
version of YARN to set that property correctly.

> This comes from the YarnClasspathProvider initialization code, when the property value is not present.
> We've checked the HADOOP_HOME variable and the yarn-site.xml content: both seem to be correct.

So, `yarn-site.xml` contains the property for the `YARN_APPLICATION_CLASSPATH`?

> 1)  DEFAULT_YARN_CROSS_PLATFORM_APPLICATION_CLASSPATH usage:
> I can see that the default classpath property is set to some value, but it is used only if YARN_APPLICATION_CLASSPATH is present, and ignored otherwise.
> Would it make sense to use the default classpath in any case, if it's present?

Tricky question. If the cluster is updated to a version of YARN which
has that property, it is not guaranteed that the
`DEFAULT_YARN_CROSS_PLATFORM_APPLICATION_CLASSPATH` works. You can try
it on your cluster, though.

> 2)  DEFAULT_YARN_APPLICATION_CLASSPATH usage:
> This property is not used at all in YarnClasspathProvider.
> I am not sure if it's initialized in my case, but it sounds like something YarnClasspathProvider should fall back on if YARN_APPLICATION_CLASSPATH is not set.

True. The tricky thing is that that classpath might not work across
different OSs: Windows has a different idea of how to make lists than
other OSs :(. Hence, we opted for the cross-platform defaults.
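
To illustrate the OS difference, here is a small Java sketch of why
`{{VAR}}` placeholders help: the same entries can be rendered with
`$VAR` and `:` on Unix-like systems or `%VAR%` and `;` on Windows.
This shows the idea only and is not Hadoop's or REEF's implementation;
the class and method names (`CrossPlatformClasspath`,
`expandVariables`, `join`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class CrossPlatformClasspath {

  // Matches placeholders of the form {{NAME}} and captures NAME.
  private static final Pattern VAR =
      Pattern.compile("\\{\\{([A-Za-z_][A-Za-z0-9_]*)\\}\\}");

  /** Rewrites {{NAME}} into %NAME% (Windows) or $NAME (other platforms). */
  static String expandVariables(final String entry, final boolean windows) {
    // In a regex replacement, "$1" is the captured name and "\\$" is a literal dollar sign.
    final String replacement = windows ? "%$1%" : "\\$$1";
    return VAR.matcher(entry).replaceAll(replacement);
  }

  /** Expands every entry and joins them with the platform's list separator. */
  static String join(final List<String> entries, final boolean windows) {
    final String separator = windows ? ";" : ":";
    final List<String> expanded = new ArrayList<>();
    for (final String entry : entries) {
      expanded.add(expandVariables(entry, windows));
    }
    return String.join(separator, expanded);
  }

  public static void main(final String[] args) {
    final List<String> entries = Arrays.asList(
        "{{HADOOP_CONF_DIR}}",
        "{{HADOOP_COMMON_HOME}}/share/hadoop/common/*");
    System.out.println("unix:    " + join(entries, false));
    System.out.println("windows: " + join(entries, true));
  }
}
```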

BTW: Neither of the defaults is actually set. They are hard-coded in YARN.

Markus