You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spark.apache.org by innowireless TaeYun Kim <ta...@innowireless.co.kr> on 2014/05/13 08:17:46 UTC

Is this supported? : Spark on Windows, Hadoop YARN on Linux.

I'm trying to run spark-shell on Windows that uses Hadoop YARN on Linux.
Specifically, the environment is as follows:

- Client
  - OS: Windows 7
  - Spark version: 1.0.0-SNAPSHOT (git cloned 2014.5.8)
- Server
  - Platform: hortonworks sandbox 2.1

I has to modify the spark source code to apply
https://issues.apache.org/jira/browse/YARN-1824, so that the cross-platform
issues can be addressed. (that is, change $() to $$(), File.pathSeparator to
ApplicationConstants.CLASS_PATH_SEPARATOR).
Seeing this, I suspect that the Spark code for now is not prepared to
support cross-platform submit, that is, Spark on Windows -> Hadoop YARN on
Linux.

Anyways, after the modification and some configuration tweak, at least the
yarn-client mode spark-shell submitted from Windows 7 seems to try to start.
But the ApplicationManager fails to register.
Yarn server log is as follows: ('owner' is the user name of the Windows 7
machine.)

Log Type: stderr
Log Length: 1356
log4j:WARN No appenders could be found for logger
(org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
more info.
14/05/12 01:13:54 INFO YarnSparkHadoopUtil: Using Spark's default log4j
profile: org/apache/spark/log4j-defaults.properties
14/05/12 01:13:54 INFO SecurityManager: Changing view acls to: yarn,owner
14/05/12 01:13:54 INFO SecurityManager: SecurityManager: authentication
disabled; ui acls disabled; users with view permissions: Set(yarn, owner)
14/05/12 01:13:55 INFO Slf4jLogger: Slf4jLogger started
14/05/12 01:13:56 INFO Remoting: Starting remoting
14/05/12 01:13:56 INFO Remoting: Remoting started; listening on addresses
:[akka.tcp://sparkYarnAM@sandbox.hortonworks.com:47074]
14/05/12 01:13:56 INFO Remoting: Remoting now listens on addresses:
[akka.tcp://sparkYarnAM@sandbox.hortonworks.com:47074]
14/05/12 01:13:56 INFO RMProxy: Connecting to ResourceManager at
/0.0.0.0:8030
14/05/12 01:13:56 INFO ExecutorLauncher: ApplicationAttemptId:
appattempt_1399856448891_0018_000001
14/05/12 01:13:56 INFO ExecutorLauncher: Registering the ApplicationMaster
14/05/12 01:13:56 WARN Client: Exception encountered while connecting to the
server : org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN]

How can I handle this error?
Or, should I give up and use Linux for my client machine?
(I want to use Windows for client, since for me it's more comfortable to
develop applications.)
BTW, I'm a newbie for Spark and Hadoop.

Thanks in advance.