Posted to issues@spark.apache.org by "Xiangrui Meng (JIRA)" <ji...@apache.org> on 2015/05/14 20:59:59 UTC

[jira] [Commented] (SPARK-7642) Missing 1 worker on standalone clusters.

    [ https://issues.apache.org/jira/browse/SPARK-7642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544196#comment-14544196 ] 

Xiangrui Meng commented on SPARK-7642:
--------------------------------------

It seems that I'm using an old slave node that doesn't have the right Hadoop binaries, so it works in 1.3 but not in 1.4. I don't know why spark-ec2 didn't update.

{code}
15/05/14 18:55:19 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:139)
	at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:231)
	at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:55)
	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:182)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:235)
	at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:249)
	at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:43)
	at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:250)
	at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
	... 3 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:129)
	... 10 more
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative()V
	at org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative(Native Method)
	at org.apache.hadoop.security.JniBasedUnixGroupsMapping.<clinit>(JniBasedUnixGroupsMapping.java:49)
	at org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.<init>(JniBasedUnixGroupsMappingWithFallback.java:38)
	... 15 more
{code}

> Missing 1 worker on standalone clusters.
> ----------------------------------------
>
>                 Key: SPARK-7642
>                 URL: https://issues.apache.org/jira/browse/SPARK-7642
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, Spark Core
>    Affects Versions: 1.4.0
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>            Priority: Blocker
>         Attachments: 1.3 data distribution.png, 1.4 data distribution.png
>
>
> Saw this weird issue during a performance test. I have a 16-node (plus 1 master) standalone cluster on EC2. I saw 16 workers on the master page (:8080). When I ran a job in 1.3, the executor tab showed 17 executors (including the driver). In 1.4, however, this number becomes 16 (also including the driver), so one worker is missing from the cluster in 1.4.


