Posted to dev@crunch.apache.org by "Micah Whitacre (JIRA)" <ji...@apache.org> on 2014/08/29 23:31:53 UTC

[jira] [Created] (CRUNCH-466) Occasional Spark Test failures due to Future Timeouts

Micah Whitacre created CRUNCH-466:
-------------------------------------

             Summary: Occasional Spark Test failures due to Future Timeouts
                 Key: CRUNCH-466
                 URL: https://issues.apache.org/jira/browse/CRUNCH-466
             Project: Crunch
          Issue Type: Bug
          Components: Core
            Reporter: Micah Whitacre
            Assignee: Josh Wills


When building master and the 0.11 RC on one of my devices, I started getting sporadic test failures.  The failing test changed between runs.  The error seems to be related to Spark starting up for the tests rather than anything wrong with our code.

Here is an example of one of the failures...
{quote}
14/08/29 16:16:17 INFO Remoting: Starting remoting
14/08/29 16:16:27 ERROR Remoting: Remoting error: [Startup timed out] [
akka.remote.RemoteTransportException: Startup timed out
	at akka.remote.Remoting.akka$remote$Remoting$$notifyError(Remoting.scala:129)
	at akka.remote.Remoting.start(Remoting.scala:191)
	at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
	at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:579)
	at akka.actor.ActorSystemImpl._start(ActorSystem.scala:577)
	at akka.actor.ActorSystemImpl.start(ActorSystem.scala:588)
	at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
	at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
	at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:104)
	at org.apache.spark.SparkEnv$.create(SparkEnv.scala:152)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:202)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:53)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:67)
	at org.apache.crunch.impl.spark.SparkPipeline.runAsync(SparkPipeline.java:137)
	at org.apache.crunch.impl.spark.SparkPipeline.run(SparkPipeline.java:110)
	at org.apache.crunch.materialize.MaterializableIterable.iterator(MaterializableIterable.java:94)
	at com.google.common.collect.Lists.newArrayList(Lists.java:125)
	at org.apache.crunch.SparkAggregatorIT.testCount(SparkAggregatorIT.java:43)
{quote}

If we changed the tests to specify a SparkConf, we should be able to increase akka.actor.timeout.  I also saw a few posts about Akka having trouble when it spins up a lot of actors.  I haven't looked into Spark's testing framework, but consolidating startup/shutdown to the beginning and end of a suite might help.
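
To make the consolidation idea concrete, here is a rough sketch of sharing one expensive context across several tests instead of starting a new one per test. It is only an illustration: SharedResource, FakeContext and SharedContextDemo are hypothetical names, not part of Crunch or Spark, and FakeContext stands in for whatever object would wrap the Spark context in the real suite.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical helper: creates an expensive resource on first use and hands
 * the same instance to every caller until close() is invoked.
 */
final class SharedResource<T extends AutoCloseable> implements AutoCloseable {
  private final Supplier<T> factory;
  private T instance;

  SharedResource(Supplier<T> factory) {
    this.factory = factory;
  }

  /** Returns the shared instance, creating it only if none exists yet. */
  synchronized T get() {
    if (instance == null) {
      instance = factory.get();
    }
    return instance;
  }

  /** Closes the shared instance if one was created; a later get() starts a new one. */
  @Override
  public synchronized void close() {
    if (instance == null) {
      return;
    }
    try {
      instance.close();
    } catch (Exception e) {
      throw new IllegalStateException("Failed to close shared resource", e);
    } finally {
      instance = null;
    }
  }
}

/** Stand-in for an expensive context (assumption); counts starts and stops. */
final class FakeContext implements AutoCloseable {
  static final AtomicInteger STARTS = new AtomicInteger();
  static final AtomicInteger STOPS = new AtomicInteger();

  FakeContext() {
    STARTS.incrementAndGet();
  }

  @Override
  public void close() {
    STOPS.incrementAndGet();
  }
}

final class SharedContextDemo {
  public static void main(String[] args) {
    SharedResource<FakeContext> shared = new SharedResource<>(FakeContext::new);

    // Three "tests" ask for the context; only the first call starts it.
    for (int i = 0; i < 3; i++) {
      shared.get();
    }

    // Shut down once, at the end of the "suite".
    shared.close();

    System.out.println("starts=" + FakeContext.STARTS.get()
        + " stops=" + FakeContext.STOPS.get());
  }
}
```

In a JUnit suite the get() call would presumably sit in a class-level setup method and close() in the matching teardown, so the actor system starts once per suite rather than once per test. Whether the existing integration tests can share a context this way is something I have not checked.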


