Posted to user@spark.apache.org by milad bourhani <mi...@gmail.com> on 2016/01/16 15:24:47 UTC

ClassNotFoundException interpreting a Spark job

Hi everyone,

I’m trying to use the Scala interpreter, IMain, to interpret some Scala code that executes a job with Spark:

@Test
public void countToFive() throws ScriptException {
    // Create the SparkContext up front (only one SparkContext may exist per JVM)
    SparkConf conf = new SparkConf().setAppName("Spark interpreter").setMaster("local[2]");
    SparkContext sc = new SparkContext(conf);

    // Set up the Scala interpreter using the Java classpath
    Settings settings = new Settings();
    ((MutableSettings.BooleanSetting) settings.usejavacp()).value_$eq(true);
    IMain interpreter = new IMain(settings);
    interpreter.setContextClassLoader();

    // Expose the existing SparkContext to the interpreted code and run a job
    interpreter.put("sc: org.apache.spark.SparkContext", sc);
    assertEquals(5L, interpreter.eval("sc.parallelize(List(1,2,3,4,5)).map( _ + 1 ).count()"));
}

However, the following error shows up:

java.lang.ClassNotFoundException: $line5.$read$$iw$$iw$$anonfun$1

If the SparkContext object is created after this line:
    interpreter.setContextClassLoader();
then the execution succeeds. The point is that I’d like to create the context once, and then create interpreters from multiple threads on demand later on. This also relates to the fact that there can only be one SparkContext object per JVM (see https://issues.apache.org/jira/browse/SPARK-2243).
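
For completeness, here is the reordered version that does succeed for me (a minimal sketch; the calls are exactly the same as in the test above, only the interpreter is set up before the SparkContext, and the method name is just a renamed copy):

@Test
public void countToFiveInterpreterFirst() throws ScriptException {
    // Set up the interpreter and its context class loader first
    Settings settings = new Settings();
    ((MutableSettings.BooleanSetting) settings.usejavacp()).value_$eq(true);
    IMain interpreter = new IMain(settings);
    interpreter.setContextClassLoader();

    // Only now create the SparkContext; with this ordering the job runs fine
    SparkConf conf = new SparkConf().setAppName("Spark interpreter").setMaster("local[2]");
    SparkContext sc = new SparkContext(conf);

    interpreter.put("sc: org.apache.spark.SparkContext", sc);
    assertEquals(5L, interpreter.eval("sc.parallelize(List(1,2,3,4,5)).map( _ + 1 ).count()"));
}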

It looks as if the class generated for the anonymous function (“_ + 1”) cannot be loaded when the task is deserialized. I’ve fiddled a lot with this and can’t seem to get past it. Can anybody help?
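
In case it’s relevant, this is how I’ve been checking which class loader is in effect at each step (a small sketch using only standard JDK calls; the printed values themselves aren’t shown here):

Settings settings = new Settings();
((MutableSettings.BooleanSetting) settings.usejavacp()).value_$eq(true);

// Capture the thread context class loader before and after the interpreter installs its own
ClassLoader before = Thread.currentThread().getContextClassLoader();

IMain interpreter = new IMain(settings);
interpreter.setContextClassLoader();

ClassLoader after = Thread.currentThread().getContextClassLoader();
System.out.println("context class loader before: " + before);
System.out.println("context class loader after:  " + after);
System.out.println("replaced: " + (before != after));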

Thank you in advance,
Milad