Posted to dev@spark.apache.org by Zhan Zhang <zh...@gmail.com> on 2014/04/17 22:58:34 UTC

Spark REPL question

Please help; I am new to both Spark and Scala.

I am trying to figure out how Spark distributes tasks to workers in the REPL.
I have only found the place where a task is serialized and sent, and where
workers deserialize it and load its class by name through ExecutorClassLoader.
But I could not find how the driver uploads the REPL-generated .class/jar
files to a file server or HDFS. My understanding is that the worker also has
to know the class in order to instantiate the task.

Does anybody know where the code is (file or function name), or is my
understanding wrong?

Thanks.



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-REPL-question-tp6331.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

Re: Spark REPL question

Posted by Zhan Zhang <zh...@gmail.com>.
Clear to me now.

Thanks.




Re: Spark REPL question

Posted by Michael Armbrust <mi...@databricks.com>.
Yeah, I think that is correct.


On Thu, Apr 17, 2014 at 2:47 PM, Zhan Zhang <zh...@gmail.com> wrote:

> Thanks a lot.
>
> By "spins up", do you mean that it serves from the same directory, the
> one specified by the following?
>
>     /** Local directory to save .class files too */
>     val outputDir = {
>       val tmp = System.getProperty("java.io.tmpdir")
>       val rootDir = new SparkConf().get("spark.repl.classdir", tmp)
>       Utils.createTempDir(rootDir)
>     }
>     val virtualDirectory = new PlainFile(outputDir) // "directory" for classfiles
>     /** Jetty server that will serve our classes to worker nodes */
>     val classServer = new HttpServer(outputDir)
>

Re: Spark REPL question

Posted by Zhan Zhang <zh...@gmail.com>.
Thanks a lot.

By "spins up", do you mean that it serves from the same directory, the one
specified by the following?

    /** Local directory to save .class files too */
    val outputDir = {
      val tmp = System.getProperty("java.io.tmpdir")
      val rootDir = new SparkConf().get("spark.repl.classdir", tmp)
      Utils.createTempDir(rootDir)
    }
    val virtualDirectory = new PlainFile(outputDir) // "directory" for classfiles
    /** Jetty server that will serve our classes to worker nodes */
    val classServer = new HttpServer(outputDir)




Re: Spark REPL question

Posted by Michael Armbrust <mi...@databricks.com>.
The REPL spins up an org.apache.spark.HttpServer, which provides classes
that are generated by the REPL as well as jars from addJar.

Michael
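
The mechanism described above can be illustrated with a small standalone
sketch. It is not Spark's code: it uses the JDK's built-in
com.sun.net.httpserver in place of Spark's HttpServer, and the names Greeter,
HttpClassLoader and ClassServerSketch are hypothetical, chosen only for this
example. The "driver" side copies a compiled class file into a temporary
directory and serves that directory over HTTP; the "worker" side fetches the
bytes by class name and defines the class locally, which is roughly the role
ExecutorClassLoader plays.

```scala
import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}
import java.io.{ByteArrayOutputStream, IOException, InputStream}
import java.net.{InetSocketAddress, URL}
import java.nio.file.{Files, Path}

// Hypothetical stand-in for a class the REPL would generate on the driver.
class Greeter { def greet(): String = "hello from a fetched class" }

// Hypothetical loader: fetches <baseUrl>/<pkg>/<Name>.class over HTTP.
class HttpClassLoader(baseUrl: String, parent: ClassLoader) extends ClassLoader(parent) {
  private def readAll(in: InputStream): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val buf = new Array[Byte](8192)
    var n = in.read(buf)
    while (n != -1) { out.write(buf, 0, n); n = in.read(buf) }
    out.toByteArray
  }

  // Made public here so the sketch can bypass parent delegation on purpose.
  override def findClass(name: String): Class[_] = {
    val url = new URL(baseUrl + "/" + name.replace('.', '/') + ".class")
    val in = try url.openStream() catch {
      case e: IOException => throw new ClassNotFoundException(name, e)
    }
    try {
      val bytes = readAll(in)
      defineClass(name, bytes, 0, bytes.length)
    } finally in.close()
  }
}

object ClassServerSketch {
  // "Driver" side: serve the files under `dir` on an ephemeral local port.
  private def startServer(dir: Path): HttpServer = {
    val server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0)
    server.createContext("/", new HttpHandler {
      override def handle(ex: HttpExchange): Unit = {
        val file = dir.resolve(ex.getRequestURI.getPath.stripPrefix("/"))
        try {
          if (Files.isRegularFile(file)) {
            val bytes = Files.readAllBytes(file)
            ex.sendResponseHeaders(200, bytes.length.toLong)
            ex.getResponseBody.write(bytes)
          } else ex.sendResponseHeaders(404, -1L)
        } finally ex.close()
      }
    })
    server.start()
    server
  }

  // Returns the greeting and whether the class was defined by the HTTP loader.
  def run(): (String, Boolean) = {
    val name = classOf[Greeter].getName
    val rel = name.replace('.', '/') + ".class"
    // Stand-in for the REPL output directory: copy the compiled class file there.
    val dir = Files.createTempDirectory("repl-classes")
    val target = dir.resolve(rel)
    Files.createDirectories(target.getParent)
    val in = classOf[Greeter].getClassLoader.getResourceAsStream(rel)
    try Files.copy(in, target) finally in.close()

    val server = startServer(dir)
    try {
      // "Worker" side: fetch the class by name over HTTP and instantiate it.
      val loader = new HttpClassLoader(
        s"http://127.0.0.1:${server.getAddress.getPort}", getClass.getClassLoader)
      val cls = loader.findClass(name)
      val obj = cls.getDeclaredConstructor().newInstance().asInstanceOf[AnyRef]
      val msg = cls.getMethod("greet").invoke(obj).asInstanceOf[String]
      (msg, cls.getClassLoader eq loader)
    } finally server.stop(0)
  }
}
```

Calling ClassServerSketch.run() returns the greeting together with a flag that
is true when the class was defined by the HTTP-backed loader rather than found
on the local classpath. The sketch assumes the compiled Greeter.class is
readable as a classpath resource; a real worker would go through normal parent
delegation instead of calling findClass directly.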


On Thu, Apr 17, 2014 at 1:58 PM, Zhan Zhang <zh...@gmail.com> wrote:

> Please help; I am new to both Spark and Scala.
>
> I am trying to figure out how Spark distributes tasks to workers in the
> REPL. I have only found the place where a task is serialized and sent, and
> where workers deserialize it and load its class by name through
> ExecutorClassLoader. But I could not find how the driver uploads the
> REPL-generated .class/jar files to a file server or HDFS. My understanding
> is that the worker also has to know the class in order to instantiate the
> task.
>
> Does anybody know where the code is (file or function name), or is my
> understanding wrong?
>
> Thanks.