Posted to user@spark.apache.org by Gourav Sengupta <go...@gmail.com> on 2016/02/27 01:39:51 UTC

Starting SPARK application in cluster mode from an IDE

Hi,

The problem description is given below - why am I able to create the SPARK
application using Python but not SCALA (using an IDE like IntelliJ or
Eclipse)?

SPARK Environment:
-----------------------------
SPARK Version: 1.6.0
OS: MAC OS X 10.11.3
IDE:  IntelliJ


Created an SBT project in IntelliJ using the details on this page:
---------------------------------------------------------------------------------
http://spark.apache.org/docs/latest/quick-start.html


The following SCALA code fails to create an application in a locally
running SPARK cluster (set up by running ./sbin/start-master.sh and
./sbin/start-slaves.sh):

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf


object test {
  def main(args: Array[String]) {
    //the below line returns nothing (jarOfClass gives None when the class is
    //loaded from the IDE's compiled output directory rather than from a jar)
    println(SparkContext.jarOfClass(this.getClass).toString())
    val logFile = "/tmp/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("IdeaProjects").setMaster("spark://systemhostname:7077")
    //val conf = new SparkConf().setAppName("IdeaProjects").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
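
The failure on the cluster is the ClassNotFoundException on the anonymous
function classes described in the Stack Overflow link at the end of this
mail. The usual explanation is that a driver started straight from the IDE
never ships its compiled classes to the executors, so the closures passed to
filter() cannot be deserialized on the workers. A minimal sketch of one
workaround, assuming the project has first been packaged with sbt package
and using a hypothetical jar path that has to be adjusted to the actual
build output:

import org.apache.spark.{SparkConf, SparkContext}

object test {
  def main(args: Array[String]) {
    // Hypothetical path to the jar produced by sbt package; adjust to your project
    val appJar = "target/scala-2.10/ideaprojects_2.10-1.0.jar"

    val conf = new SparkConf()
      .setAppName("IdeaProjects")
      .setMaster("spark://systemhostname:7077")
      .setJars(Seq(appJar)) // ship the application classes to the executors

    val sc = new SparkContext(conf)
    val logData = sc.textFile("/tmp/README.md", 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
    sc.stop()
  }
}

With the jar listed in the conf, the workers download it and can then load
the test$$anonfun classes that the filter() closures compile to.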


The following code runs fine
--------------------------------------

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf


object test {
  def main(args: Array[String]) {
    //the below line returns nothing
    println(SparkContext.jarOfClass(this.getClass).toString())
    val logFile = "/tmp/README.md" // Should be some file on your system
    //val conf = new SparkConf().setAppName("IdeaProjects").setMaster("spark://systemhostname:7077")
    val conf = new SparkConf().setAppName("IdeaProjects").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}


Creating an application using Python, however, works fine, as the following
code runs without any problem:
-----------------------------------------------------------------------------------------------------------------------
from pyspark import SparkConf, SparkContext
conf = (SparkConf()
        .setMaster("spark://systemhostname:7077")
        .setAppName("test")
        .set("spark.executor.memory", "1g")
        .set("spark.executor.cores", "2"))
conf.getAll()
sc = SparkContext(conf = conf)
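
For comparison, the PySpark snippet does not need a jar at all: the Python
closures are pickled on the driver and shipped to the executors at run time,
while a Scala driver started from the IDE has to distribute its own compiled
classes. Besides setJars on the SparkConf as sketched above, the jar
(hypothetical path, built with sbt package) can also be registered on the
Scala SparkContext from the earlier examples:

// Hypothetical jar path; adjust to the actual sbt build output
sc.addJar("target/scala-2.10/ideaprojects_2.10-1.0.jar")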


Further description of and links to this issue are available here:
http://stackoverflow.com/questions/33222045/classnotfoundexception-anonfun-when-deploy-scala-code-to-spark


Thanks and Regards,
Gourav Sengupta

Fwd: Starting SPARK application in cluster mode from an IDE

Posted by Gourav Sengupta <go...@gmail.com>.
Hi,

I would be grateful if someone could kindly respond to this query.


Thanks and Regards,
Gourav Sengupta
