Posted to user@spark.apache.org by pcsenthil <pc...@gmail.com> on 2014/09/02 16:02:48 UTC

Spark Java Configuration.

Team,

I am new to Apache Spark and I don't have much knowledge of Hadoop or big
data. I need clarification on the following.

How does Spark configuration work? From a tutorial I got the code below:

    SparkConf conf = new SparkConf().setAppName("Simple application")
                                    .setMaster("local[4]");
    JavaSparkContext java_SC = new JavaSparkContext(conf);

From this, I understood that we are providing the configuration to Spark
through the Java program.
Let us assume I have written this in a separate Java method.

My questions are:

What happens if I keep calling this method?
If it keeps creating a new Spark object on each call, how are we going to
handle the JVM memory, since under each object I am trying to run 4
concurrent threads?
Is there any option to find an existing one in the JVM, so that instead of
creating a new Spark object I can reuse it?

Please help me with this.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Java-Configuration-tp13269.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Spark Java Configuration.

Posted by Yana Kadiyska <ya...@gmail.com>.
JavaSparkContext java_SC = new JavaSparkContext(conf); creates the Spark
context. An application has a single SparkContext: you won't be able to
"keep calling this", and you'll see an error if you try to create a second
such object from the same application.
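
If you want to reuse one context from several methods, a common pattern is
to create it once and hand the same instance out everywhere. A minimal
sketch (the SparkContextHolder class below is a hypothetical helper, not
part of the Spark API):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    // Hypothetical helper: holds a single JavaSparkContext for the whole JVM.
    public final class SparkContextHolder {
        private static JavaSparkContext context;

        private SparkContextHolder() {}

        // Lazily create the context on first use; later calls reuse it.
        public static synchronized JavaSparkContext get() {
            if (context == null) {
                SparkConf conf = new SparkConf()
                        .setAppName("Simple application")
                        .setMaster("local[4]");
                context = new JavaSparkContext(conf);
            }
            return context;
        }
    }

Every method that needs Spark then calls SparkContextHolder.get() instead
of constructing a new JavaSparkContext, so only one context (and one set of
local[4] threads) exists per JVM.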

Additionally, depending on your configuration, if you create a few
different apps that each create a SparkContext, you'll see them all
connected to the master in the UI. But they'll have to share executors on
the worker machines you have available. You'll often see messages like "No
resources available" if you are trying to run more than one app
concurrently and the first app you start is "resource greedy".
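
One way to keep a first app from grabbing every core on a standalone
cluster is to cap it via spark.cores.max (a standard Spark property; the
value 2 below is just an illustration):

    // Limit this app's total cores so other apps can still get executors.
    SparkConf conf = new SparkConf()
            .setAppName("Simple application")
            .set("spark.cores.max", "2");
    JavaSparkContext java_SC = new JavaSparkContext(conf);

With a cap like that in place, several such apps can hold executors on the
same workers at once instead of queuing behind the greedy one.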

Hope this helps.


On Tue, Sep 2, 2014 at 10:02 AM, pcsenthil <pc...@gmail.com> wrote:

> [original message quoted in full; trimmed]