Posted to user@spark.apache.org by Anton Brazhnyk <an...@genesys.com> on 2014/12/18 03:23:56 UTC

SPARK-2243 Support multiple SparkContexts in the same JVM

Greetings,

The first comment on the issue says the reason multiple contexts are not supported is:
"There are numerous assumptions in the code base that uses a shared cache or thread local variables or some global identifiers which prevent us from using multiple SparkContext's."

Could this be worked around by creating those contexts in several classloaders, each with its own copy of the Spark classes?
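
What I had in mind is roughly the sketch below. It is reflection-only, so the
host application never links against any one copy of Spark; the loader setup
and the assembly jar path are placeholders, not working code:

import java.net.{URL, URLClassLoader}

// Sketch only: each context lives in a loader that does NOT delegate to
// a parent for Spark/Scala classes, so statics would be per-context.
// The assembly jar path is a placeholder.
val sparkJars = Array(new URL("file:/opt/spark/lib/spark-assembly.jar"))

def newIsolatedContext(name: String): AnyRef = {
  val loader  = new URLClassLoader(sparkJars, null) // null parent = no delegation
  val confCls = loader.loadClass("org.apache.spark.SparkConf")
  val conf    = confCls.newInstance().asInstanceOf[AnyRef]
  confCls.getMethod("setMaster", classOf[String]).invoke(conf, "local[2]")
  confCls.getMethod("setAppName", classOf[String]).invoke(conf, name)
  val scCls = loader.loadClass("org.apache.spark.SparkContext")
  scCls.getConstructor(confCls).newInstance(conf).asInstanceOf[AnyRef]
}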

Thanks,
Anton

Re: SPARK-2243 Support multiple SparkContexts in the same JVM

Posted by Marcelo Vanzin <va...@cloudera.com>.
Hi Anton,

That could solve some of the issues (I've played with that a little
bit). But there are still areas where this would be sub-optimal,
because Spark still uses system properties in some places, and those
are global to the JVM, not per classloader.

(SparkSubmit is the biggest offender here, but if you're running
multiple contexts in the same VM you're probably not using
SparkSubmit. The rest of the code is a lot better, but I wouldn't
count on it being 100% safe.)
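
A trivial illustration of why classloaders don't help there
(spark.local.dir is just an example of a property Spark reads):

// System properties are one JVM-wide table; which classloader defined
// the calling class is irrelevant. If "Spark copy A" does this...
System.setProperty("spark.local.dir", "/tmp/ctx-a")
// ...then code loaded by a completely separate classloader still sees it:
assert(System.getProperty("spark.local.dir") == "/tmp/ctx-a")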


On Wed, Dec 17, 2014 at 6:23 PM, Anton Brazhnyk
<an...@genesys.com> wrote:
> Greetings,
>
>
>
> The first comment on the issue says the reason multiple contexts are not
> supported is:
> “There are numerous assumptions in the code base that uses a shared cache or
> thread local variables or some global identifiers
> which prevent us from using multiple SparkContext's.”
>
>
>
> Could this be worked around by creating those contexts in several
> classloaders, each with its own copy of the Spark classes?
>
>
>
> Thanks,
>
> Anton



-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


RE: SPARK-2243 Support multiple SparkContexts in the same JVM

Posted by Anton Brazhnyk <an...@genesys.com>.
Well, that's actually what I need (one simple app, several contexts, similar to what JobServer does), and I'm just looking for a workaround here. Classloaders look a little easier to me than spawning my own processes.
To be more specific, I need to execute arbitrary Spark jobs from a long-lived web application with no prior knowledge of those jobs, so I have to accept jars containing them (again, like JobServer).

As far as I understand, I can't load new jars into a SparkContext that has already spawned executors on the cluster; I'd have to create a new context to pick up new jars. Am I right about this?
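
For what it's worth, the shape I'm considering is below. SparkJob and
runJob are my own hypothetical names (not an existing API), and the master
URL is a placeholder:

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical contract that submitted job jars would implement.
trait SparkJob { def run(sc: SparkContext): Unit }

def runJob(jobJar: String, jobClass: String): Unit = {
  val sc = new SparkContext(
    new SparkConf().setMaster("spark://master:7077").setAppName("job-" + jobClass))
  try {
    sc.addJar(jobJar) // ship the job jar to this context's executors
    val loader = new java.net.URLClassLoader(
      Array(new java.io.File(jobJar).toURI.toURL), getClass.getClassLoader)
    loader.loadClass(jobClass).newInstance().asInstanceOf[SparkJob].run(sc)
  } finally {
    sc.stop() // one context per job, so tear it down afterwards
  }
}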


-----Original Message-----
From: Sean Owen [mailto:sowen@cloudera.com] 
Sent: Thursday, December 18, 2014 2:04 AM
To: Anton Brazhnyk
Cc: user@spark.apache.org
Subject: Re: SPARK-2243 Support multiple SparkContexts in the same JVM

Yes, although once you have multiple ClassLoaders, you are operating as if in multiple JVMs for most intents and purposes. I think the request for this kind of functionality comes from use cases where multiple ClassLoaders wouldn't work, such as wanting to have one app (in one ClassLoader) manage multiple contexts.

On Thu, Dec 18, 2014 at 2:23 AM, Anton Brazhnyk <an...@genesys.com> wrote:
> Greetings,
>
>
>
> The first comment on the issue says the reason multiple contexts are not
> supported is: “There are numerous assumptions in the code base that uses
> a shared cache or thread local variables or some global identifiers which
> prevent us from using multiple SparkContext's.”
>
>
>
> Could this be worked around by creating those contexts in several
> classloaders, each with its own copy of the Spark classes?
>
>
>
> Thanks,
>
> Anton


Re: SPARK-2243 Support multiple SparkContexts in the same JVM

Posted by Sean Owen <so...@cloudera.com>.
Yes, although once you have multiple ClassLoaders, you are operating
as if in multiple JVMs for most intents and purposes. I think the
request for this kind of functionality comes from use cases where
multiple ClassLoaders wouldn't work, such as wanting to have one app
(in one ClassLoader) manage multiple contexts.
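
In other words, what the JIRA is really asking for is that something like
this just work, with no classloader tricks:

import org.apache.spark.{SparkConf, SparkContext}

// One app, one classloader, two live contexts. Today the second
// constructor call is where the shared-state assumptions bite.
val sc1 = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("ctx-1"))
val sc2 = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("ctx-2")) // unsupported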

On Thu, Dec 18, 2014 at 2:23 AM, Anton Brazhnyk
<an...@genesys.com> wrote:
> Greetings,
>
>
>
> The first comment on the issue says the reason multiple contexts are not
> supported is:
> “There are numerous assumptions in the code base that uses a shared cache or
> thread local variables or some global identifiers
> which prevent us from using multiple SparkContext's.”
>
>
>
> Could this be worked around by creating those contexts in several
> classloaders, each with its own copy of the Spark classes?
>
>
>
> Thanks,
>
> Anton

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org