Posted to user@spark.apache.org by Jia Zou <ja...@gmail.com> on 2016/01/17 16:29:05 UTC

Reuse Executor JVM across different JobContext

Dear all,

Is there a way to reuse executor JVM across different JobContexts? Thanks.

Best Regards,
Jia

Re: Reuse Executor JVM across different JobContext

Posted by Gene Pang <ge...@gmail.com>.
Yes, you can share RDDs with Tachyon, while keeping the data in memory.
Spark jobs can write to a Tachyon path (tachyon://host:port/path/) and
other jobs can read from the same path.
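For illustration, a minimal sketch of that hand-off in Scala; the Tachyon
master host/port and the paths below are placeholders, not taken from this
thread:

    // Application 1: materialize an RDD to Tachyon so it outlives this application
    val events = sc.textFile("hdfs:///input/events")
    events.saveAsTextFile("tachyon://tachyon-master:19998/shared/events")

    // Application 2 (a separate SparkContext, possibly started much later):
    // read the same Tachyon path back into a new RDD
    val shared = sc.textFile("tachyon://tachyon-master:19998/shared/events")
    println(shared.count())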

Here is a presentation that includes that use case:
http://www.slideshare.net/TachyonNexus/tachyon-presentation-at-ampcamp-6-november-2015

Thanks,
Gene

On Sun, Jan 17, 2016 at 1:56 PM, Mark Hamstra <ma...@clearstorydata.com>
wrote:

> Yes, that is one of the basic reasons to use a
> jobserver/shared-SparkContext.  Otherwise, in order to share the data in an
> RDD you have to use an external storage system, such as a distributed
> filesystem or Tachyon.
>
> On Sun, Jan 17, 2016 at 1:52 PM, Jia <ja...@gmail.com> wrote:
>
>> Thanks, Mark. Then, I guess JobServer can fundamentally solve my problem,
>> so that jobs can be submitted at different times and still share RDDs.
>>
>> Best Regards,
>> Jia
>>
>>
>> On Jan 17, 2016, at 3:44 PM, Mark Hamstra <ma...@clearstorydata.com>
>> wrote:
>>
>> There is a 1-to-1 relationship between Spark Applications and
>> SparkContexts -- fundamentally, a Spark Application is a program that
>> creates and uses a SparkContext, and that SparkContext is destroyed when
>> the Application ends.  A jobserver generically and the Spark JobServer
>> specifically is an Application that keeps a SparkContext open for a long
>> time and allows many Jobs to be submitted and run using that shared
>> SparkContext.
>>
>> More than one Application/SparkContext unavoidably implies more than one
>> JVM process per Worker -- Applications/SparkContexts cannot share JVM
>> processes.
>>
>> On Sun, Jan 17, 2016 at 1:15 PM, Jia <ja...@gmail.com> wrote:
>>
>>> Hi, Mark, sorry for the confusion.
>>>
>>> Let me clarify, when an application is submitted, the master will tell
>>> each Spark worker to spawn an executor JVM process. All the task sets of
>>> the application will be executed by the executor. After the application
>>> runs to completion, the executor process will be killed.
>>> But I hope that all applications submitted can run in the same executor,
>>> can JobServer do that? If so, it’s really good news!
>>>
>>> Best Regards,
>>> Jia
>>>
>>> On Jan 17, 2016, at 3:09 PM, Mark Hamstra <ma...@clearstorydata.com>
>>> wrote:
>>>
>>> You've still got me confused.  The SparkContext exists at the Driver,
>>> not on an Executor.
>>>
>>> Many Jobs can be run by a SparkContext -- it is a common pattern to use
>>> something like the Spark Jobserver where all Jobs are run through a shared
>>> SparkContext.
>>>
>>> On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou <ja...@gmail.com>
>>> wrote:
>>>
>>>> Hi, Mark, sorry, I mean SparkContext.
>>>> I mean to change Spark into running all submitted jobs (SparkContexts)
>>>> in one executor JVM.
>>>>
>>>> Best Regards,
>>>> Jia
>>>>
>>>> On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <ma...@clearstorydata.com>
>>>> wrote:
>>>>
>>>>> -dev
>>>>>
>>>>> What do you mean by JobContext?  That is a Hadoop mapreduce concept,
>>>>> not Spark.
>>>>>
>>>>> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <ja...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> Is there a way to reuse executor JVM across different JobContexts?
>>>>>> Thanks.
>>>>>>
>>>>>> Best Regards,
>>>>>> Jia
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>

Re: Reuse Executor JVM across different JobContext

Posted by Jia <ja...@gmail.com>.
Hi, Praveen, have you checked out this? It might have the details you need:
https://spark-summit.org/2014/wp-content/uploads/2014/07/Spark-Job-Server-Easy-Spark-Job-Management-Chan-Chu.pdf

Best Regards,
Jia


On Jan 19, 2016, at 7:28 AM, praveen S <my...@gmail.com> wrote:

> Can you give me more details on Spark's jobserver?
> 
> Regards, 
> Praveen
> 
> On 18 Jan 2016 03:30, "Jia" <ja...@gmail.com> wrote:
> I guess all jobs submitted through JobServer are executed in the same JVM, so RDDs cached by one job can be visible to all other jobs executed later.
> On Jan 17, 2016, at 3:56 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
> 
>> Yes, that is one of the basic reasons to use a jobserver/shared-SparkContext.  Otherwise, in order to share the data in an RDD you have to use an external storage system, such as a distributed filesystem or Tachyon.
>> 
>> On Sun, Jan 17, 2016 at 1:52 PM, Jia <ja...@gmail.com> wrote:
>> Thanks, Mark. Then, I guess JobServer can fundamentally solve my problem, so that jobs can be submitted at different times and still share RDDs.
>> 
>> Best Regards,
>> Jia
>> 
>> 
>> On Jan 17, 2016, at 3:44 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
>> 
>>> There is a 1-to-1 relationship between Spark Applications and SparkContexts -- fundamentally, a Spark Application is a program that creates and uses a SparkContext, and that SparkContext is destroyed when the Application ends.  A jobserver generically and the Spark JobServer specifically is an Application that keeps a SparkContext open for a long time and allows many Jobs to be submitted and run using that shared SparkContext.
>>> 
>>> More than one Application/SparkContext unavoidably implies more than one JVM process per Worker -- Applications/SparkContexts cannot share JVM processes.  
>>> 
>>> On Sun, Jan 17, 2016 at 1:15 PM, Jia <ja...@gmail.com> wrote:
>>> Hi, Mark, sorry for the confusion.
>>> 
>>> Let me clarify, when an application is submitted, the master will tell each Spark worker to spawn an executor JVM process. All the task sets of the application will be executed by the executor. After the application runs to completion, the executor process will be killed.
>>> But I hope that all applications submitted can run in the same executor, can JobServer do that? If so, it’s really good news!
>>> 
>>> Best Regards,
>>> Jia
>>> 
>>> On Jan 17, 2016, at 3:09 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
>>> 
>>>> You've still got me confused.  The SparkContext exists at the Driver, not on an Executor.
>>>> 
>>>> Many Jobs can be run by a SparkContext -- it is a common pattern to use something like the Spark Jobserver where all Jobs are run through a shared SparkContext.
>>>> 
>>>> On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou <ja...@gmail.com> wrote:
>>>> Hi, Mark, sorry, I mean SparkContext.
>>>> I mean to change Spark into running all submitted jobs (SparkContexts) in one executor JVM.
>>>> 
>>>> Best Regards,
>>>> Jia
>>>> 
>>>> On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
>>>> -dev
>>>> 
>>>> What do you mean by JobContext?  That is a Hadoop mapreduce concept, not Spark.
>>>> 
>>>> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <ja...@gmail.com> wrote:
>>>> Dear all,
>>>> 
>>>> Is there a way to reuse executor JVM across different JobContexts? Thanks.
>>>> 
>>>> Best Regards,
>>>> Jia
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 


Re: Reuse Executor JVM across different JobContext

Posted by praveen S <my...@gmail.com>.
Can you give me more details on Spark's jobserver?

Regards,
Praveen
On 18 Jan 2016 03:30, "Jia" <ja...@gmail.com> wrote:

> I guess all jobs submitted through JobServer are executed in the same JVM,
> so RDDs cached by one job can be visible to all other jobs executed later.
> On Jan 17, 2016, at 3:56 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
>
> Yes, that is one of the basic reasons to use a
> jobserver/shared-SparkContext.  Otherwise, in order to share the data in an
> RDD you have to use an external storage system, such as a distributed
> filesystem or Tachyon.
>
> On Sun, Jan 17, 2016 at 1:52 PM, Jia <ja...@gmail.com> wrote:
>
>> Thanks, Mark. Then, I guess JobServer can fundamentally solve my problem,
>> so that jobs can be submitted at different times and still share RDDs.
>>
>> Best Regards,
>> Jia
>>
>>
>> On Jan 17, 2016, at 3:44 PM, Mark Hamstra <ma...@clearstorydata.com>
>> wrote:
>>
>> There is a 1-to-1 relationship between Spark Applications and
>> SparkContexts -- fundamentally, a Spark Application is a program that
>> creates and uses a SparkContext, and that SparkContext is destroyed when
>> the Application ends.  A jobserver generically and the Spark JobServer
>> specifically is an Application that keeps a SparkContext open for a long
>> time and allows many Jobs to be submitted and run using that shared
>> SparkContext.
>>
>> More than one Application/SparkContext unavoidably implies more than one
>> JVM process per Worker -- Applications/SparkContexts cannot share JVM
>> processes.
>>
>> On Sun, Jan 17, 2016 at 1:15 PM, Jia <ja...@gmail.com> wrote:
>>
>>> Hi, Mark, sorry for the confusion.
>>>
>>> Let me clarify, when an application is submitted, the master will tell
>>> each Spark worker to spawn an executor JVM process. All the task sets of
>>> the application will be executed by the executor. After the application
>>> runs to completion, the executor process will be killed.
>>> But I hope that all applications submitted can run in the same executor,
>>> can JobServer do that? If so, it’s really good news!
>>>
>>> Best Regards,
>>> Jia
>>>
>>> On Jan 17, 2016, at 3:09 PM, Mark Hamstra <ma...@clearstorydata.com>
>>> wrote:
>>>
>>> You've still got me confused.  The SparkContext exists at the Driver,
>>> not on an Executor.
>>>
>>> Many Jobs can be run by a SparkContext -- it is a common pattern to use
>>> something like the Spark Jobserver where all Jobs are run through a shared
>>> SparkContext.
>>>
>>> On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou <ja...@gmail.com>
>>> wrote:
>>>
>>>> Hi, Mark, sorry, I mean SparkContext.
>>>> I mean to change Spark into running all submitted jobs (SparkContexts)
>>>> in one executor JVM.
>>>>
>>>> Best Regards,
>>>> Jia
>>>>
>>>> On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <ma...@clearstorydata.com>
>>>> wrote:
>>>>
>>>>> -dev
>>>>>
>>>>> What do you mean by JobContext?  That is a Hadoop mapreduce concept,
>>>>> not Spark.
>>>>>
>>>>> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <ja...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> Is there a way to reuse executor JVM across different JobContexts?
>>>>>> Thanks.
>>>>>>
>>>>>> Best Regards,
>>>>>> Jia
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>
>

Re: Reuse Executor JVM across different JobContext

Posted by Jia <ja...@gmail.com>.
I guess all jobs submitted through JobServer are executed in the same JVM, so RDDs cached by one job can be visible to all other jobs executed later.
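A rough sketch of that pattern using spark-jobserver's named-RDD support
(the package, trait, and method names below are from memory of the 0.6-era
API and may differ in other versions):

    import com.typesafe.config.Config
    import org.apache.spark.SparkContext
    import spark.jobserver.{NamedRddSupport, SparkJob, SparkJobValid, SparkJobValidation}

    // First job submitted to the shared context: build an RDD and register it by name
    object BuildWordsJob extends SparkJob with NamedRddSupport {
      override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid
      override def runJob(sc: SparkContext, config: Config): Any = {
        val words = sc.textFile("hdfs:///input/corpus").flatMap(_.split("\\s+"))
        namedRdds.update("words", words)   // cached and tracked by the job server
        words.count()
      }
    }

    // A later job, submitted to the same context: look the RDD up by name and reuse it
    object QueryWordsJob extends SparkJob with NamedRddSupport {
      override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid
      override def runJob(sc: SparkContext, config: Config): Any =
        namedRdds.get[String]("words").map(rdd => rdd.filter(_ == "spark").count()).getOrElse(0L)
    }
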
On Jan 17, 2016, at 3:56 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:

> Yes, that is one of the basic reasons to use a jobserver/shared-SparkContext.  Otherwise, in order to share the data in an RDD you have to use an external storage system, such as a distributed filesystem or Tachyon.
> 
> On Sun, Jan 17, 2016 at 1:52 PM, Jia <ja...@gmail.com> wrote:
> Thanks, Mark. Then, I guess JobServer can fundamentally solve my problem, so that jobs can be submitted at different times and still share RDDs.
> 
> Best Regards,
> Jia
> 
> 
> On Jan 17, 2016, at 3:44 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
> 
>> There is a 1-to-1 relationship between Spark Applications and SparkContexts -- fundamentally, a Spark Application is a program that creates and uses a SparkContext, and that SparkContext is destroyed when the Application ends.  A jobserver generically and the Spark JobServer specifically is an Application that keeps a SparkContext open for a long time and allows many Jobs to be submitted and run using that shared SparkContext.
>> 
>> More than one Application/SparkContext unavoidably implies more than one JVM process per Worker -- Applications/SparkContexts cannot share JVM processes.  
>> 
>> On Sun, Jan 17, 2016 at 1:15 PM, Jia <ja...@gmail.com> wrote:
>> Hi, Mark, sorry for the confusion.
>> 
>> Let me clarify, when an application is submitted, the master will tell each Spark worker to spawn an executor JVM process. All the task sets of the application will be executed by the executor. After the application runs to completion, the executor process will be killed.
>> But I hope that all applications submitted can run in the same executor, can JobServer do that? If so, it’s really good news!
>> 
>> Best Regards,
>> Jia
>> 
>> On Jan 17, 2016, at 3:09 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
>> 
>>> You've still got me confused.  The SparkContext exists at the Driver, not on an Executor.
>>> 
>>> Many Jobs can be run by a SparkContext -- it is a common pattern to use something like the Spark Jobserver where all Jobs are run through a shared SparkContext.
>>> 
>>> On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou <ja...@gmail.com> wrote:
>>> Hi, Mark, sorry, I mean SparkContext.
>>> I mean to change Spark into running all submitted jobs (SparkContexts) in one executor JVM.
>>> 
>>> Best Regards,
>>> Jia
>>> 
>>> On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
>>> -dev
>>> 
>>> What do you mean by JobContext?  That is a Hadoop mapreduce concept, not Spark.
>>> 
>>> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <ja...@gmail.com> wrote:
>>> Dear all,
>>> 
>>> Is there a way to reuse executor JVM across different JobContexts? Thanks.
>>> 
>>> Best Regards,
>>> Jia
>>> 
>>> 
>>> 
>> 
>> 
> 
> 


Re: Reuse Executor JVM across different JobContext

Posted by Mark Hamstra <ma...@clearstorydata.com>.
Yes, that is one of the basic reasons to use a
jobserver/shared-SparkContext.  Otherwise, in order to share the data in an
RDD you have to use an external storage system, such as a distributed
filesystem or Tachyon.
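As a concrete (if simplified) example of the external-storage route, with
placeholder HDFS paths:

    // Application 1: write the RDD out before its SparkContext shuts down
    val pairs = sc.textFile("hdfs:///data/raw").map(line => (line.split(",")(0), 1L))
    pairs.saveAsObjectFile("hdfs:///shared/pairs")

    // Application 2: a brand-new SparkContext (and new executors) reads it back
    val reloaded = sc.objectFile[(String, Long)]("hdfs:///shared/pairs")
    println(reloaded.count())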

On Sun, Jan 17, 2016 at 1:52 PM, Jia <ja...@gmail.com> wrote:

> Thanks, Mark. Then, I guess JobServer can fundamentally solve my problem,
> so that jobs can be submitted at different times and still share RDDs.
>
> Best Regards,
> Jia
>
>
> On Jan 17, 2016, at 3:44 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
>
> There is a 1-to-1 relationship between Spark Applications and
> SparkContexts -- fundamentally, a Spark Application is a program that
> creates and uses a SparkContext, and that SparkContext is destroyed when
> the Application ends.  A jobserver generically and the Spark JobServer
> specifically is an Application that keeps a SparkContext open for a long
> time and allows many Jobs to be submitted and run using that shared
> SparkContext.
>
> More than one Application/SparkContext unavoidably implies more than one
> JVM process per Worker -- Applications/SparkContexts cannot share JVM
> processes.
>
> On Sun, Jan 17, 2016 at 1:15 PM, Jia <ja...@gmail.com> wrote:
>
>> Hi, Mark, sorry for the confusion.
>>
>> Let me clarify, when an application is submitted, the master will tell
>> each Spark worker to spawn an executor JVM process. All the task sets of
>> the application will be executed by the executor. After the application
>> runs to completion, the executor process will be killed.
>> But I hope that all applications submitted can run in the same executor,
>> can JobServer do that? If so, it’s really good news!
>>
>> Best Regards,
>> Jia
>>
>> On Jan 17, 2016, at 3:09 PM, Mark Hamstra <ma...@clearstorydata.com>
>> wrote:
>>
>> You've still got me confused.  The SparkContext exists at the Driver, not
>> on an Executor.
>>
>> Many Jobs can be run by a SparkContext -- it is a common pattern to use
>> something like the Spark Jobserver where all Jobs are run through a shared
>> SparkContext.
>>
>> On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou <ja...@gmail.com>
>> wrote:
>>
>>> Hi, Mark, sorry, I mean SparkContext.
>>> I mean to change Spark into running all submitted jobs (SparkContexts)
>>> in one executor JVM.
>>>
>>> Best Regards,
>>> Jia
>>>
>>> On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <ma...@clearstorydata.com>
>>> wrote:
>>>
>>>> -dev
>>>>
>>>> What do you mean by JobContext?  That is a Hadoop mapreduce concept,
>>>> not Spark.
>>>>
>>>> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <ja...@gmail.com>
>>>> wrote:
>>>>
>>>>> Dear all,
>>>>>
>>>>> Is there a way to reuse executor JVM across different JobContexts?
>>>>> Thanks.
>>>>>
>>>>> Best Regards,
>>>>> Jia
>>>>>
>>>>
>>>>
>>>
>>
>>
>
>

Re: Reuse Executor JVM across different JobContext

Posted by Jia <ja...@gmail.com>.
Thanks, Mark. Then, I guess JobServer can fundamentally solve my problem, so that jobs can be submitted at different times and still share RDDs.

Best Regards,
Jia


On Jan 17, 2016, at 3:44 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:

> There is a 1-to-1 relationship between Spark Applications and SparkContexts -- fundamentally, a Spark Application is a program that creates and uses a SparkContext, and that SparkContext is destroyed when the Application ends.  A jobserver generically and the Spark JobServer specifically is an Application that keeps a SparkContext open for a long time and allows many Jobs to be submitted and run using that shared SparkContext.
> 
> More than one Application/SparkContext unavoidably implies more than one JVM process per Worker -- Applications/SparkContexts cannot share JVM processes.  
> 
> On Sun, Jan 17, 2016 at 1:15 PM, Jia <ja...@gmail.com> wrote:
> Hi, Mark, sorry for the confusion.
> 
> Let me clarify, when an application is submitted, the master will tell each Spark worker to spawn an executor JVM process. All the task sets of the application will be executed by the executor. After the application runs to completion, the executor process will be killed.
> But I hope that all applications submitted can run in the same executor, can JobServer do that? If so, it’s really good news!
> 
> Best Regards,
> Jia
> 
> On Jan 17, 2016, at 3:09 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
> 
>> You've still got me confused.  The SparkContext exists at the Driver, not on an Executor.
>> 
>> Many Jobs can be run by a SparkContext -- it is a common pattern to use something like the Spark Jobserver where all Jobs are run through a shared SparkContext.
>> 
>> On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou <ja...@gmail.com> wrote:
>> Hi, Mark, sorry, I mean SparkContext.
>> I mean to change Spark into running all submitted jobs (SparkContexts) in one executor JVM.
>> 
>> Best Regards,
>> Jia
>> 
>> On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
>> -dev
>> 
>> What do you mean by JobContext?  That is a Hadoop mapreduce concept, not Spark.
>> 
>> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <ja...@gmail.com> wrote:
>> Dear all,
>> 
>> Is there a way to reuse executor JVM across different JobContexts? Thanks.
>> 
>> Best Regards,
>> Jia
>> 
>> 
>> 
> 
> 


Re: Reuse Executor JVM across different JobContext

Posted by Mark Hamstra <ma...@clearstorydata.com>.
There is a 1-to-1 relationship between Spark Applications and SparkContexts
-- fundamentally, a Spark Application is a program that creates and uses a
SparkContext, and that SparkContext is destroyed when the Application
ends.  A jobserver generically and the Spark JobServer specifically is an
Application that keeps a SparkContext open for a long time and allows many
Jobs to be submitted and run using that shared SparkContext.

More than one Application/SparkContext unavoidably implies more than one
JVM process per Worker -- Applications/SparkContexts cannot share JVM
processes.
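A toy illustration of the shared-context pattern (this is not spark-jobserver
itself, just a sketch of the idea; the paths and names are made up):

    import java.util.concurrent.LinkedBlockingQueue
    import org.apache.spark.{SparkConf, SparkContext}

    object MiniJobServer {
      def main(args: Array[String]): Unit = {
        // One long-lived Application == one SparkContext == one set of executor JVMs
        val sc = new SparkContext(new SparkConf().setAppName("mini-jobserver"))
        val events = sc.textFile("hdfs:///input/events").cache()

        // Stand-in for an HTTP endpoint: callers enqueue work, and every closure
        // runs against the same shared context and can reuse the cached RDD
        val requests = new LinkedBlockingQueue[() => Any]()
        requests.put(() => events.count())
        requests.put(() => events.filter(_.contains("ERROR")).count())

        while (!requests.isEmpty) {
          println(s"result = ${requests.take()()}")
        }
        sc.stop()
      }
    }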

On Sun, Jan 17, 2016 at 1:15 PM, Jia <ja...@gmail.com> wrote:

> Hi, Mark, sorry for the confusion.
>
> Let me clarify, when an application is submitted, the master will tell
> each Spark worker to spawn an executor JVM process. All the task sets of
> the application will be executed by the executor. After the application
> runs to completion, the executor process will be killed.
> But I hope that all applications submitted can run in the same executor,
> can JobServer do that? If so, it’s really good news!
>
> Best Regards,
> Jia
>
> On Jan 17, 2016, at 3:09 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
>
> You've still got me confused.  The SparkContext exists at the Driver, not
> on an Executor.
>
> Many Jobs can be run by a SparkContext -- it is a common pattern to use
> something like the Spark Jobserver where all Jobs are run through a shared
> SparkContext.
>
> On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou <ja...@gmail.com> wrote:
>
>> Hi, Mark, sorry, I mean SparkContext.
>> I mean to change Spark into running all submitted jobs (SparkContexts) in
>> one executor JVM.
>>
>> Best Regards,
>> Jia
>>
>> On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <ma...@clearstorydata.com>
>> wrote:
>>
>>> -dev
>>>
>>> What do you mean by JobContext?  That is a Hadoop mapreduce concept, not
>>> Spark.
>>>
>>> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <ja...@gmail.com>
>>> wrote:
>>>
>>>> Dear all,
>>>>
>>>> Is there a way to reuse executor JVM across different JobContexts?
>>>> Thanks.
>>>>
>>>> Best Regards,
>>>> Jia
>>>>
>>>
>>>
>>
>
>

Re: Reuse Executor JVM across different JobContext

Posted by Jia <ja...@gmail.com>.
Hi, Mark, sorry for the confusion.

Let me clarify, when an application is submitted, the master will tell each Spark worker to spawn an executor JVM process. All the task sets of the application will be executed by the executor. After the application runs to completion, the executor process will be killed.
But I hope that all applications submitted can run in the same executor, can JobServer do that? If so, it’s really good news!

Best Regards,
Jia

On Jan 17, 2016, at 3:09 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:

> You've still got me confused.  The SparkContext exists at the Driver, not on an Executor.
> 
> Many Jobs can be run by a SparkContext -- it is a common pattern to use something like the Spark Jobserver where all Jobs are run through a shared SparkContext.
> 
> On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou <ja...@gmail.com> wrote:
> Hi, Mark, sorry, I mean SparkContext.
> I mean to change Spark into running all submitted jobs (SparkContexts) in one executor JVM.
> 
> Best Regards,
> Jia
> 
> On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <ma...@clearstorydata.com> wrote:
> -dev
> 
> What do you mean by JobContext?  That is a Hadoop mapreduce concept, not Spark.
> 
> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <ja...@gmail.com> wrote:
> Dear all,
> 
> Is there a way to reuse executor JVM across different JobContexts? Thanks.
> 
> Best Regards,
> Jia
> 
> 
> 



Re: Reuse Executor JVM across different JobContext

Posted by Mark Hamstra <ma...@clearstorydata.com>.
You've still got me confused.  The SparkContext exists at the Driver, not
on an Executor.

Many Jobs can be run by a SparkContext -- it is a common pattern to use
something like the Spark Jobserver where all Jobs are run through a shared
SparkContext.
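A small example of that: each action below triggers a separate Spark job,
scheduled by the one SparkContext on the driver, and all of them reuse the
same cached RDD and the same executors:

    val nums  = sc.parallelize(1 to 1000000).cache()
    val evens = nums.filter(_ % 2 == 0).count()    // job 1
    val tens  = nums.filter(_ % 10 == 0).count()   // job 2, reuses the cached partitions
    val first = nums.take(5)                       // job 3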

On Sun, Jan 17, 2016 at 12:57 PM, Jia Zou <ja...@gmail.com> wrote:

> Hi, Mark, sorry, I mean SparkContext.
> I mean to change Spark into running all submitted jobs (SparkContexts) in
> one executor JVM.
>
> Best Regards,
> Jia
>
> On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <ma...@clearstorydata.com>
> wrote:
>
>> -dev
>>
>> What do you mean by JobContext?  That is a Hadoop mapreduce concept, not
>> Spark.
>>
>> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <ja...@gmail.com> wrote:
>>
>>> Dear all,
>>>
>>> Is there a way to reuse executor JVM across different JobContexts?
>>> Thanks.
>>>
>>> Best Regards,
>>> Jia
>>>
>>
>>
>

Re: Reuse Executor JVM across different JobContext

Posted by Jia Zou <ja...@gmail.com>.
Hi, Mark, sorry, I mean SparkContext.
I mean to change Spark into running all submitted jobs (SparkContexts) in
one executor JVM.

Best Regards,
Jia

On Sun, Jan 17, 2016 at 2:21 PM, Mark Hamstra <ma...@clearstorydata.com>
wrote:

> -dev
>
> What do you mean by JobContext?  That is a Hadoop mapreduce concept, not
> Spark.
>
> On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <ja...@gmail.com> wrote:
>
>> Dear all,
>>
>> Is there a way to reuse executor JVM across different JobContexts? Thanks.
>>
>> Best Regards,
>> Jia
>>
>
>

Re: Reuse Executor JVM across different JobContext

Posted by Mark Hamstra <ma...@clearstorydata.com>.
-dev

What do you mean by JobContext?  That is a Hadoop mapreduce concept, not
Spark.

On Sun, Jan 17, 2016 at 7:29 AM, Jia Zou <ja...@gmail.com> wrote:

> Dear all,
>
> Is there a way to reuse executor JVM across different JobContexts? Thanks.
>
> Best Regards,
> Jia
>