Posted to dev@spark.apache.org by Alessandro Baretta <al...@gmail.com> on 2014/12/22 22:32:56 UTC

More general submitJob API

Fellow Sparkers,

I'm rather puzzled by the submitJob API. I can't quite figure out how it is
supposed to be used. Is there any more documentation about it?
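
For reference, here is my best guess at how a call is meant to look (just a
sketch -- the local-mode setup and the per-partition function are mine, not
from any docs):

    import org.apache.spark.{SparkConf, SparkContext}
    import scala.concurrent.Await
    import scala.concurrent.duration._

    object SubmitJobSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("submitJob-sketch").setMaster("local[4]"))
        val rdd = sc.parallelize(1 to 100, numSlices = 4)

        // Count the elements of every partition; each per-partition result
        // is handed back to the driver through the callback as it arrives.
        val sizes = new Array[Long](rdd.partitions.length)
        val future = sc.submitJob[Int, Long, Array[Long]](
          rdd,
          (it: Iterator[Int]) => it.size.toLong,  // runs on the executors
          0 until rdd.partitions.length,          // which partitions to run
          (index, size) => sizes(index) = size,   // result handler, on driver
          sizes)                                  // value the future yields

        println(Await.result(future, 1.minute).mkString(", "))
        sc.stop()
      }
    }

Is that even close to the intended usage?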

Also, is there a simpler way to multiplex jobs on the cluster, such as
starting multiple computations in as many threads in the driver and
collecting the results as they become available?

Thanks,

Alex

Re: More general submitJob API

Posted by Patrick Wendell <pw...@gmail.com>.
A SparkContext is thread-safe, so you can just have different threads
that create their own RDDs and run actions on them, etc.
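
Something along these lines, for example (a rough sketch -- the local-mode
config and the two jobs are only for illustration):

    import org.apache.spark.{SparkConf, SparkContext}
    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration._

    object ConcurrentJobs {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("concurrent-jobs").setMaster("local[4]"))

        // Each Future body runs on its own thread and triggers an
        // independent Spark job on the shared context.
        val total = Future {
          sc.parallelize(1 to 1000000).map(_.toLong).reduce(_ + _)
        }
        val evens = Future {
          sc.parallelize(1 to 1000000).filter(_ % 2 == 0).count()
        }

        // Reap the results as they become available.
        println(Await.result(total, 10.minutes))
        println(Await.result(evens, 10.minutes))

        sc.stop()
      }
    }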

- Patrick


Re: More general submitJob API

Posted by Alessandro Baretta <al...@gmail.com>.
Andrew,

Thanks, yes, this is what I wanted: basically just to start multiple jobs
concurrently in threads.

Alex


Re: More general submitJob API

Posted by Andrew Ash <an...@andrewash.com>.
Hi Alex,

SparkContext.submitJob() is marked as experimental -- most client programs
shouldn't be using it.  What are you looking to do?

For multiplexing jobs, one thing you can do is have multiple threads in
your client JVM each submit jobs through a shared SparkContext.  This is
described here in the docs:
http://spark.apache.org/docs/latest/job-scheduling.html#scheduling-within-an-application
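
Roughly like this, for instance (a sketch only -- FAIR mode and the pool
name are illustrative; pools would normally be configured in an allocations
file, per that page):

    import org.apache.spark.{SparkConf, SparkContext}

    object PooledJobs {
      def main(args: Array[String]): Unit = {
        // FAIR mode lets jobs submitted from different threads share the
        // cluster round-robin instead of queueing strictly FIFO.
        val conf = new SparkConf()
          .setAppName("pooled-jobs")
          .setMaster("local[4]")
          .set("spark.scheduler.mode", "FAIR")
        val sc = new SparkContext(conf)

        val worker = new Thread(new Runnable {
          def run(): Unit = {
            // Jobs submitted from this thread land in the "batch" pool.
            sc.setLocalProperty("spark.scheduler.pool", "batch")
            println(sc.parallelize(1 to 100).count())
          }
        })
        worker.start()

        // Jobs from the main thread stay in the default pool.
        println(sc.parallelize(1 to 100).reduce(_ + _))

        worker.join()
        sc.stop()
      }
    }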

Andrew
