Posted to dev@spark.apache.org by Alessandro Baretta <al...@gmail.com> on 2014/12/22 22:32:56 UTC
More general submitJob API
Fellow Sparkers,
I'm rather puzzled by the submitJob API. I can't quite figure out how it is
supposed to be used. Is there any more documentation about it?
Also, is there a simpler way to multiplex jobs on the cluster, such as
starting multiple computations, each in its own driver thread, and reaping
the results as they become available?
Thanks,
Alex
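[For context, this is roughly the shape of the experimental API being asked about, as it appears in the Spark 1.x SparkContext source (annotations and exact types may vary by version). It runs processPartition over the requested partitions, feeds each partition's result to resultHandler, and completes the returned future with resultFunc:]

```scala
@Experimental
def submitJob[T, U, R](
    rdd: RDD[T],
    processPartition: Iterator[T] => U,
    partitions: Seq[Int],
    resultHandler: (Int, U) => Unit,
    resultFunc: => R): SimpleFutureAction[R]
```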
Re: More general submitJob API
Posted by Patrick Wendell <pw...@gmail.com>.
A SparkContext is thread-safe, so you can just have different threads
that create their own RDDs and run actions on them, etc.
- Patrick
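[Patrick's suggestion amounts to launching each action from its own thread against the shared, thread-safe SparkContext and collecting the results as futures complete. A minimal sketch of that pattern, with the Spark action stubbed out by a plain computation so it runs without a cluster -- in a real driver, `runJob` would close over the shared SparkContext and call an action such as `rdd.count()`; the names here are illustrative:]

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object ConcurrentJobs {
  // Stand-in for a Spark action such as rdd.count(); in a real driver this
  // would use the shared (thread-safe) SparkContext.
  def runJob(data: Seq[Int]): Long = data.map(_.toLong).sum

  def main(args: Array[String]): Unit = {
    val inputs = Seq(Seq(1, 2, 3), Seq(10, 20), Seq(100))
    // One Future per job: each runs on its own thread from the global pool.
    val futures = inputs.map(in => Future(runJob(in)))
    // Reap all the results once they are available.
    val results = Await.result(Future.sequence(futures), 30.seconds)
    println(results.mkString(","))  // prints "6,30,100"
  }
}
```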
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org
Re: More general submitJob API
Posted by Alessandro Baretta <al...@gmail.com>.
Andrew,
Thanks, yes, this is what I wanted: basically just to start multiple jobs
concurrently in threads.
Alex
Re: More general submitJob API
Posted by Andrew Ash <an...@andrewash.com>.
Hi Alex,
SparkContext.submitJob() is marked as experimental -- most client programs
shouldn't be using it. What are you looking to do?
For multiplexing jobs, one thing you can do is have multiple threads in
your client JVM, each submitting jobs on your shared SparkContext. This is
described here in the docs:
described here in the docs:
http://spark.apache.org/docs/latest/job-scheduling.html#scheduling-within-an-application
Andrew
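[The scheduling page Andrew links also notes that, by default, jobs submitted from multiple threads are scheduled FIFO; Spark's fair scheduler can be enabled instead. A minimal sketch of that configuration -- the app name and pool name are illustrative, and this assumes a real Spark driver, so it is not self-contained:]

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Enable fair scheduling so jobs from concurrent threads share cluster
// resources instead of queueing FIFO (the default).
val conf = new SparkConf()
  .setAppName("multiplexed-jobs")        // illustrative name
  .set("spark.scheduler.mode", "FAIR")
val sc = new SparkContext(conf)

// Optionally, a thread can route its jobs to a named pool
// (defined in fairscheduler.xml):
sc.setLocalProperty("spark.scheduler.pool", "pool1")  // "pool1" is illustrative
```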