Posted to dev@spark.apache.org by Aniket Bhatnagar <an...@gmail.com> on 2015/01/12 14:35:07 UTC
Discussion | SparkContext's setJobGroup and clearJobGroup should return a new instance of SparkContext
Hi Spark committers,

I would like to discuss the possibility of changing the signatures
of SparkContext's setJobGroup and clearJobGroup functions so that they
return a replica of SparkContext with the job group set/unset instead of
mutating the original context.

I am building a Spark job server, and I assign job groups before passing
control to user-provided logic that uses the SparkContext to define and
execute a job (very much like job-server). The issue is that I can't
reliably know when to clear the job group, because user-defined code can
use futures to submit multiple tasks in parallel. In fact, I even allow
users to return a future from their function, on which the job server can
register callbacks to know when the user-defined job is complete. Now, if
I set the job group before passing control to the user function and wait
on the future to complete so that I can clear the job group, I can no
longer use that SparkContext for any other job in the meantime. This means
I would have to lock on the SparkContext, which seems like a bad idea.

Therefore, my proposal is to return a new instance of SparkContext (a
replica with just the job group set/unset) that can then be used safely in
a concurrent environment. I am also happy with still mutating the original
SparkContext, just so as not to break backward compatibility, as long as
the returned SparkContext is not affected by setting/unsetting job groups
on the original SparkContext.
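
To make the proposal concrete, here is a rough sketch of the "return a replica" idea. It is a toy model in plain Python and does not use Spark at all: ContextView, with_job_group, without_job_group, run_job and user_logic are hypothetical names invented for illustration and are not part of SparkContext's API. The only point is that the job group travels with the object handed to user code, so the shared base context never needs to be mutated or locked.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ContextView:
    """Hypothetical immutable stand-in for a context replica (NOT Spark's API)."""
    app_name: str
    job_group: Optional[str] = None

    def with_job_group(self, group_id: str) -> "ContextView":
        # Return a copy with the group set; the original object is untouched.
        return replace(self, job_group=group_id)

    def without_job_group(self) -> "ContextView":
        # Return a copy with the group cleared.
        return replace(self, job_group=None)

    def run_job(self, payload: str) -> tuple:
        # Stand-in for submitting a job: report which group it ran under.
        return (self.job_group, payload)


def user_logic(ctx: ContextView, pool: ThreadPoolExecutor, payload: str):
    # User code may fan work out to other threads and hand back a future;
    # the group is carried by ctx rather than by shared mutable state.
    return pool.submit(ctx.run_job, payload)


base = ContextView(app_name="job-server")

with ThreadPoolExecutor(max_workers=4) as pool:
    # Each request gets its own replica, so requests can overlap freely.
    futures = [
        user_logic(base.with_job_group(f"group-{i}"), pool, f"payload-{i}")
        for i in range(3)
    ]
    results = [f.result() for f in futures]
```

In this sketch the server never has to decide when to "clear" anything: a replica simply goes out of scope once the user's future completes, and base stays without a job group throughout.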
Thoughts please?
Thanks,
Aniket
Re: Discussion | SparkContext's setJobGroup and clearJobGroup should return a new instance of SparkContext
Posted by Erik Erlandson <ej...@redhat.com>.
setJobGroup needs fixing:
https://issues.apache.org/jira/browse/SPARK-4514
I'm interested in any community input on what the semantics or design "ought" to be.