Posted to reviews@spark.apache.org by UtkarshMe <gi...@git.apache.org> on 2018/10/25 09:08:51 UTC

[GitHub] spark pull request #22822: [SPARK-25678] Requesting feedback regarding a pro...

GitHub user UtkarshMe opened a pull request:

    https://github.com/apache/spark/pull/22822

    [SPARK-25678] Requesting feedback regarding a prototype for adding PBS Professional as a cluster manager

    ## What changes were proposed in this pull request?
    *From Spark [JIRA ticket](https://issues.apache.org/jira/browse/SPARK-25678):*  
      
    [PBS (Portable Batch System) Professional](https://github.com/pbspro/pbspro) is an open-source workload management system for HPC clusters. Many organizations that use PBS to manage their clusters also use Spark for big data workloads, but they are forced to split the cluster into a Spark cluster and a PBS cluster. They do this either by physically dividing the cluster nodes into two groups or by starting the Spark Standalone cluster manager's Master and Slaves as PBS jobs, and both approaches lead to underutilization of resources.
    
    I am trying to add support in Spark to use PBS as a pluggable cluster manager. Going through the Spark codebase and looking at Mesos and Kubernetes integration, I found that we can get this working as follows:
    
    - Extend `ExternalClusterManager`.
    - Extend `CoarseGrainedSchedulerBackend`.
      - This class can start `Executors` as PBS jobs.
      - The initial number of `Executors` is started in `onStart`.
      - More `Executors` can be started as required using `doRequestTotalExecutors`.
      - `Executors` can be killed using `doKillExecutors`.
    - Extend `SparkApplication` to start the `Driver` as a PBS job in cluster deploy mode.
      - This extended class can submit the Spark application again as a PBS job with the deploy mode set to client, so that the application driver is started on a node in the cluster.
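    
    The cluster-mode resubmission described in the last bullet can be sketched in shell. This is only an illustration under stated assumptions, not the prototype's code: `to_client_mode` and `CLIENT_ARGS` are hypothetical names, and the `qsub -- <command>` form is an assumption about how the driver job could be handed to PBS. The sketch prints the resulting command line instead of submitting it.
    
    ```bash
    # Hypothetical helper (not part of Spark or PBS): copy spark-submit arguments
    # into the global array CLIENT_ARGS, forcing the value after --deploy-mode
    # to "client". Only the two-token form "--deploy-mode <value>" is handled.
    to_client_mode() {
      CLIENT_ARGS=()
      local replace_next=0 arg
      for arg in "$@"; do
        if [ "$replace_next" -eq 1 ]; then
          CLIENT_ARGS+=("client")
          replace_next=0
        else
          CLIENT_ARGS+=("$arg")
          if [ "$arg" = "--deploy-mode" ]; then
            replace_next=1
          fi
        fi
      done
    }
    
    to_client_mode --master pbs --deploy-mode cluster \
      --class org.apache.spark.examples.SparkPi spark-examples.jar 1000000
    
    # Assumption: the driver job would be submitted with something like
    # "qsub -- <command>"; here the command line is printed, not executed.
    echo "qsub -- ./bin/spark-submit ${CLIENT_ARGS[*]}"
    ```
    
    Arguments other than the deploy mode pass through unchanged, so the resubmitted job runs the same application with its driver inside the PBS job.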
    
    
    ## How was this patch tested?
    - Compiled with PBS support by passing the `-Ppbs` flag to `build/mvn`.
    - I was able to run a basic `SparkPi` Java application in both client and cluster deploy modes using `bin/spark-submit`:
    ```bash
    ./bin/spark-submit --master pbs --deploy-mode cluster --class org.apache.spark.examples.SparkPi spark-examples.jar 1000000
    ```
    - The TravisCI build seems to fail because of code lint and license comment checks.
    
    
    ## I have a couple of questions:
    
    - Does this seem like a good idea, or should we look at other options?
    - What are the expectations from the initial prototype?
    - Would the Spark maintainers be open to merging this, or would they prefer that it be maintained as a fork?
    
    CC: @sakshamgarg

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/UtkarshMe/spark pbs_support

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22822.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22822
    
----
commit e3a97ebfbb8862b3a83fa5d01b1a8a3bd191f456
Author: Utkarsh <ut...@...>
Date:   2018-08-29T13:39:01Z

    Add prototype for using PBS as external cluster manager

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22822: [SPARK-25678] Requesting feedback regarding a prototype ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22822
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22822: [SPARK-25678] Requesting feedback regarding a prototype ...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/22822
  
    @UtkarshMe you should reach out to the spark dev list about this.


---



[GitHub] spark issue #22822: [SPARK-25678] Requesting feedback regarding a prototype ...

Posted by UtkarshMe <gi...@git.apache.org>.
Github user UtkarshMe commented on the issue:

    https://github.com/apache/spark/pull/22822
  
    I did send the proposal to the dev@spark.apache.org mailing list (twice). Unfortunately, I got no response, so I opened a JIRA ticket about 20 days ago and have now opened a pull request for feedback.


---





[GitHub] spark issue #22822: [SPARK-25678] Requesting feedback regarding a prototype ...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/22822
  
    @UtkarshMe well, there is signal in the lack of responsiveness. Adding and maintaining cluster managers has proven to be quite painful; a case in point is the lack of love that Mesos is receiving. I don't really see a way forward here unless there is strong consensus in the community.



---





[GitHub] spark issue #22822: [SPARK-25678] Requesting feedback regarding a prototype ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22822
  
    Yeah, I can't imagine merging support for any other resource manager I know of now; it's just way too much to maintain. I have not heard of this one myself. It should be implemented outside Spark.


---



[GitHub] spark pull request #22822: [SPARK-25678] Requesting feedback regarding a pro...

Posted by UtkarshMe <gi...@git.apache.org>.
Github user UtkarshMe closed the pull request at:

    https://github.com/apache/spark/pull/22822


---
