You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by ifilonenko <gi...@git.apache.org> on 2018/08/31 04:46:28 UTC

[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

GitHub user ifilonenko opened a pull request:

    https://github.com/apache/spark/pull/22298

    [SPARK-25021][K8S] Add spark.executor.pyspark.memory limit for K8S

    ## What changes were proposed in this pull request?
    
    Add spark.executor.pyspark.memory limit for K8S
    
    ## How was this patch tested?
    
    Unit and Integration tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ifilonenko/spark SPARK-25021

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22298.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22298
    
----
commit b54a039da08aec93a6db9d1470d0b2eaaec08814
Author: Ilan Filonenko <if...@...>
Date:   2018-08-30T00:19:40Z

    initial WIP push for SPARK-25021

commit 75742a37687a7eb3ebaa34069ac7a62521a4e2f8
Author: Ilan Filonenko <if...@...>
Date:   2018-08-30T05:26:27Z

    add python.worker.reuse

commit 46c30cc27cd3a7279a116ec6a70a937b8502cd73
Author: Ilan Filonenko <if...@...>
Date:   2018-08-31T04:32:22Z

    final checks with e2e tests

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214492297
  
    --- Diff: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PythonTestsSuite.scala ---
    @@ -72,12 +72,33 @@ private[spark] trait PythonTestsSuite { k8sSuite: KubernetesSuite =>
           isJVM = false,
           pyFiles = Some(PYSPARK_CONTAINER_TESTS))
       }
    +
    +  test("Run PySpark with memory customization", k8sTestTag) {
    +    sparkAppConf
    +      .set("spark.kubernetes.container.image", s"${getTestImageRepo}/spark-py:${getTestImageTag}")
    --- End diff --
    
    Some of this stuff can be factored out I think, we just haven't done so yet. I wouldn't block a merge of this on such a refactor, but this entire test class could probably use some cleanup with respect to how the code is structured.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test starting
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2754/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test status success
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2934/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test starting
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2934/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214411915
  
    --- Diff: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PythonTestsSuite.scala ---
    @@ -72,12 +72,33 @@ private[spark] trait PythonTestsSuite { k8sSuite: KubernetesSuite =>
           isJVM = false,
           pyFiles = Some(PYSPARK_CONTAINER_TESTS))
       }
    +
    +  test("Run PySpark with memory customization", k8sTestTag) {
    +    sparkAppConf
    +      .set("spark.kubernetes.container.image", s"${getTestImageRepo}/spark-py:${getTestImageTag}")
    +      .set("spark.kubernetes.pyspark.pythonVersion", "3")
    +      .set("spark.kubernetes.memoryOverheadFactor", s"$memOverheadConstant")
    +      .set("spark.executor.pyspark.memory", s"${additionalMemory}m")
    +      .set("spark.python.worker.reuse", "false")
    --- End diff --
    
    in reference to https://github.com/apache/spark/pull/21977#issuecomment-411130872 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    **[Test build #95519 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95519/testReport)** for PR 22298 at commit [`46c30cc`](https://github.com/apache/spark/commit/46c30cc27cd3a7279a116ec6a70a937b8502cd73).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214405118
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala ---
    @@ -58,6 +58,15 @@ private[spark] class BasicExecutorFeatureStep(
           (kubernetesConf.get(MEMORY_OVERHEAD_FACTOR) * executorMemoryMiB).toInt,
           MEMORY_OVERHEAD_MIN_MIB))
       private val executorMemoryWithOverhead = executorMemoryMiB + memoryOverheadMiB
    +  // TODO: Have memory limit checks on executorMemory
    --- End diff --
    
    Same as other TODO


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214411102
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala ---
    @@ -51,6 +51,7 @@ private[spark] class BasicDriverFeatureStep(
         .get(DRIVER_MEMORY_OVERHEAD)
         .getOrElse(math.max((conf.get(MEMORY_OVERHEAD_FACTOR) * driverMemoryMiB).toInt,
           MEMORY_OVERHEAD_MIN_MIB))
    +  // TODO: Have memory limit checks on driverMemory
    --- End diff --
    
    I wanted to get an opinion from people (@mccheah) into whether or not we wanted to let the K8S api to handle memory limits (via ResourceQuota limit errors) or whether we wanted to catch it in a Spark exception (if we were to include a configuration for memory limits)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    **[Test build #95570 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95570/testReport)** for PR 22298 at commit [`3ad1324`](https://github.com/apache/spark/commit/3ad13242f0f5c5d34cda8ca5d0e0ad0b1bdbcead).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214407013
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/bindings/PythonDriverFeatureStep.scala ---
    @@ -67,7 +68,8 @@ private[spark] class PythonDriverFeatureStep(
     
         SparkPod(pod.pod, withPythonPrimaryContainer)
       }
    -  override def getAdditionalPodSystemProperties(): Map[String, String] = Map.empty
    +  override def getAdditionalPodSystemProperties(): Map[String, String] =
    --- End diff --
    
    Same as above (just curious if this overrides user params or precedence rules here). Also add a comment of the precedence rules.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214404961
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ---
    @@ -225,6 +225,13 @@ private[spark] object Config extends Logging {
             "Ensure that major Python version is either Python2 or Python3")
           .createWithDefault("2")
     
    +  val APP_RESOURCE_TYPE =
    --- End diff --
    
    Why this instead of the bools? What about folks who want to make a pipeline which is both R and Python?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    @mccheah as such, that should be a followup PR and this should be good to merge as long as  @holdenk gives a LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    @rdblue @holdenk for review. This contains both unit and integration tests that verify [SPARK-25004] for K8S


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test starting
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2751/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214408969
  
    --- Diff: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PythonTestsSuite.scala ---
    @@ -72,12 +72,33 @@ private[spark] trait PythonTestsSuite { k8sSuite: KubernetesSuite =>
           isJVM = false,
           pyFiles = Some(PYSPARK_CONTAINER_TESTS))
       }
    +
    +  test("Run PySpark with memory customization", k8sTestTag) {
    +    sparkAppConf
    +      .set("spark.kubernetes.container.image", s"${getTestImageRepo}/spark-py:${getTestImageTag}")
    --- End diff --
    
    nit: Mildly confused why there's so many sets here (like the image, etc.) maybe make more sense in a shared test setup func?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214411403
  
    --- Diff: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/R/Dockerfile ---
    @@ -19,10 +19,10 @@ ARG base_img
     FROM $base_img
     WORKDIR /
     RUN mkdir ${SPARK_HOME}/R
    -COPY R ${SPARK_HOME}/R
     
     RUN apk add --no-cache R R-dev
     
    +COPY R ${SPARK_HOME}/R
    --- End diff --
    
    This makes the docker build not run the `R` and `R-dev` installations each time an update to the jar is made. This is a minor change that helps with dev :) 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    @holdenk and @mccheah any other comments before merge? 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214406804
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/bindings/JavaDriverFeatureStep.scala ---
    @@ -38,7 +39,8 @@ private[spark] class JavaDriverFeatureStep(
           .build()
         SparkPod(pod.pod, withDriverArgs)
       }
    -  override def getAdditionalPodSystemProperties(): Map[String, String] = Map.empty
    +  override def getAdditionalPodSystemProperties(): Map[String, String] =
    --- End diff --
    
    So is this going to override it if the user wants to set a different type even if they are ostensibly running a JVM pod (e.g. mixed language pipelines).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214492080
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/bindings/JavaDriverFeatureStep.scala ---
    @@ -38,7 +39,8 @@ private[spark] class JavaDriverFeatureStep(
           .build()
         SparkPod(pod.pod, withDriverArgs)
       }
    -  override def getAdditionalPodSystemProperties(): Map[String, String] = Map.empty
    +  override def getAdditionalPodSystemProperties(): Map[String, String] =
    --- End diff --
    
    Again, not sure what our direction is going to be with respect to mixed pipelines throughout the cluster manager sections - if we should be supporting it in a first class way then perhaps we file a JIRA and we can discuss how the submission client should be refactored to support that.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    **[Test build #95809 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95809/testReport)** for PR 22298 at commit [`fe8cc5a`](https://github.com/apache/spark/commit/fe8cc5aa6759cdf893e11c3d83814f8dffddce9c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2848/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    @felixcheung @holdenk I have moved the PySpark example files to a more appropriate location. Any other comments before merge? 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test starting
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2848/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test status success
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2848/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95570/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95809/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95519/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    **[Test build #95564 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95564/testReport)** for PR 22298 at commit [`467703b`](https://github.com/apache/spark/commit/467703bfb8bde570b5b0cad11904cb92641dcd84).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    @holdenk or @mccheah for merge 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test status success
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2751/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214530079
  
    --- Diff: examples/src/main/python/worker_memory_check.py ---
    @@ -0,0 +1,47 @@
    +#
    --- End diff --
    
    shouldn't this be in python tests (and get it to run only certain cluster manager)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214411589
  
    --- Diff: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PythonTestsSuite.scala ---
    @@ -72,12 +72,33 @@ private[spark] trait PythonTestsSuite { k8sSuite: KubernetesSuite =>
           isJVM = false,
           pyFiles = Some(PYSPARK_CONTAINER_TESTS))
       }
    +
    +  test("Run PySpark with memory customization", k8sTestTag) {
    +    sparkAppConf
    +      .set("spark.kubernetes.container.image", s"${getTestImageRepo}/spark-py:${getTestImageTag}")
    +      .set("spark.kubernetes.pyspark.pythonVersion", "3")
    +      .set("spark.kubernetes.memoryOverheadFactor", s"$memOverheadConstant")
    --- End diff --
    
    I can add to docs, sure!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95813/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    **[Test build #95813 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95813/testReport)** for PR 22298 at commit [`ea25b8b`](https://github.com/apache/spark/commit/ea25b8b14013d17fbafe5c338cca0eb28a490482).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22298


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    +1 for 2.4


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95564/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Yeah this looks ok to me, would like a +1 from @felixcheung and @holdenk.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214409131
  
    --- Diff: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PythonTestsSuite.scala ---
    @@ -72,12 +72,33 @@ private[spark] trait PythonTestsSuite { k8sSuite: KubernetesSuite =>
           isJVM = false,
           pyFiles = Some(PYSPARK_CONTAINER_TESTS))
       }
    +
    +  test("Run PySpark with memory customization", k8sTestTag) {
    +    sparkAppConf
    +      .set("spark.kubernetes.container.image", s"${getTestImageRepo}/spark-py:${getTestImageTag}")
    +      .set("spark.kubernetes.pyspark.pythonVersion", "3")
    +      .set("spark.kubernetes.memoryOverheadFactor", s"$memOverheadConstant")
    +      .set("spark.executor.pyspark.memory", s"${additionalMemory}m")
    +      .set("spark.python.worker.reuse", "false")
    --- End diff --
    
    Why is worker reuse being set to false?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214415839
  
    --- Diff: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PythonTestsSuite.scala ---
    @@ -72,12 +72,33 @@ private[spark] trait PythonTestsSuite { k8sSuite: KubernetesSuite =>
           isJVM = false,
           pyFiles = Some(PYSPARK_CONTAINER_TESTS))
       }
    +
    +  test("Run PySpark with memory customization", k8sTestTag) {
    +    sparkAppConf
    +      .set("spark.kubernetes.container.image", s"${getTestImageRepo}/spark-py:${getTestImageTag}")
    +      .set("spark.kubernetes.pyspark.pythonVersion", "3")
    +      .set("spark.kubernetes.memoryOverheadFactor", s"$memOverheadConstant")
    --- End diff --
    
    This is already included in the docs of `docs/running-on-kubernetes.md`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r216112524
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ---
    @@ -225,6 +225,13 @@ private[spark] object Config extends Logging {
             "Ensure that major Python version is either Python2 or Python3")
           .createWithDefault("2")
     
    +  val APP_RESOURCE_TYPE =
    --- End diff --
    
    Yeah lets do something in a follow up after 2.4


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214407418
  
    --- Diff: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/python/Dockerfile ---
    @@ -33,6 +32,7 @@ RUN apk add --no-cache python && \
         # Removed the .cache to save space
         rm -r /root/.cache
     
    +COPY python/lib ${SPARK_HOME}/python/lib
    --- End diff --
    
    Same, is this change intentional


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214405048
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala ---
    @@ -51,6 +51,7 @@ private[spark] class BasicDriverFeatureStep(
         .get(DRIVER_MEMORY_OVERHEAD)
         .getOrElse(math.max((conf.get(MEMORY_OVERHEAD_FACTOR) * driverMemoryMiB).toInt,
           MEMORY_OVERHEAD_MIN_MIB))
    +  // TODO: Have memory limit checks on driverMemory
    --- End diff --
    
    Can we file a JIRA as well and include it in the comment.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2937/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test starting
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2937/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2934/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214416261
  
    --- Diff: examples/src/main/python/worker_memory_check.py ---
    @@ -0,0 +1,47 @@
    +#
    --- End diff --
    
    Easiest place to put to be trivially picked up by integration tests, as I did with pyfiles, but I am open to recommendations. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test status success
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2754/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2724/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2754/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214495391
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala ---
    @@ -51,6 +51,7 @@ private[spark] class BasicDriverFeatureStep(
         .get(DRIVER_MEMORY_OVERHEAD)
         .getOrElse(math.max((conf.get(MEMORY_OVERHEAD_FACTOR) * driverMemoryMiB).toInt,
           MEMORY_OVERHEAD_MIN_MIB))
    +  // TODO: Have memory limit checks on driverMemory
    --- End diff --
    
    Valid point. These are not necessary


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214550394
  
    --- Diff: examples/src/main/python/worker_memory_check.py ---
    @@ -0,0 +1,47 @@
    +#
    --- End diff --
    
    I think the concern here is shipping a test as an example - this is the place where dev will be looking for example on how to use pyspark and having a memory test there is a bit strange.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    **[Test build #95564 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95564/testReport)** for PR 22298 at commit [`467703b`](https://github.com/apache/spark/commit/467703bfb8bde570b5b0cad11904cb92641dcd84).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    **[Test build #95813 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95813/testReport)** for PR 22298 at commit [`ea25b8b`](https://github.com/apache/spark/commit/ea25b8b14013d17fbafe5c338cca0eb28a490482).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    **[Test build #95686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95686/testReport)** for PR 22298 at commit [`7dc26ce`](https://github.com/apache/spark/commit/7dc26ce7a06715a90950e57ee5519979fef4fc27).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    https://issues.apache.org/jira/browse/SPARK-25291 looks like a real issue with the way the tests are written. So we don't necessarily want to ignore it for this patch, but we're still thinking about it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95686/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r216112853
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/bindings/JavaDriverFeatureStep.scala ---
    @@ -38,7 +39,8 @@ private[spark] class JavaDriverFeatureStep(
           .build()
         SparkPod(pod.pod, withDriverArgs)
       }
    -  override def getAdditionalPodSystemProperties(): Map[String, String] = Map.empty
    +  override def getAdditionalPodSystemProperties(): Map[String, String] =
    --- End diff --
    
    filed SPARK-25373 - Support mixed language pipelines on Spark on K8s


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214411462
  
    --- Diff: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/python/Dockerfile ---
    @@ -33,6 +32,7 @@ RUN apk add --no-cache python && \
         # Removed the .cache to save space
         rm -r /root/.cache
     
    +COPY python/lib ${SPARK_HOME}/python/lib
    --- End diff --
    
    Same as the R change above ^^


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214409455
  
    --- Diff: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PythonTestsSuite.scala ---
    @@ -72,12 +72,33 @@ private[spark] trait PythonTestsSuite { k8sSuite: KubernetesSuite =>
           isJVM = false,
           pyFiles = Some(PYSPARK_CONTAINER_TESTS))
       }
    +
    +  test("Run PySpark with memory customization", k8sTestTag) {
    +    sparkAppConf
    +      .set("spark.kubernetes.container.image", s"${getTestImageRepo}/spark-py:${getTestImageTag}")
    +      .set("spark.kubernetes.pyspark.pythonVersion", "3")
    +      .set("spark.kubernetes.memoryOverheadFactor", s"$memOverheadConstant")
    --- End diff --
    
    Do we expect people who configure the rlimit advanced feature to also set the memoryOverheadConstant to a different value? If so we should call it out in the docs. (note: I think it would make sense for folks to set this to a lower value so I _think_ this would be the expected behaviour and we should document but open to suggestions)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test status success
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2724/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214410560
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ---
    @@ -225,6 +225,13 @@ private[spark] object Config extends Logging {
             "Ensure that major Python version is either Python2 or Python3")
           .createWithDefault("2")
     
    +  val APP_RESOURCE_TYPE =
    --- End diff --
    
    The reason for this is because we are already running binding steps that configure the driver based on the app resource. I thought it might as well pass the config down into the executors upon doing that binding bootstrap step. 
    
    Currently, we don't have any docker files that handle mixed pipelines so such configurations should be addressed in a followup-PR, imo. But I am open to suggestions (that are testable). 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2751/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    **[Test build #95570 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95570/testReport)** for PR 22298 at commit [`3ad1324`](https://github.com/apache/spark/commit/3ad13242f0f5c5d34cda8ca5d0e0ad0b1bdbcead).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214491727
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala ---
    @@ -51,6 +51,7 @@ private[spark] class BasicDriverFeatureStep(
         .get(DRIVER_MEMORY_OVERHEAD)
         .getOrElse(math.max((conf.get(MEMORY_OVERHEAD_FACTOR) * driverMemoryMiB).toInt,
           MEMORY_OVERHEAD_MIN_MIB))
    +  // TODO: Have memory limit checks on driverMemory
    --- End diff --
    
    Hm can you elaborate here? We already set the driver memory limit in this step based on the overhead.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Looks fine to me, but I'm not familiar enough with the K8S code to have much of an opinion.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    **[Test build #95686 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95686/testReport)** for PR 22298 at commit [`7dc26ce`](https://github.com/apache/spark/commit/7dc26ce7a06715a90950e57ee5519979fef4fc27).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    LGTM pending jenkins, see previous comments for details.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Merged to master (e.g. 3). It's not a bug fix but I _think_ we should consider this for backport to 2.4 since it's arguably the second half of a feature that's in 2.4, but it's doesn't backport cleanly as is so maybe another PR just for the 2.4 branch.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214407357
  
    --- Diff: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/R/Dockerfile ---
    @@ -19,10 +19,10 @@ ARG base_img
     FROM $base_img
     WORKDIR /
     RUN mkdir ${SPARK_HOME}/R
    -COPY R ${SPARK_HOME}/R
     
     RUN apk add --no-cache R R-dev
     
    +COPY R ${SPARK_HOME}/R
    --- End diff --
    
    Is this change intentional?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214405499
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala ---
    @@ -58,6 +58,15 @@ private[spark] class BasicExecutorFeatureStep(
           (kubernetesConf.get(MEMORY_OVERHEAD_FACTOR) * executorMemoryMiB).toInt,
           MEMORY_OVERHEAD_MIN_MIB))
       private val executorMemoryWithOverhead = executorMemoryMiB + memoryOverheadMiB
    +  // TODO: Have memory limit checks on executorMemory
    +  private val executorMemoryTotal = kubernetesConf.sparkConf
    +    .getOption(APP_RESOURCE_TYPE.key).map{ res =>
    +      val additionalPySparkMemory = if (res == "python") {
    --- End diff --
    
    So this means that we couldn't turn this on in a mixed language pipeline even if the pipeline author did some fun hacks (since we might need to support Scala/Python/R all at the same time). Is there a reason we did this instead of the YARN approach of isPython ?
    
    Also I'd write this as a case statement instead perhaps style wise, what do you think?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r216113001
  
    --- Diff: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PythonTestsSuite.scala ---
    @@ -72,12 +74,33 @@ private[spark] trait PythonTestsSuite { k8sSuite: KubernetesSuite =>
           isJVM = false,
           pyFiles = Some(PYSPARK_CONTAINER_TESTS))
       }
    +
    +  test("Run PySpark with memory customization", k8sTestTag) {
    +    sparkAppConf
    +      .set("spark.kubernetes.container.image", pySparkDockerImage)
    +      .set("spark.kubernetes.pyspark.pythonVersion", "3")
    +      .set("spark.kubernetes.memoryOverheadFactor", s"$memOverheadConstant")
    +      .set("spark.executor.pyspark.memory", s"${additionalMemory}m")
    +      .set("spark.python.worker.reuse", "false")
    --- End diff --
    
    I don't believe this should be set. Worker reuse is on by default in most systems so not sure if this test depends on worker reuse being false. As per @rdblue's investigation this shouldn't impact this code path (and if it does we need to re-open that investigation).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test starting
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2724/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214543937
  
    --- Diff: examples/src/main/python/worker_memory_check.py ---
    @@ -0,0 +1,47 @@
    +#
    --- End diff --
    
    That might be a good place for it. But the reason for this being in examples is that the integration tests can access this locally and we can see the success in the Jenkins environment. The K8s integration test suite does not run python run-tests.py which means that this test would therefore not be part of the PRB. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    **[Test build #95809 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95809/testReport)** for PR 22298 at commit [`fe8cc5a`](https://github.com/apache/spark/commit/fe8cc5aa6759cdf893e11c3d83814f8dffddce9c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22298
  
    Kubernetes integration test status success
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2937/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214407128
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/bindings/RDriverFeatureStep.scala ---
    @@ -53,7 +54,8 @@ private[spark] class RDriverFeatureStep(
     
         SparkPod(pod.pod, withRPrimaryContainer)
       }
    -  override def getAdditionalPodSystemProperties(): Map[String, String] = Map.empty
    +  override def getAdditionalPodSystemProperties(): Map[String, String] =
    --- End diff --
    
    Same


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214411255
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala ---
    @@ -58,6 +58,15 @@ private[spark] class BasicExecutorFeatureStep(
           (kubernetesConf.get(MEMORY_OVERHEAD_FACTOR) * executorMemoryMiB).toInt,
           MEMORY_OVERHEAD_MIN_MIB))
       private val executorMemoryWithOverhead = executorMemoryMiB + memoryOverheadMiB
    +  // TODO: Have memory limit checks on executorMemory
    +  private val executorMemoryTotal = kubernetesConf.sparkConf
    +    .getOption(APP_RESOURCE_TYPE.key).map{ res =>
    +      val additionalPySparkMemory = if (res == "python") {
    --- End diff --
    
    Well, if there is a mixed pipeline, the binding steps would need to be reconfigured to include mixed pipelines. But currently, language is only determined based on the `appResource`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214404764
  
    --- Diff: examples/src/main/python/worker_memory_check.py ---
    @@ -0,0 +1,47 @@
    +#
    --- End diff --
    
    Is examples the right place for this?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.mem...

Posted by ifilonenko <gi...@git.apache.org>.
Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22298#discussion_r214243652
  
    --- Diff: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/SecretsTestsSuite.scala ---
    @@ -53,6 +53,7 @@ private[spark] trait SecretsTestsSuite { k8sSuite: KubernetesSuite =>
           .delete()
       }
     
    +  // TODO: [SPARK-25291] This test is flaky with regards to memory of executors
    --- End diff --
    
    @mccheah This test periodically fails on setting proper memory for executors on this specific test. I have filed a JIRA: SPARK-25291


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org