You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by brndnmtthws <gi...@git.apache.org> on 2014/09/15 23:36:50 UTC

[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

GitHub user brndnmtthws opened a pull request:

    https://github.com/apache/spark/pull/2401

    [SPARK-3535][Mesos] Add 15% task memory overhead.

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/brndnmtthws/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2401.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2401
    
----
commit 6532d47521ac4df918a563fc1946d69deb8fa94a
Author: Brenden Matthews <br...@diddyinc.com>
Date:   2014-09-15T21:34:42Z

    [SPARK-3535][Mesos] Add 15% task memory overhead.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2401


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55921158
  
    When you need to provide guarantees for other services, it's better to stick to hard limits.  Having other tasks get randomly OOM killed is a bad experience.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55821735
  
    YARN has a similar strategy, from what I can tell.  There's a separate config value, `spark.yarn.executor.memoryOverhead`.
    
    We could go the other way, where it subtracts the overhead from the heap.  To me it's all the same.
    
    If someone is filling the whole OS with heap, they're still likely to OOM, but it may not happen as quickly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56120329
  
    Updated as per @andrewor14's suggestions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56120347
  
    Forgot to mention: I also set the executor CPUs correctly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57703146
  
    Updated to match YARN.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56125354
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20564/consoleFull) for   PR 2401 at commit [`d51b74f`](https://github.com/apache/spark/commit/d51b74f26b67ff3719b7ba3182d0693e45301a81).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56122505
  
    @willb in the case of Yarn, the parameter is called "overhead" because it actually sets the amount of overhead you want to add to the requested heap memory. The PR being referenced just uses a multiplier to set the default overhead value; the user-visible parameter doesn't set the multiplier.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57676227
  
    This code has been tested & verified on a cluster of this size:
    
    ![screen shot 2014-10-02 at 11 13 57 am](https://cloud.githubusercontent.com/assets/3129093/4495884/f3bba99e-4a5f-11e4-982b-838fd2836074.png)



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2401#discussion_r18122508
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MemoryUtils.scala ---
    @@ -0,0 +1,34 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import org.apache.spark.SparkContext
    +
    +object MemoryUtils {
    +  val OVERHEAD_FRACTION = 0.15
    +  val OVERHEAD_MINIMUM = 384
    +
    +  private[spark] def calculateTotalMemory(sc: SparkContext) = {
    +    Math.max(
    +      sc.conf.getOption("spark.mesos.executor.memoryOverhead")
    +        .getOrElse(OVERHEAD_FRACTION.toString)
    +        .toDouble * 1.0 * sc.executorMemory,
    --- End diff --
    
    Ah yes good catch.  I'll update the PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57814725
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21249/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56119651
  
    @brndnmtthws Yes it would be good to keep the approach used in #1391 in mind. It would be ideal if we could use an approach as similar as what we currently (or plan to) do for yarn as possible, including the naming and the semantics of the configs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57842699
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21258/consoleFull) for   PR 2401 at commit [`4abaa5d`](https://github.com/apache/spark/commit/4abaa5d442d0c4224f8455df397c660a9c09441f).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55821489
  
    Hey will this have compatbility issues for existing deployments? I know many clusters where they just have Spark request the entire amount of memory on the node. With this, if a user upgrades their jobs could just starve. What if instead we just "scale down" the size of the executor based on what the user requests. I.e. if they request 20GB executors we reserve a few GB for this overhead. @andrewor14 how does this work in YARN? It might be good to have similar semantics to what they have there.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55662990
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by ash211 <gi...@git.apache.org>.
Github user ash211 commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57760828
  
    This looks very reasonable.  Counting the executor's bookkeeping core against the resources also seems much more correct than pretending it doesn't exist like before.
    
    Regarding the fraction-vs-absolute debate: if down the line users find setting the overhead with absolute numbers to be suboptimal, we can always add a new parameter named `spark.mesos.executor.memoryOverheadPercent` that defaults to 7 and can be changed to 15 or higher as needed.
    
    Considering Brenden's test on the cluster and experience with Mesos this seems good to merge pending extremely minor typos


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55827491
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20427/consoleFull) for   PR 2401 at commit [`8cbde19`](https://github.com/apache/spark/commit/8cbde19ece318fbd93bc7e729eb3ce1235377991).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57015165
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20869/consoleFull) for   PR 2401 at commit [`908d90a`](https://github.com/apache/spark/commit/908d90aeda8592e464e602141ff53f56accde6aa).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2401#discussion_r17762547
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CommonProps.scala ---
    @@ -0,0 +1,27 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import org.apache.spark.SparkContext
    +
    +object CommonProps {
    +  def memoryOverheadFraction(sc: SparkContext) =
    --- End diff --
    
    Should we just make this `getExecutorMemory` or something? Then this can factor in the overhead part under the scene.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57822162
  
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56123833
  
    I can emulate the YARN behaviour, but it seems better to just do the same thing with both Mesos and YARN.  Thoughts?  I can refactor this (including the YARN code) to make it consistent.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by willb <gi...@git.apache.org>.
Github user willb commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55902476
  
    Here's a somewhat-related concern:  it seems like JVM overhead is unlikely to scale linearly with requested heap size, so maybe a straight-up 15% isn't a great default?  (If you have hard data on how heap requirements grow with job size, I'd be interested in seeing it.)  Perhaps it would make more sense to reserve whichever is smaller of 15% or some fixed but reasonable amount.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55814086
  
    ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57015171
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20869/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57697590
  
    I guess it depends on what "--executor-memory" means on Mesos. On Yarn, it means the "-Xmx" value for the JVM. So it doesn't include the memory used by any python processes included in the process group. So `--executor-memory + overheadMemory` should be able to cover the JVM heap + JVM overhead + RSS of all the python processes. At that point, a fraction doesn't make sense, because, as I mentioned above, the memory of the python processes is unrelated to the JVM heap size.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57699365
  
    The Yarn patch sets a default value based on the JVM heap. That works well for a set of workloads (basically, Java/Scala apps). But the setting itself is an absolute value; so if you have a pyspark app, or you have a Java/Scala app that for some reason uses gobs of out-of-heap memory, you just set the absolute value, you don't need to do funny math.
    
    In your patch the setting itself is a fraction (from the messages so far - sorry I haven't read the latest version of the code). So if you set the executor memory to "4g", and your app for some reason uses 30g of out-of-heap memory, you have to set the multiplier to like 8 or something, instead of just saying "I need 30g of out-of-heap memory". You're forcing a relationship between two values that are not necessarily related to each other.
    
    I think the absolute value approach used by Yarn is more user friendly. But I'm not trying to convince you, I'm just explaining why Yarn does it differently and why I think it makes better sense. 
    
    Note #2485 is not *adding* the setting; the setting already exists. It's just calculating a sane(r) value for the setting based on other settings. But the setting still works the same as it did before.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56120575
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20564/consoleFull) for   PR 2401 at commit [`d51b74f`](https://github.com/apache/spark/commit/d51b74f26b67ff3719b7ba3182d0693e45301a81).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57831156
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21256/Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55827647
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20428/consoleFull) for   PR 2401 at commit [`298ca44`](https://github.com/apache/spark/commit/298ca445f77d16f969bfdfc6ea46842e11efa7db).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57001557
  
    That code is still YARN specific.  Shouldn't we have common code for this?
    
    Also, I disagree on the 7% overhead.  I think 15% is a better default.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by erikerlandson <gi...@git.apache.org>.
Github user erikerlandson commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55920577
  
    Applying soft memory limits via cgroups is one option, for environments supporting cgroups.  Reserve a certain amount of memory, and if it is exceeded then swapping starts to happen instead of hard failure


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57697287
  
    Mesos will indeed kill your contains as well (provided cgroup limits are enabled).  I also don't see how this would necessarily apply differently for Python, since it applies to the executor memory (and not strictly the heap size).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56216904
  
    I thought there was some desire to have the same thing also #1391?
    
    Furthermore, from my experience writing frameworks, I think a much better model is the fractional overhead (relative to the heap size), for the reasons I mentioned above.  If you do some internet searching, you'll see that I've been doing quite a bit of this for a while.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55820092
  
    Updated diff based on coding.
    
    I've also added the same thing to the coarse scheduler, and updated the documentation accordingly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57045812
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20914/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by ash211 <gi...@git.apache.org>.
Github user ash211 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2401#discussion_r18122496
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MemoryUtils.scala ---
    @@ -0,0 +1,34 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import org.apache.spark.SparkContext
    +
    +object MemoryUtils {
    +  val OVERHEAD_FRACTION = 0.15
    +  val OVERHEAD_MINIMUM = 384
    +
    +  private[spark] def calculateTotalMemory(sc: SparkContext) = {
    +    Math.max(
    +      sc.conf.getOption("spark.mesos.executor.memoryOverhead")
    +        .getOrElse(OVERHEAD_FRACTION.toString)
    +        .toDouble * 1.0 * sc.executorMemory,
    --- End diff --
    
    Should this be + 1.0 instead of * 1.0?  As written the *1.0 is essentially a no-op


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2401#discussion_r17762446
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CommonProps.scala ---
    @@ -0,0 +1,27 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import org.apache.spark.SparkContext
    +
    +object CommonProps {
    +  def memoryOverheadFraction(sc: SparkContext) =
    +    sc.conf.getOption("spark.executor.memory.overhead.fraction")
    +      .getOrElse("1.15")
    --- End diff --
    
    This isn't actually a fraction. Perhaps what you meant is `multiplier` or something.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2401#discussion_r17763516
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CommonProps.scala ---
    @@ -0,0 +1,27 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import org.apache.spark.SparkContext
    +
    +object CommonProps {
    +  def memoryOverheadFraction(sc: SparkContext) =
    +    sc.conf.getOption("spark.executor.memory.overhead.fraction")
    +      .getOrElse("1.15")
    --- End diff --
    
    I was trying to keep the code looking clean, but I'll change `fraction` to `multiplier` in the method, if that seems more sensible.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57842713
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21258/Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57833282
  
    Done as per @andrewor14's suggestions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57700183
  
    It's counterintuitive to the policy used elsewhere, but if it will appease the Spark folks, I will make the change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55903894
  
    Oh, and one more thing you may want to think about is the OS filesystem buffers.  Again, as you scale up the heap, you may want to proportionally reserve a slice of off-heap memory just for the OS.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56207061
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20574/consoleFull) for   PR 2401 at commit [`56988e3`](https://github.com/apache/spark/commit/56988e31363bc07dc8acb369bdaade6b18b98f51).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by willb <gi...@git.apache.org>.
Github user willb commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56121260
  
    @andrewor14 This bothers me too, but in a slightly different way:  calling the parameter “overhead” when it really refers to how to scale requested memory to accommodate anticipated overhead seems wrong.  (115% overhead would be appalling indeed!)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57696587
  
    As expressed elsewhere, I think the model that the #2485 has doesn't make sense.  The most important knob is the overhead fraction, rather than the minimum number of bytes.  This comes from having a lot of operational experience with Mesos frameworks, especially those running things on the JVM.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57807681
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21249/consoleFull) for   PR 2401 at commit [`54a4a06`](https://github.com/apache/spark/commit/54a4a069f2a981b5691cff4cf76a35d43202e872).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56123690
  
    Yes @willb I have the same concern. I think in other contexts "overhead" refers to the additional space, not the total space, so an "overhead fraction" of 0.15 means we use extra space that is equal to 15% of the original memory. It will be confusing if we expose a "fraction" config that has a value of 1.15.
    
    However, even with the latest changes this is still inconsistent with what we do for yarn. I think there is value in having an equivalent config for Mesos rather than one that is semantically different.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57814884
  
    That test failure appears to be unrelated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56216747
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20574/consoleFull) for   PR 2401 at commit [`56988e3`](https://github.com/apache/spark/commit/56988e31363bc07dc8acb369bdaade6b18b98f51).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `sealed trait Matrix extends Serializable `
      * `class SparseMatrix(`
      * `sealed trait Vector extends Serializable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57698498
  
    The definition is indeed the same, but I don't see how the YARN patch solves that better than this one.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57001147
  
    Hey @brndnmtthws have you looked at the latest changes in #2485? I think that one's basically ready to be merged, and the discussion there is that we want to avoid exposing the fraction config (and keep the absolute one) so we don't expose too many configs to the user. Would you mind updating this PR to reflect the behavior for Yarn in Mesos? We don't have to use the same 7% as suggested there, but it would be good to make a comment on why a particular value for this fraction is used.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55820562
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20427/consoleFull) for   PR 2401 at commit [`8cbde19`](https://github.com/apache/spark/commit/8cbde19ece318fbd93bc7e729eb3ce1235377991).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57830054
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21254/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57032214
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20883/consoleFull) for   PR 2401 at commit [`b391649`](https://github.com/apache/spark/commit/b3916499ee22ea0341c6981ab1c71cd24c8e4e17).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57032218
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20883/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57826312
  
    Hey @brndnmtthws I left a few more minor comments but I think this is basically ready for merge. Thanks for your changes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57717079
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21221/consoleFull) for   PR 2401 at commit [`747b490`](https://github.com/apache/spark/commit/747b490f958bd397e950a7df1646e416b3154335).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57694715
  
    @brndnmtthws Thanks for adapting this to what Yarn does. One concern I have now though is that `spark.yarn.executor.memoryOverhead` and `spark.mesos.executor.memoryOverhead` mean different things. The former refers to the raw memory in MB (384 MB), but the latter refers to a fraction of the executor memory (15%). I think this will be confusing to the user.
    
    Any idea on what we should do here @pwendell @tgravescs?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57008399
  
    I've updated the PR to match closer to what #2485 does.
    
    I'd like to keep the fractional param.  Having successfully operated various services on Mesos in production, I think this is the most sensible way to do it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57008686
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20869/consoleFull) for   PR 2401 at commit [`908d90a`](https://github.com/apache/spark/commit/908d90aeda8592e464e602141ff53f56accde6aa).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57026573
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20883/consoleFull) for   PR 2401 at commit [`b391649`](https://github.com/apache/spark/commit/b3916499ee22ea0341c6981ab1c71cd24c8e4e17).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55814658
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20421/consoleFull) for   PR 2401 at commit [`6532d47`](https://github.com/apache/spark/commit/6532d47521ac4df918a563fc1946d69deb8fa94a).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57833806
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21258/consoleFull) for   PR 2401 at commit [`4abaa5d`](https://github.com/apache/spark/commit/4abaa5d442d0c4224f8455df397c660a9c09441f).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55903580
  
    That implies that as you grow the heap, you're not adding threads (or other things that use off-heap memory).  I'm not familiar with Spark's execution model, but I would assume that as you scale up, the workers will need more threads to serve cache requests and what have you.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57023278
  
    Build error appears to be unrelated to my patch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by ash211 <gi...@git.apache.org>.
Github user ash211 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2401#discussion_r18382851
  
    --- Diff: docs/configuration.md ---
    @@ -253,6 +253,17 @@ Apart from these, the following properties are also available, and may be useful
         <code>spark.executor.uri</code>.
       </td>
     </tr>
    +<tr>
    +  <td><code>spark.mesos.executor.memoryOverhead</code></td>
    +  <td>384</td>
    +  <td>
    +    This value is an additive for <code>spark.executor.memory</code>, specified in MiB,
    +    which is used to calculate the total Mesos task memory. A value of <code>348</code>
    --- End diff --
    
    Typo in the docs here: 348 -> 384.  And also below, additionaly -> additionally


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2401#discussion_r17763087
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CommonProps.scala ---
    @@ -0,0 +1,27 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import org.apache.spark.SparkContext
    +
    +object CommonProps {
    +  def memoryOverheadFraction(sc: SparkContext) =
    +    sc.conf.getOption("spark.executor.memory.overhead.fraction")
    +      .getOrElse("1.15")
    --- End diff --
    
    Because it's >0? It could default to 0.15, if that's the semantics you prefer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55821123
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20428/consoleFull) for   PR 2401 at commit [`298ca44`](https://github.com/apache/spark/commit/298ca445f77d16f969bfdfc6ea46842e11efa7db).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57703942
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21221/consoleFull) for   PR 2401 at commit [`747b490`](https://github.com/apache/spark/commit/747b490f958bd397e950a7df1646e416b3154335).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57717093
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21221/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57697092
  
    A fraction wouldn't work well for pyspark on Yarn, because the memory used by the python processes is not a fraction of the JVM heap. I don't know about Mesos, but Yarn will kill your containers if they go over the allocated memory, so that's an actual issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2401#discussion_r17763285
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CommonProps.scala ---
    @@ -0,0 +1,27 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import org.apache.spark.SparkContext
    +
    +object CommonProps {
    +  def memoryOverheadFraction(sc: SparkContext) =
    +    sc.conf.getOption("spark.executor.memory.overhead.fraction")
    +      .getOrElse("1.15")
    --- End diff --
    
    In my mind a fraction is a value between 0 and 1, but 1.15 is not in this range.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57814712
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21249/consoleFull) for   PR 2401 at commit [`54a4a06`](https://github.com/apache/spark/commit/54a4a069f2a981b5691cff4cf76a35d43202e872).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55814769
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20421/consoleFull) for   PR 2401 at commit [`6532d47`](https://github.com/apache/spark/commit/6532d47521ac4df918a563fc1946d69deb8fa94a).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57847794
  
    Ok, LGTM I'm merging this into master and 1.1. Thanks @brndnmtthws.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-55922361
  
    Might want to take a look at #1391 also; that adds a similar thing for Yarn (a multiplier for the JVM overhead), it might be worth to merge the two code paths (or add this to some utility class) so that everybody benefits.
    
    Another thing to maybe keep in mind is Patrick's comment in SPARK-1930 about pyspark requiring even more "overhead" to be assigned.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2401#discussion_r17762483
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CommonProps.scala ---
    @@ -0,0 +1,27 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import org.apache.spark.SparkContext
    +
    +object CommonProps {
    --- End diff --
    
    This needs to be `private[spark]`. We shouldn't expect applications to be able to access this. Also, can we just call this `MesosUtils` or something to be more consistent with other classes we have? (e.g. `SparkHadoopUtils`, `KafkaUtils`)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57822909
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21254/consoleFull) for   PR 2401 at commit [`54a4a06`](https://github.com/apache/spark/commit/54a4a069f2a981b5691cff4cf76a35d43202e872).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57007834
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20868/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56216271
  
    So, I'm a little disappointed that this doesn't at least follow the Yarn model of "one setting that defines the overhead". Instead, it has two settings, one for a fraction and one to define some minimum if the fraction is somehow less than that. That sounds too complicated.
    
    What's the argument against Yarn's model of a single setting with an absolute overhead value? That doesn't require the user to do math, and makes things easier when for some reason the user requires lots of overhead (e.g. large usage of off-heap memory) that is not necessarily related to the heap size.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Fix resource handling.

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-57806924
  
    Fixed typo's (also switch from `Math.max` to `math.max` because that seems to be the Scala way).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...

Posted by brndnmtthws <gi...@git.apache.org>.
Github user brndnmtthws commented on the pull request:

    https://github.com/apache/spark/pull/2401#issuecomment-56126568
  
    I've cleaned up the patch again.
    
    I spent about an hour trying to apply this to the YARN code, but it was pretty difficult to follow so I gave up.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org