You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by bbejeck <gi...@git.apache.org> on 2014/09/01 13:57:06 UTC

[GitHub] spark pull request: [CORE] SPARK-3178 setting SPARK_WORKER_MEMORY ...

GitHub user bbejeck opened a pull request:

    https://github.com/apache/spark/pull/2227

    [CORE] SPARK-3178 setting SPARK_WORKER_MEMORY to a value without a label (m or g) sets the worker memory limit to zero

    Now the worker will fail fast if the memory is set to zero by leaving off the the label (m or g) either in the SPARK_WORKER_MEMORY environment variable or from the command line. 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bbejeck/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2227.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2227
    
----
commit 8abf665afef9b76e9886e72e596e687bd7516475
Author: Bill Bejeck <bb...@gmail.com>
Date:   2014-09-01T02:54:29Z

    SPARK-3178 - Validate the memory is greater than zero when set from the SPARK_WORKER_MEMORY environment variable or command line without a g or m label.  Added unit tests. If memory is 0 an IllegalStateException is thrown.

commit e9bcf2cf6fe5b3aa1649c80c82f8ff240662f413
Author: Bill Bejeck <bb...@gmail.com>
Date:   2014-09-01T02:57:56Z

    Merge remote-tracking branch 'upstream/master'
    Merging updates from Spark upstream on 8/31/14

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by bbejeck <gi...@git.apache.org>.
Github user bbejeck commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54637031
  
    Did any of the admin had chance to check it out? Let me know if you want me to modify anything in it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54852535
  
    For those watching this PR, the new one is at #2309.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54717377
  
    If you want to change the branch used for the pr you will need to close and open a new one
    
    Sent from my phone
    
    > On Sep 5, 2014, at 7:24 PM, Bill Bejeck <no...@github.com> wrote:
    > 
    > I've made the changes for mocking out the environment variable in the test. I'd like to move the code from forked master branch to a topic branch. If I rebase and move the code to a topic branch will the PR be automatically updated or will I have to close the PR, do the rebase and re-submit?
    > 
    > —
    > Reply to this email directly or view it on GitHub.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54655129
  
    Feels to me like it would be better to fix this in `Utils.memoryStringToMb`. That way all code using it benefits.
    
    As for the behavior of that method, maybe it should throw an exception if there is no suffix and the value is < 1MB?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by bbejeck <gi...@git.apache.org>.
Github user bbejeck commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54699846
  
    I've made the changes for mocking out the environment variable in the test.  I'd like to move the code from forked master branch to a topic branch.  If I rebase and move the code to a topic branch will the PR be automatically updated or will I have to close the PR, do the rebase and re-submit?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by bbejeck <gi...@git.apache.org>.
Github user bbejeck commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54662763
  
    > Feels to me like it would be better to fix this in Utils.memoryStringToMb. That way all code using it benefits.
    
    I thought the same thing, but I was not sure about making a change that would be cross-cutting, so I confined my change to the WorkerArguments class


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by bbejeck <gi...@git.apache.org>.
Github user bbejeck closed the pull request at:

    https://github.com/apache/spark/pull/2227


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [CORE] SPARK-3178 setting SPARK_WORKER_MEMORY ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54051575
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2227#discussion_r17185052
  
    --- Diff: core/src/test/scala/org/apache/spark/deploy/worker/WorkerArgumentsTest.scala ---
    @@ -0,0 +1,60 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +
    +package org.apache.spark.deploy.worker
    +
    +import org.apache.spark.SparkConf
    +import org.scalatest.FunSuite
    +
    +
    +class WorkerArgumentsTest extends FunSuite {
    +
    +  test("Memory can't be set to 0 when cmd line args leave off M or G") {
    +    val conf = new SparkConf
    +    val args = Array("-m", "10000", "spark://localhost:0000  ")
    +    intercept[IllegalStateException] {
    +      new WorkerArguments(args, conf)
    +    }
    +  }
    +
    +
    +/* For this test an environment property for SPARK_WORKER_MEMORY was set
    --- End diff --
    
    Oh, to be more specific: you'll have to change the code that reads the environment variable to use `SparkConf.getEnv` instead of `System.getEnv`; I only changed this for the environment variables used in my specific test because I didn't want to make a big cross-cutting change across the codebase (plus it would probably get broken by subsequent PRs; we should add a style checker rule that complains about System.getEnv uses if we plan on doing this change globally).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54694421
  
    Jenkins, this is ok to test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2227#discussion_r17184945
  
    --- Diff: core/src/test/scala/org/apache/spark/deploy/worker/WorkerArgumentsTest.scala ---
    @@ -0,0 +1,60 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +
    +package org.apache.spark.deploy.worker
    +
    +import org.apache.spark.SparkConf
    +import org.scalatest.FunSuite
    +
    +
    +class WorkerArgumentsTest extends FunSuite {
    +
    +  test("Memory can't be set to 0 when cmd line args leave off M or G") {
    +    val conf = new SparkConf
    +    val args = Array("-m", "10000", "spark://localhost:0000  ")
    +    intercept[IllegalStateException] {
    +      new WorkerArguments(args, conf)
    +    }
    +  }
    +
    +
    +/* For this test an environment property for SPARK_WORKER_MEMORY was set
    --- End diff --
    
    In #2002, I added a mechanism that allows environment variables to be mocked in tests.  Take a look at that PR, `SparkConf.getEnv` in particular.  By using a custom SparkConf subclass, you can mock environment variables on a per-test basis: https://github.com/apache/spark/pull/2002/files#diff-e9fb6be5f96766cce96c4d60aea2fc59R45
    
    If we find ourselves doing this in multiple places (my PR, here, ...) it might be nice to add some test helper classes for doing this more generically.  That refactoring can happen in a separate PR, though, so for now it's probably fine to just copy my code snippet here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54654828
  
    Jenkins, this is ok to test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54694353
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by bbejeck <gi...@git.apache.org>.
Github user bbejeck commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54732756
  
    Closing PR to move to change branch, will open new PR


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

Posted by bbejeck <gi...@git.apache.org>.
Github user bbejeck commented on the pull request:

    https://github.com/apache/spark/pull/2227#issuecomment-54662304
  
    Josh, 
    
    Thanks for the heads up on testing with environment variables. I will look at the PR and make the required changes to the test.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org