You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by willb <gi...@git.apache.org> on 2014/05/09 23:18:10 UTC

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

GitHub user willb opened a pull request:

    https://github.com/apache/spark/pull/717

    SPARK-571: forbid return statements in cleaned closures

    This patch checks top-level closure arguments to `ClosureCleaner.clean` for `return` statements and raises an exception if it finds any.  This is mainly a user-friendliness addition, since programs with return statements in closure arguments will currently fail upon RDD actions with a less-than-intuitive error message.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/willb/spark spark-571

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/717.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #717
    
----
commit b017c4779a0bfc54f215bc1135d9d23ee9761c20
Author: William Benton <wi...@redhat.com>
Date:   2014-05-09T20:05:49Z

    Added a test for SPARK-571

commit 295b6a5960fd9a4509e64b225b0e4b479704978b
Author: William Benton <wi...@redhat.com>
Date:   2014-05-09T21:14:37Z

    Forbid return statements in closure arguments.
    
    This commit ensures that ClosureCleaner.clean will identify
    toplevel return statements in closures and raise an exception
    if they are encountered.
    
    This is intended to fix SPARK-571.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42911882
  
    Merged build started. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42913622
  
    Merged build finished. All automated tests passed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/717#discussion_r12497542
  
    --- Diff: core/src/test/scala/org/apache/spark/util/ClosureCleanerSuite.scala ---
    @@ -21,6 +21,7 @@ import org.scalatest.FunSuite
     
     import org.apache.spark.LocalSparkContext._
     import org.apache.spark.SparkContext
    +import org.apache.spark.SparkException
    --- End diff --
    
    nit: group this with line above (as in the other file)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-43003979
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14946/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42715751
  
    Merged build finished. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42911361
  
    I left couple comments on style. Thanks for leading this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42999578
  
     Merged build triggered. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/717#discussion_r12497536
  
    --- Diff: core/src/test/scala/org/apache/spark/util/ClosureCleanerSuite.scala ---
    @@ -108,6 +117,20 @@ class TestClassWithoutFieldAccess {
       }
     }
     
    +object TestObjectWithBogusReturns {
    +  def badClosureWithReturn(v: org.apache.spark.rdd.RDD[Int]): Int = {
    --- End diff --
    
    nit: This function is pretty simple. Mind moving it into `run()` directly?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42730477
  
    @willb the re-partition issue is not related to your patch. There was an inadvertent check-in in master that caused it. It has been fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/717


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42911307
  
    Ah never mind my quesiton. "return" exits the outer level function instead of the closure.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-43010073
  
    BTW I am merging this only into master (not branch-1.0) since it is too late for changes into Spark core for 1.0 unless they are critical bug fixes. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42719784
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14859/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42717201
  
     Merged build triggered. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by willb <gi...@git.apache.org>.
Github user willb commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-43009698
  
    Thanks, @rxin!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42715752
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14858/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42910436
  
    Why do closures with return fail? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42719783
  
    Merged build finished. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42760036
  
     Merged build triggered. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42715653
  
    Merged build started. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/717#discussion_r12564569
  
    --- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
    @@ -181,6 +184,26 @@ private[spark] object ClosureCleaner extends Logging {
     }
     
     private[spark]
    +class ReturnStatementFinder extends ClassVisitor(ASM4) {
    +  override def visitMethod(access: Int, name: String, desc: String,
    +      sig: String, exceptions: Array[String]): MethodVisitor = {
    +    if(name.contains("apply")) {
    --- End diff --
    
    add a space after if (also for the if below)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42717217
  
    Merged build started. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by willb <gi...@git.apache.org>.
Github user willb commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42760025
  
    Thanks for the clarification, @pwendell.
    
    @andrewor14, I've grouped my imports and collapsed that function.  Thanks again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by willb <gi...@git.apache.org>.
Github user willb commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-43000971
  
    @rxin, the latest commit adds such a test.
    
    I'll note that the code I have only finds toplevel `return` statements, so it won't spuriously reject code like that example, but it also won't reject code like this, which is an invalid use of `return`:
    
    ```scala
    nums.map {x => 
      // this return is invalid since it will transfer control outside the closure
      val foo = {y: Int => return 2; 1 }
      foo(x)
    }
    ```
    
    I thought identifying toplevel `return` statements in closures represented an acceptable tradeoff between complexity and user-friendliness in order to provide better feedback in the common case (but not necessarily to exhaustively flag all bogus closures, no matter how pathological).  If this isn't the right tradeoff, I could adapt the code to reject a higher percentage of closures with nonlocal returns.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-43003978
  
    Merged build finished. All automated tests passed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42913623
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14932/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42760037
  
    Merged build started. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42911723
  
     Merged build triggered. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by willb <gi...@git.apache.org>.
Github user willb commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42730126
  
    Thanks for the quick feedback, @andrewor14!  I'll make those changes.  I'll also look in to see if I can figure out why that repartition test is failing; oddly, it's working for me locally.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-43008847
  
    I think the current approach is fine. I'm going to merge this. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42999605
  
    Merged build started. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42992463
  
    @willb can you also add a test case where an inline function is using return statement in its closure to make sure that is allowed? e.g.
    
    ```scala
    sc.parallelize(1 to 100).map { i =>
      def foo(): Int = { return 1; 2 }
      foo()
    }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/717#discussion_r12564572
  
    --- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
    @@ -181,6 +184,26 @@ private[spark] object ClosureCleaner extends Logging {
     }
     
     private[spark]
    +class ReturnStatementFinder extends ClassVisitor(ASM4) {
    +  override def visitMethod(access: Int, name: String, desc: String,
    +      sig: String, exceptions: Array[String]): MethodVisitor = {
    +    if(name.contains("apply")) {
    +      new MethodVisitor(ASM4) {
    +        override def visitTypeInsn(op: Int, tp: String) {
    +          if(op == NEW && tp.contains("scala/runtime/NonLocalReturnControl")) {
    +            throw new SparkException("Return statements aren't allowed in Spark closures")
    +          }
    +        }
    +      }
    +    } else {
    +      new MethodVisitor(ASM4) {}
    +    }
    +  }
    +}
    +
    --- End diff --
    
    just leave one line here for spacing


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by willb <gi...@git.apache.org>.
Github user willb commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42911366
  
    @rxin, yes!  (I was firing up a debugger in case you wanted to know exactly where it was returning to.)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: SPARK-571: forbid return statements in cleaned...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/717#issuecomment-42715644
  
     Merged build triggered. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---