Posted to reviews@spark.apache.org by witgo <gi...@git.apache.org> on 2014/03/19 15:38:54 UTC

[GitHub] spark pull request: Fix Stage.name return "apply at Option.scala:1...

GitHub user witgo opened a pull request:

    https://github.com/apache/spark/pull/180

    Fix Stage.name return "apply at Option.scala:120"

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SPARK-1280

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/180.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #180
    
----
commit a0dc74d05160061c45002386dffe0abab2d5f3dc
Author: witgo <wi...@qq.com>
Date:   2014-03-19T14:37:53Z

    Fix Stage.name return "apply at Option.scala:120"

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38534642
  
    I just submitted https://github.com/apache/spark/pull/222 which adds onto this PR with the solution Patrick mentioned. Let me know what you think.



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38311357
  
    Merged build finished.



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38481065
  
    Interesting, take(1) does not create a stage for me since it completes locally (so it doesn't show up in the UI). I tried running it manually with runJob and allowLocal = false, but the issue did not show up, nor when I tried a collect.



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38418878
  
    Thanks for this patch. Would you mind providing an example stack trace where this helped? I want to get a better sense of the issue to see if this is specific to Option or part of a more sinister problem.



[GitHub] spark pull request: Fix Stage.name return "apply at Option.scala:1...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38060981
  
    Can one of the admins verify this patch?



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38481594
  
    Ah, I see, when I ran the collect it was on an old branch, and I didn't re-run it after updating. Thanks.



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38524035
  
    I think the stack trace looks like this:
    
    The call in the UI: [StageTable.scala#L79-79](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ui/jobs/StageTable.scala#L79-79)
    
    Creating the StageInfo:
    [DAGScheduler.scala#L242-242](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L242-242) => [StageInfo.scala#L39-39](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/StageInfo.scala#L39-39) => [Stage.scala#L103-103](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/Stage.scala#L103-103) => [RDD.scala#L1044-1044](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L1044-1044)



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38301708
  
    Jenkins, this is ok to test.



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38311358
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13323/



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38422713
  
    	val pairs = sc.parallelize(Array((1, 1), (1, 2), (1, 3), (2, 1)))
    	pairs.take(1)
    
    At http://host:4040/stages/, the Description column of the Completed Stages table shows "apply at Option.scala:120".
    
    Option.scala:120 is:
    
        @inline final def getOrElse[B >: A](default: => B): B =
          if (isEmpty) default else this.get
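
The `default: => B` parameter quoted above is a by-name parameter, so the fallback expression runs from inside `getOrElse` and only when the Option is empty. The following is a minimal sketch of that behaviour in plain Scala; `ByNameDemo`, `fallback`, and `evaluations` are hypothetical names used for illustration and are not part of Spark.

```scala
object ByNameDemo {
  // Counts how many times the fallback expression is actually evaluated.
  var evaluations = 0

  // Hypothetical fallback computation; increments the counter when it runs.
  def fallback(): String = {
    evaluations += 1
    "fallback"
  }

  def main(args: Array[String]): Unit = {
    val present: Option[String] = Some("explicit")
    val absent: Option[String] = None

    // Non-empty Option: the by-name argument is never evaluated.
    println(present.getOrElse(fallback()))        // prints "explicit"
    println(s"evaluations so far: $evaluations")  // prints "evaluations so far: 0"

    // Empty Option: getOrElse evaluates the argument from inside its own body.
    println(absent.getOrElse(fallback()))         // prints "fallback"
    println(s"evaluations so far: $evaluations")  // prints "evaluations so far: 1"
  }
}
```

Because the fallback is evaluated inside `getOrElse`, any code in it that inspects the current stack will see a `scala.Option` frame between itself and the method that called `getOrElse`.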



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38417874
  
    Who can merge this improvement for the web UI?
    @aarondav 



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38481293
  
    Any other operation should work. I have been able to reproduce this with just a simple parallelize followed by a collect.



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38502892
  
    Ya - sorry my bad. I think a good fix here is to just do this:
    ```
    val defaultCallSite = Utils.formatCallSiteInfo()
    Option(getLocalProperty("externalCallSite")).getOrElse(defaultCallSite)
    ```



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38491639
  
    Here's what the stack trace looks like:
    ```
    java.lang.RuntimeException
    	at org.apache.spark.util.Utils$.getCallSiteInfo(Utils.scala:687)
    	at org.apache.spark.util.Utils$.formatCallSiteInfo$default$1(Utils.scala:723)
    	at org.apache.spark.SparkContext$$anonfun$getCallSite$1.apply(SparkContext.scala:880)
    	at org.apache.spark.SparkContext$$anonfun$getCallSite$1.apply(SparkContext.scala:880)
    	at scala.Option.getOrElse(Option.scala:120)
    	at org.apache.spark.SparkContext.getCallSite(SparkContext.scala:880)
    	at org.apache.spark.SparkContext.runJob(SparkContext.scala:898)
    	at org.apache.spark.SparkContext.runJob(SparkContext.scala:920)
    	at org.apache.spark.SparkContext.runJob(SparkContext.scala:934)
    	at org.apache.spark.SparkContext.runJob(SparkContext.scala:948)
    	at org.apache.spark.rdd.RDD.collect(RDD.scala:657)
            ...
    	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:981)
    	at org.apache.spark.repl.Main$.main(Main.scala:31)
    	at org.apache.spark.repl.Main.main(Main.scala)
    ```
    
    It appears this was accidentally introduced when [SparkContext:880](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L880) was changed to use the Option pattern. I've talked with Patrick (who made that change) and we came to the conclusion that the better solution for now would be to pull out the getCallSiteInfo from the Option and pass it into formatCallSiteInfo.
    
    Perhaps we could also add a unit test to make sure this doesn't happen in the future if someone uses some Scala operation on the code path leading up to the getCallSiteInfo.
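
The effect described in this comment can be reproduced outside Spark. The following is a minimal sketch, not Spark's implementation: `CallSiteDemo`, `firstOutsideFrame`, `lazyLookup`, `eagerLookup`, and `UserCode` are hypothetical names, and the "library" whose frames are skipped is simply the demo object itself. It contrasts looking up the call site inside the by-name argument of `getOrElse` with looking it up first and passing in the result.

```scala
object CallSiteDemo {
  // Hypothetical stand-in for a call-site lookup: return the class name of the
  // first stack frame that does not belong to this "library" object.
  def firstOutsideFrame(): String =
    new Throwable().getStackTrace
      .find(frame => !frame.getClassName.contains("CallSiteDemo"))
      .map(_.getClassName)
      .getOrElse("<unknown>")

  // The lookup runs inside the by-name argument, underneath Option.getOrElse,
  // so a scala.Option frame sits between the lookup and the real caller.
  def lazyLookup(external: Option[String]): String =
    external.getOrElse(firstOutsideFrame())

  // The lookup runs first, directly in this method; getOrElse only chooses
  // between two values that have already been computed.
  def eagerLookup(external: Option[String]): String = {
    val default = firstOutsideFrame()
    external.getOrElse(default)
  }
}

object UserCode {
  def main(args: Array[String]): Unit = {
    println(CallSiteDemo.lazyLookup(None))
    println(CallSiteDemo.eagerLookup(None))
  }
}
```

Assuming a default compiler setup in which `getOrElse` is not inlined at compile time, the first lookup should report `scala.Option` and the second should report the calling code. Assertions along these lines could form the basis of the regression test suggested above, at the cost of computing the default even when an explicit value is present.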



[GitHub] spark pull request: Fix Stage.name return "apply at Option.scala:1...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38083772
  
    Jenkins, this is ok to test.



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by witgo <gi...@git.apache.org>.
Github user witgo closed the pull request at:

    https://github.com/apache/spark/pull/180



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38305324
  
    Merged build started.



[GitHub] spark pull request: Fix SPARK-1280: Stage.name return "apply at Op...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/180#issuecomment-38305323
  
    Merged build triggered.

