Posted to dev@spark.apache.org by jyotiska <gi...@git.apache.org> on 2014/02/11 15:19:22 UTC

[GitHub] incubator-spark pull request: Added extra description on ValueErro...

GitHub user jyotiska opened a pull request:

    https://github.com/apache/incubator-spark/pull/581

    Added extra description on ValueError when one Spark context already exists

    I added extra description to the ValueError message raised when a Spark context already exists. The message now includes the current master name and the app name for easier understanding.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-spark/pull/581.patch

----
commit 945e39a5d68daa7e5bab0d96cbd35d7c4b04eafb
Author: jyotiska <jy...@gmail.com>
Date:   2014-02-08T07:29:09Z

    Added example python code for sort

commit 6f98f1e313f4472a7c2207d36c4f0fbcebc95a8c
Author: jyotiska <jy...@gmail.com>
Date:   2014-02-08T07:42:37Z

    Updated python example code sort.py

commit 8ad8faf6c8e02ae1cd68565d98524edf165f54df
Author: jyotiska <jy...@gmail.com>
Date:   2014-02-09T05:30:41Z

    Added comments in code on collect() method

commit 92e23fea707ed6de551dc8d5ffa9b4f987683628
Author: jyotiska <jy...@gmail.com>
Date:   2014-02-11T13:40:15Z

    Added extra description on ValueError when one Spark context already running

commit d90bea59f15738a00e03c9761ea27157ad2ef04d
Author: jyotiska <jy...@gmail.com>
Date:   2014-02-11T13:55:12Z

    Merge remote-tracking branch 'upstream/master'

----


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35123081
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12721/


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35126548
  
    @JoshRosen, can you review this and merge? This closes SPARK-1087 along with SPARK-972.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35120729
  
    Wait, what happened to the foreachPartition, keys, and repartition methods? Did I accidentally delete them, or have they been discontinued?


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35026000
  
     Merged build triggered.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-34796972
  
    I understand. Maybe we can use a global variable to store the call site (line number?) and print it back along with the address. What do you think?


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-34799232
  
    You may be able to re-use the [`_extract_concise_traceback`](https://github.com/apache/incubator-spark/pull/311/files#diff-d6fe2792e44f6babc94aabfefc8b9bceR43) method in `rdd.py`.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
GitHub user jyotiska reopened a pull request:

    https://github.com/apache/incubator-spark/pull/581

    Added extra description on ValueError when one Spark context already exists

    I added extra description to the ValueError message raised when a Spark context already exists. The message now includes the current master name and the app name for easier understanding.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35116880
  
    One or more automated tests failed
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12720/


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35120691
  
     Merged build triggered.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35124160
  
     Merged build triggered.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35054785
  
    Jenkins, test this please.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-34782466
  
    This tries to solve [SPARK-972](https://spark-project.atlassian.net/browse/SPARK-972)


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35123080
  
    Merged build finished.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-34758328
  
    Can one of the admins verify this patch?


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35058781
  
    @JoshRosen, I think when the traceback function doesn't return the callsite info, it simply returns "I'm lost!". Handled that case in this commit. Can you ask Jenkins to test this?


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-34804481
  
    I have used `_extract_concise_traceback` to print the call site of the existing SparkContext. I guess this is what you had in mind.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35116879
  
    Merged build finished.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35056797
  
    Merged build started.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-34850215
  
    @JoshRosen 


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35126312
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12723/


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-34798126
  
    Got it. I will work on it and report back to you. I guess the traceback module will be useful here.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35120692
  
    Merged build started.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35227176
  
    @JoshRosen any updates on this?


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-34795439
  
    In [SPARK-972](https://spark-project.atlassian.net/browse/SPARK-972), my intent was to log the _call site_ of the original SparkContext construction to help figure out when/where it was created.  Logging the address that the original SparkContext connected to is an improvement over what we have now, but it doesn't help to debug cases where some complex user-written initialization code resulted in the creation of multiple SparkContexts connected to the same master.  [SPARK-991](https://spark-project.atlassian.net/browse/SPARK-991), which appears to have been resolved in #311, lays the groundwork for recording call sites.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35117763
  
    I may have given a bad suggestion earlier about using `_extract_concise_traceback`; I think you may just want to use the regular traceback module and drop the frame that calls the traceback (so you get the frame of the call into `SparkContext.__init__`).
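
    As a rough illustration of that suggestion, the sketch below uses only the standard `traceback` module. The helper name `describe_call_site` and the `Widget` class are hypothetical and are not part of PySpark; the idea is simply to skip the helper's own frame and the constructor's frame so that the reported location is the code that invoked the constructor.

```python
import traceback


def describe_call_site():
    """Return 'file:line' for the code that called our caller.

    Hypothetical helper for illustration; not part of PySpark.
    """
    stack = traceback.extract_stack()
    # stack[-1] is this helper, stack[-2] is the function that asked
    # (for example a constructor), and stack[-3] is that function's caller.
    if len(stack) < 3:
        return "unknown call site"
    filename, lineno = stack[-3][0], stack[-3][1]
    return "%s:%d" % (filename, lineno)


class Widget(object):
    """Stand-in for any class that wants to remember where it was built."""

    def __init__(self):
        self.created_at = describe_call_site()


w = Widget()
print("Widget was constructed at", w.created_at)
```

    Taking the location from the structured stack entries, rather than from a formatted string, avoids any parsing of the traceback text.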


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-34836413
  
    I think for future improvements, it would be a good idea to move the traceback code to a separate file and use that to return callsite. Please start a JIRA ticket and assign it to me. I will submit a PR once I am done.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35114290
  
    Merged build started.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35026001
  
    Merged build started.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska closed the pull request at:

    https://github.com/apache/incubator-spark/pull/581


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35028747
  
    Merged build finished.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35117581
  
    @JoshRosen, I think the problem is with file names that contain spaces. In that case, the line split does not work and returns the wrong value. Let's try moving the traceback method out of rdd and putting it in a separate file. Also, I will add a separate function that returns the call site info as a dict.
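
    To make the failure mode concrete, here is a small sketch. The summary string format and the helper name `call_site_as_dict` are assumptions made for illustration; they are not taken from the PySpark source. Splitting a formatted summary on spaces cuts a path that contains a space, while returning the fields in a dict needs no parsing at all.

```python
import traceback

# Assumed summary format, for illustration only.
summary = "collect at /home/user/my scripts/job.py:42"

# Splitting on spaces truncates the path at its first space.
naive_file = summary.split(" ")[2]
print(naive_file)  # prints /home/user/my


def call_site_as_dict():
    """Return the caller's location as a dict (hypothetical helper)."""
    # With limit=2 the list holds the caller's frame, then this function's.
    caller = traceback.extract_stack(limit=2)[0]
    return {"file": caller[0], "linenum": caller[1], "function": caller[2]}


info = call_site_as_dict()
print(info["file"], info["linenum"], info["function"])
```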


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-34961462
  
    Can one of the admins test and merge this?


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35124161
  
    Merged build started.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35028748
  
    One or more automated tests failed
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12710/


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35126310
  
    Merged build finished.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35114289
  
     Merged build triggered.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35056776
  
    Jenkins, test this please.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35056796
  
     Merged build triggered.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35054777
  
    I think this is good to go. Can I ask Jenkins to test this?


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35025658
  
    Jenkins, test this please.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-34797494
  
    I'd store the constructor's call site (filename + line number) as an instance variable inside SparkContext, since we already have the `SparkContext._active_spark_context` global to retrieve the running context.
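
    A minimal sketch of this idea, assuming a hypothetical `DemoContext` class in place of the real `SparkContext`. The attribute names `call_site_file`, `call_site_line`, and `_active_context` are illustrative and are not PySpark's. The constructor records where it was called from, and a second construction reports that location in the ValueError.

```python
import traceback


class DemoContext(object):
    """Hypothetical stand-in for SparkContext; not the real PySpark class."""

    _active_context = None  # class-level handle to the one running context

    def __init__(self, master, app_name):
        existing = DemoContext._active_context
        if existing is not None:
            raise ValueError(
                "Cannot run multiple contexts at once; existing context "
                "(app=%s, master=%s) was created at %s:%d"
                % (existing.app_name, existing.master,
                   existing.call_site_file, existing.call_site_line))
        # extract_stack(limit=2) yields the caller's frame, then this one.
        caller = traceback.extract_stack(limit=2)[0]
        self.call_site_file, self.call_site_line = caller[0], caller[1]
        self.master = master
        self.app_name = app_name
        DemoContext._active_context = self

    def stop(self):
        """Release the active slot so a new context can be created."""
        DemoContext._active_context = None


first = DemoContext("local", "first-app")
try:
    DemoContext("local", "second-app")
except ValueError as err:
    print(err)
first.stop()
```

    Keeping the call site on the instance means the error path only needs the existing global handle to find it, so no extra module-level state is introduced.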


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35058208
  
    Merged build finished.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35111896
  
    Jenkins, test this please.


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35058209
  
    One or more automated tests failed
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12712/


[GitHub] incubator-spark pull request: Added extra description on ValueErro...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/incubator-spark/pull/581#issuecomment-35114200
  
    Jenkins, this is ok to test.