Posted to dev@spark.apache.org by jyotiska <gi...@git.apache.org> on 2014/02/11 15:19:22 UTC
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
GitHub user jyotiska opened a pull request:
https://github.com/apache/incubator-spark/pull/581
Added extra description on ValueError when one Spark context already exists
I added extra description to the ValueError message raised when a Spark context already exists. The message now includes the current master name and the app name for easier understanding.
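As a rough illustration of the kind of message described above, here is a minimal sketch. `DemoContext` is a hypothetical stand-in, not the real `SparkContext`, and the message wording is an assumption rather than the text used in the patch.

```python
class DemoContext(object):
    """Hypothetical stand-in for SparkContext; not the real PySpark class."""

    _active = None  # the single context allowed per process

    def __init__(self, master, app_name):
        existing = DemoContext._active
        if existing is not None:
            # Name the context that is already running so the user can
            # tell which one is blocking the new construction.
            raise ValueError(
                "Cannot run multiple contexts at once; existing context "
                "(app=%s, master=%s)" % (existing.app_name, existing.master))
        self.master = master
        self.app_name = app_name
        DemoContext._active = self


first = DemoContext("local", "first-app")
try:
    DemoContext("local", "second-app")
except ValueError as e:
    message = str(e)
    print(message)
```

The error names the first context rather than the one being constructed, since that is the one the user needs to find.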
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/incubator-spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-spark/pull/581.patch
----
commit 945e39a5d68daa7e5bab0d96cbd35d7c4b04eafb
Author: jyotiska <jy...@gmail.com>
Date: 2014-02-08T07:29:09Z
Added example python code for sort
commit 6f98f1e313f4472a7c2207d36c4f0fbcebc95a8c
Author: jyotiska <jy...@gmail.com>
Date: 2014-02-08T07:42:37Z
Updated python example code sort.py
commit 8ad8faf6c8e02ae1cd68565d98524edf165f54df
Author: jyotiska <jy...@gmail.com>
Date: 2014-02-09T05:30:41Z
Added comments in code on collect() method
commit 92e23fea707ed6de551dc8d5ffa9b4f987683628
Author: jyotiska <jy...@gmail.com>
Date: 2014-02-11T13:40:15Z
Added extra description on ValueError when one Spark context already running
commit d90bea59f15738a00e03c9761ea27157ad2ef04d
Author: jyotiska <jy...@gmail.com>
Date: 2014-02-11T13:55:12Z
Merge remote-tracking branch 'upstream/master'
----
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35123081
All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12721/
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35126548
@JoshRosen, can you review this and merge? This closes SPARK-1087 along with SPARK-972.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35120729
Wait, what happened to the foreachPartition, keys and repartition methods? Did I just accidentally delete them, or have they been discontinued?
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35026000
Merged build triggered.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-34796972
I understand. Maybe we can use a global variable to store the call site (line number?) and print it back along with the address. What do you think?
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-34799232
You may be able to re-use the [`_extract_concise_traceback`](https://github.com/apache/incubator-spark/pull/311/files#diff-d6fe2792e44f6babc94aabfefc8b9bceR43) method in `rdd.py`.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
GitHub user jyotiska reopened a pull request:
https://github.com/apache/incubator-spark/pull/581
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35116880
One or more automated tests failed
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12720/
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35120691
Merged build triggered.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35124160
Merged build triggered.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35054785
Jenkins, test this please.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-34782466
This tries to solve [SPARK-972](https://spark-project.atlassian.net/browse/SPARK-972)
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35123080
Merged build finished.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-34758328
Can one of the admins verify this patch?
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35058781
@JoshRosen, I think when the traceback function doesn't return the callsite info, it simply returns "I'm lost!". Handled that case in this commit. Can you ask Jenkins to test this?
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-34804481
I have used the _extract_concise_traceback to print the callsite of existing SparkContext. I guess this is what you had in mind.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35116879
Merged build finished.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35056797
Merged build started.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-34850215
@JoshRosen
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35126312
All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12723/
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-34798126
Got it. I will work on it and report back to you. I guess the traceback module will be useful here.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35120692
Merged build started.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35227176
@JoshRosen any updates on this?
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-34795439
In [SPARK-972](https://spark-project.atlassian.net/browse/SPARK-972), my intent was to log the _call site_ of the original SparkContext construction to help figure out when/where it was created. Logging the address that the original SparkContext connected to is an improvement over what we have now, but it doesn't help to debug cases where some complex user-written initialization code resulted in the creation of multiple SparkContexts connected to the same master. [SPARK-991](https://spark-project.atlassian.net/browse/SPARK-991), which appears to have been resolved in #311, lays the groundwork for recording call sites.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35117763
I may have given some bad suggestions earlier about using `_extract_concise_traceback`; I think you may just want to use the regular traceback module and use the attribute that drops the frame that calls the traceback (so you get the frame of the call into `SparkContext.__init__`).
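One way to read this suggestion, sketched with the standard `traceback` module only. The helper name `caller_of_current_function` and the `DemoContext` class are hypothetical, and picking the frame by index is an assumption about how the frames would be dropped, not necessarily what the patch ended up doing.

```python
import traceback


def caller_of_current_function():
    """Return (filename, lineno, function) for the caller of whoever called us."""
    # extract_stack() lists frames oldest first. The last entry is this
    # helper, the one before it is the function that asked, and the one
    # before that is the code that called into that function.
    stack = traceback.extract_stack()
    if len(stack) < 3:
        return None
    filename, lineno, function, _text = stack[-3]
    return filename, lineno, function


class DemoContext(object):
    """Hypothetical stand-in for SparkContext."""

    def __init__(self):
        self.call_site = caller_of_current_function()


def build_context():
    return DemoContext()  # this line is the recorded call site


ctx = build_context()
print(ctx.call_site)
```

Because the entries are already structured, no string parsing of a formatted traceback is needed.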
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-34836413
I think for future improvements, it would be a good idea to move the traceback code to a separate file and use that to return callsite. Please start a JIRA ticket and assign it to me. I will submit a PR once I am done.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35114290
Merged build started.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35026001
Merged build started.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska closed the pull request at:
https://github.com/apache/incubator-spark/pull/581
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35028747
Merged build finished.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35117581
@JoshRosen, I think the problem is with file names that have spaces in them. In that case, the line split does not work and returns the wrong value. Let's try moving the traceback method out of rdd and putting it in a separate file. Also, I will add a separate function that returns the callsite info as a dict.
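To make the failure mode concrete, here is a small sketch. It assumes the call site is being recovered by splitting a formatted traceback line on whitespace; the exact parsing in `rdd.py` may differ. Both function names and the sample path are hypothetical.

```python
import re

# Matches the standard 'File "...", line N, in name' traceback layout.
FRAME_LINE = re.compile(
    r'File "(?P<file>.+)", line (?P<line>\d+), in (?P<function>.+)')


def callsite_from_split(text):
    """Fragile: assumes the file name contains no whitespace."""
    parts = text.split()
    return {"file": parts[1].strip('",'), "line": parts[3].strip(",")}


def callsite_from_regex(text):
    """Return the call site as a dict, tolerating spaces in the path."""
    match = FRAME_LINE.search(text)
    if match is None:
        return None
    info = match.groupdict()
    info["line"] = int(info["line"])
    return info


sample = '  File "/home/user/my project/job.py", line 12, in main'
naive = callsite_from_split(sample)
robust = callsite_from_regex(sample)
print(naive)
print(robust)
```

With the sample path, the whitespace split cuts the file name at the space and picks up the word `line` where the line number was expected, while the pattern-based version returns the full path, the integer line number, and the function name.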
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-34961462
Can one of the admins test and merge this?
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35124161
Merged build started.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35028748
One or more automated tests failed
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12710/
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35126310
Merged build finished.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35114289
Merged build triggered.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35056776
Jenkins, test this please.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35056796
Merged build triggered.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35054777
I think this is good to go. Can I ask Jenkins to test this?
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35025658
Jenkins, test this please.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-34797494
I'd store the constructor's call site (filename + line number) as an instance variable inside SparkContext, since we already have the `SparkContext._active_spark_context` global to retrieve the running context.
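A minimal sketch of that idea, using a hypothetical `DemoContext` class in place of `SparkContext` and a class attribute `_active_context` playing the role of `SparkContext._active_spark_context`. The use of `sys._getframe` (a CPython-specific helper) and the message text are assumptions for illustration only.

```python
import sys


class DemoContext(object):
    """Hypothetical stand-in for SparkContext, for illustration only."""

    _active_context = None  # class-level reference to the running context

    def __init__(self, master):
        if DemoContext._active_context is not None:
            # Report where the already-running context was constructed.
            filename, lineno = DemoContext._active_context.call_site
            raise ValueError(
                "A context is already running; it was created at %s:%d"
                % (filename, lineno))
        # Frame 1 is the caller of this constructor.
        caller = sys._getframe(1)
        self.master = master
        self.call_site = (caller.f_code.co_filename, caller.f_lineno)
        DemoContext._active_context = self


ctx = DemoContext("local")
try:
    DemoContext("local")
except ValueError as e:
    error_text = str(e)
    print(error_text)
```

Keeping the call site on the instance means no extra global is needed: the second constructor call reaches the first context through the class-level reference and reads its stored location.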
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35058208
Merged build finished.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by jyotiska <gi...@git.apache.org>.
Github user jyotiska commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35111896
Jenkins, test this please.
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35058209
One or more automated tests failed
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12712/
[GitHub] incubator-spark pull request: Added extra description on ValueErro...
Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:
https://github.com/apache/incubator-spark/pull/581#issuecomment-35114200
Jenkins, this is ok to test.