Posted to reviews@spark.apache.org by andrewor14 <gi...@git.apache.org> on 2014/08/21 00:14:54 UTC

[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

GitHub user andrewor14 opened a pull request:

    https://github.com/apache/spark/pull/2067

    [SPARK-3140] Clarify confusing PySpark exception message

    We read the py4j port from the stdout of the `bin/spark-submit` subprocess. If there is interference in stdout (e.g. a random echo in `spark-submit`), we throw an exception with a warning message. We do not, however, distinguish this case from the case where no stdout is produced at all.
    
    I wasted a non-trivial amount of time being baffled by this exception, searching (in vain, of course) for places where I print random whitespace. A clearer exception message that distinguishes between these two cases will spare others the headaches I have gone through.
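
A minimal sketch of the two failure cases described above, assuming a Python 3 interpreter. The helper name `read_port` and the exception type are hypothetical; this is not the code in `python/pyspark/java_gateway.py`, and any command that prints to stdout can stand in for `bin/spark-submit`.

```python
import subprocess
import sys


def read_port(command):
    """Launch `command` and parse the first line of its stdout as a port.

    Hypothetical helper for illustration only. Returns (port, process).
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, universal_newlines=True)
    first_line = proc.stdout.readline()
    try:
        port = int(first_line)
    except ValueError:
        # Grab the remaining lines of stdout and wait for the child to exit.
        rest, _ = proc.communicate()
        if first_line == "" and rest == "":
            # Case 1: the child produced no stdout at all.
            raise RuntimeError("expected a port on stdout, but found no output")
        # Case 2: something other than a port came first.
        raise RuntimeError(
            "expected a port on stdout, but found unexpected output:\n"
            + first_line + rest
        )
    return port, proc
```

For instance, `read_port([sys.executable, "-c", "print('Hello'); print(56306)"])` raises with the stray `Hello` line included in the message, while a child that prints nothing raises the "no output" variant.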

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewor14/spark python-exception

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2067.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2067
    
----
commit e96a7a0016502820f325c39d71b236b2b39e0cb6
Author: Andrew Or <an...@gmail.com>
Date:   2014-08-20T22:09:33Z

    Distinguish between unexpected output and no output at all

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2067#issuecomment-52852691
  
    @kayousterhout @JoshRosen 




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2067#issuecomment-52861567
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19008/consoleFull) for   PR 2067 at commit [`742f823`](https://github.com/apache/spark/commit/742f82311b1733f735fb5c2369aea7d1ca1d8774).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `  shift # Ignore main class (org.apache.spark.deploy.SparkSubmit) and use our own`
      * `case class SparkListenerTaskStart(stageId: Int, stageAttemptId: Int, taskInfo: TaskInfo)`





[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2067#discussion_r16511511
  
    --- Diff: python/pyspark/java_gateway.py ---
    @@ -54,12 +54,18 @@ def preexec_func():
                 gateway_port = proc.stdout.readline()
                 gateway_port = int(gateway_port)
             except ValueError:
    +            # Grab the remaining lines of stdout
                 (stdout, _) = proc.communicate()
                 exit_code = proc.poll()
                 error_msg = "Launching GatewayServer failed"
                 error_msg += " with exit code %d! " % exit_code if exit_code else "! "
    --- End diff --
    
    Yeah, I misread it too the first time after reading your comment :)




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2067#discussion_r16511219
  
    --- Diff: python/pyspark/java_gateway.py ---
    @@ -54,12 +54,18 @@ def preexec_func():
                 gateway_port = proc.stdout.readline()
                 gateway_port = int(gateway_port)
             except ValueError:
    +            # Grab the remaining lines of stdout
                 (stdout, _) = proc.communicate()
                 exit_code = proc.poll()
                 error_msg = "Launching GatewayServer failed"
                 error_msg += " with exit code %d! " % exit_code if exit_code else "! "
    --- End diff --
    
    wait, when there's no exit code it looks like this, no?
    ```
    Launching GatewayServer failed! (Warning: ...)
    ```




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2067#issuecomment-52857751
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19008/consoleFull) for   PR 2067 at commit [`742f823`](https://github.com/apache/spark/commit/742f82311b1733f735fb5c2369aea7d1ca1d8774).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2067#discussion_r16511443
  
    --- Diff: python/pyspark/java_gateway.py ---
    @@ -54,12 +54,18 @@ def preexec_func():
                 gateway_port = proc.stdout.readline()
                 gateway_port = int(gateway_port)
             except ValueError:
    +            # Grab the remaining lines of stdout
                 (stdout, _) = proc.communicate()
                 exit_code = proc.poll()
                 error_msg = "Launching GatewayServer failed"
                 error_msg += " with exit code %d! " % exit_code if exit_code else "! "
    --- End diff --
    
    OH sorry misread this -- thought the "if" was for what to substitute. NVM.
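
The line under discussion relies on operator precedence: `%` binds more tightly than a conditional expression, so the `if`/`else` selects between two complete strings rather than between two substitution values. A small sketch illustrating this; the function name `failure_prefix` is hypothetical.

```python
def failure_prefix(exit_code):
    """Build the start of the error message for a given exit code (or None)."""
    msg = "Launching GatewayServer failed"
    # Parsed as: (" with exit code %d! " % exit_code) if exit_code else "! "
    # so exactly one "!" is appended in either branch.
    msg += " with exit code %d! " % exit_code if exit_code else "! "
    return msg


print(repr(failure_prefix(1)))     # exit code present
print(repr(failure_prefix(None)))  # process has not reported an exit code
```

Note that an exit code of `0` is falsy and therefore takes the same branch as `None`. Wrapping the formatted branch in parentheses would not change behavior, but it would make the grouping explicit for readers.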




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2067#discussion_r16509829
  
    --- Diff: python/pyspark/java_gateway.py ---
    @@ -54,12 +54,18 @@ def preexec_func():
                 gateway_port = proc.stdout.readline()
                 gateway_port = int(gateway_port)
             except ValueError:
    +            # Grab the remaining lines of stdout
                 (stdout, _) = proc.communicate()
                 exit_code = proc.poll()
                 error_msg = "Launching GatewayServer failed"
                 error_msg += " with exit code %d! " % exit_code if exit_code else "! "
    -            error_msg += "(Warning: unexpected output detected.)\n\n"
    -            error_msg += gateway_port + stdout
    +            if gateway_port == "" and stdout == "":
    +                error_msg += "(Warning: no output detected.)\n"
    +            else:
    +                error_msg += "(Warning: unexpected output detected.)\n\n"
    +                error_msg += "--------------------------------------------------------------\n"
    +                error_msg += gateway_port + stdout
    --- End diff --
    
    Should this be more descriptive? Like "Expected GatewayServer to output a port; found: <actual>" (otherwise I wonder if printing gateway_port + stdout will seem cryptic to people)




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on the pull request:

    https://github.com/apache/spark/pull/2067#issuecomment-52857293
  
    Cool this looks great!




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2067#issuecomment-52857369
  
    This looks great.  Surprised Jenkins hasn't chimed in to test this, but I don't think it would (intentionally) exercise this code path anyways, so I think it's safe to merge this since you've tested it locally.




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on the pull request:

    https://github.com/apache/spark/pull/2067#issuecomment-52857443
  
    Given the number of accidental-build-breaks that have happened recently, I think it would be good to let Jenkins have at this unless there's a rush to get it in




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on the pull request:

    https://github.com/apache/spark/pull/2067#issuecomment-52853426
  
    This is great...there's been quite a lot of pain associated with these lines of code and I think this will help a lot.




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2067#issuecomment-52857112
  
    This is what it looks like with extra output:
    ```
    Exception: Launching GatewayServer failed!
    Warning: Expected GatewayServer to output a port, but found the following output:
    
    --------------------------------------------------------------
    Hello
    Second line
    Third line
    56306
    --------------------------------------------------------------
    ```
    
    and no output:
    ```
    Exception: Launching GatewayServer failed!
    Warning: Expected GatewayServer to output a port, but found no output.
    ```
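
A sketch of how messages in the format shown above could be assembled. This is a reconstruction from the sample output only, not the merged code: the function name `gateway_error_message`, its parameters, and the separator width are assumptions.

```python
# Width chosen to approximate the rule in the sample output (an assumption).
SEPARATOR = "-" * 62


def gateway_error_message(first_line, rest, exit_code=None):
    """Format the failure message from the first stdout line and the remainder."""
    msg = "Launching GatewayServer failed"
    msg += " with exit code %d!\n" % exit_code if exit_code else "!\n"
    msg += "Warning: Expected GatewayServer to output a port, but found "
    if first_line == "" and rest == "":
        msg += "no output.\n"
    else:
        msg += "the following output:\n\n"
        msg += SEPARATOR + "\n"
        msg += first_line + rest
        msg += SEPARATOR + "\n"
    return msg


print(gateway_error_message("Hello\n", "Second line\nThird line\n56306\n"))
print(gateway_error_message("", ""))
```

Framing the captured output between two rules makes it clear where the subprocess output starts and ends, which matters when that output is only whitespace.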




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2067#discussion_r16509728
  
    --- Diff: python/pyspark/java_gateway.py ---
    @@ -54,12 +54,18 @@ def preexec_func():
                 gateway_port = proc.stdout.readline()
                 gateway_port = int(gateway_port)
             except ValueError:
    +            # Grab the remaining lines of stdout
                 (stdout, _) = proc.communicate()
                 exit_code = proc.poll()
                 error_msg = "Launching GatewayServer failed"
                 error_msg += " with exit code %d! " % exit_code if exit_code else "! "
    --- End diff --
    
    unrelated to your change but can you fix this too while you're at it? Looks like there's an extra "!" when there's no exit code




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on the pull request:

    https://github.com/apache/spark/pull/2067#issuecomment-52857461
  
    Jenkins, test this please




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2067#issuecomment-52862414
  
    Thanks, merged into master and 1.1




[GitHub] spark pull request: [SPARK-3140] Clarify confusing PySpark excepti...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2067

