You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by JoshRosen <gi...@git.apache.org> on 2015/04/05 23:07:10 UTC

[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/5372

    [SPARK-6716] Change SparkContext.DRIVER_IDENTIFIER from <driver> to driver

    Currently, the driver's executorId is set to `<driver>`. This choice of ID was present in older Spark versions, but it has started to cause problems now that executorIds are used in more contexts, such as Ganglia metric names or driver thread-dump links the web UI. The angle brackets must be escaped when embedding this ID in XML or as part of URLs and this has led to multiple problems:
    
    - https://issues.apache.org/jira/browse/SPARK-6484
    - https://issues.apache.org/jira/browse/SPARK-4313
    
    The simplest solution seems to be to change this id to something that does not contain any special characters, such as `driver`.
    
    I'm not sure whether we can perform this change in a patch release, since this ID may be considered a stable API by metrics users, but it's probably okay to do this in a major release as long as we document it in the release notes.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark driver-id-fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5372.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5372
    
----
commit 7ff12e049377f0bcfe6491a33eea2f020cde77d9
Author: Josh Rosen <jo...@databricks.com>
Date:   2015-04-05T20:56:34Z

    Change SparkContext.DRIVER_IDENTIFIER from <driver> to driver

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/5372


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-89852637
  
    /cc @marmbrus @jerryshao 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-90034656
  
    Left a very minor comment but otherwise LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-90169557
  
      [Test build #29742 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29742/consoleFull) for   PR 5372 at commit [`42d3c10`](https://github.com/apache/spark/commit/42d3c10beffdeb637de4a8f77fbd1a7fc0473d9b).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-89889413
  
      [Test build #29734 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29734/consoleFull) for   PR 5372 at commit [`0c5d04b`](https://github.com/apache/spark/commit/0c5d04bf0b261f4de3cec69cfb06a5289bc6f17f).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5372#discussion_r27794465
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1900,7 +1900,17 @@ object SparkContext extends Logging {
     
       private[spark] val SPARK_JOB_INTERRUPT_ON_CANCEL = "spark.job.interruptOnCancel"
     
    -  private[spark] val DRIVER_IDENTIFIER = "<driver>"
    +  /**
    +   * Executor id for the driver.  In earlier versions of Spark, this was `<driver>`, but this was
    +   * changed to `driver` because the angle brackets caused escaping issues in URLs and XML (see
    +   * SPARK-6716 for more details).
    +   */
    +  private[spark] val DRIVER_IDENTIFIER = "driver"
    +
    +  /**
    +   * Legacy version of DRIVER_IDENTIFIER, retained for use in backwards-compatibility tests.
    --- End diff --
    
    This could be interpreted as implying it is only needed for tests, however, this is actually legitimately needed for reading old event logs. It might make more sense to just say it's needed for backwards compatibility.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-89869007
  
    Do we need to modify `isDriver` to accept the old version of the identifier as well? For instance, if a newer version of the History server is playing logs from an older version of Spark, where the string "<driver>" is coded in the event data for the addition of an executor.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-89853278
  
      [Test build #29732 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29732/consoleFull) for   PR 5372 at commit [`7ff12e0`](https://github.com/apache/spark/commit/7ff12e049377f0bcfe6491a33eea2f020cde77d9).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-89889430
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29734/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-90394132
  
    Good call on the original comment wording being potentially confusing.  I've updated this as suggested.
    
    I'm going to merge this into `master` (1.4.0).  Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-89874050
  
      [Test build #29734 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29734/consoleFull) for   PR 5372 at commit [`0c5d04b`](https://github.com/apache/spark/commit/0c5d04bf0b261f4de3cec69cfb06a5289bc6f17f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-89866536
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29732/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-89866535
  
      [Test build #29732 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29732/consoleFull) for   PR 5372 at commit [`7ff12e0`](https://github.com/apache/spark/commit/7ff12e049377f0bcfe6491a33eea2f020cde77d9).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-89869800
  
    Good call; I'll update `BlockManagerId.isDriver` and add a test.  Note that we don't need to update `SparkContext.getExecutorThreadDump` since thread-dumps are disabled in the HistoryServer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-90128978
  
      [Test build #29742 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29742/consoleFull) for   PR 5372 at commit [`42d3c10`](https://github.com/apache/spark/commit/42d3c10beffdeb637de4a8f77fbd1a7fc0473d9b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6716] Change SparkContext.DRIVER_IDENTI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5372#issuecomment-90169590
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29742/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org