Posted to reviews@spark.apache.org by holdenk <gi...@git.apache.org> on 2017/05/07 03:23:28 UTC

[GitHub] spark pull request #17885: [SPARK-20627][PYSPARK] Drop the hadoop distributi...

GitHub user holdenk opened a pull request:

    https://github.com/apache/spark/pull/17885

    [SPARK-20627][PYSPARK] Drop the hadoop distribution name from the Python version

    ## What changes were proposed in this pull request?
    
    Drop the Hadoop distribution name from the Python version (PEP 440).
    
    ## How was this patch tested?
    
    Ran `make-distribution` locally

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/holdenk/spark SPARK-20627-remove-pip-local-version-string

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17885.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17885
    
----
commit 4e30ba90a7f14627d098d676f1ee8bf02d62eb9e
Author: Holden Karau <ho...@us.ibm.com>
Date:   2017-05-07T02:40:40Z

    Drop the hadoop distribution name from the Python version packaging string

commit 99414d7ce352d7d4dd32a9ad4eda93c11d360cac
Author: Holden Karau <ho...@us.ibm.com>
Date:   2017-05-07T03:22:02Z

    Update comment since we don't have name

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #17885: [SPARK-20627][PYSPARK] Drop the hadoop distribution name...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17885
  
    Could you post the original section about `local versions should not be used when publishing up-stream`? 
    
    It sounds like PEP 440 does not encourage it. Below is what I found:
    > The inclusion of the local version label makes it possible to differentiate upstream releases from potentially altered rebuilds by downstream integrators. The use of a local version identifier does not affect the kind of a release but, when applied to a source distribution, does indicate that it may not contain the exact same code as the corresponding upstream release.
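For readers unfamiliar with the term: in PEP 440 a local version label is whatever follows a `+` in the version string. A minimal sketch of telling the two forms apart is below. The helper name `has_local_version` and the sample version strings are illustrative assumptions, not part of the patch.

```shell
# Hypothetical helper: succeeds when a version string carries a PEP 440
# local version label, i.e. it contains a "+" separator.
has_local_version() {
  case "$1" in
    *+*) return 0 ;;
    *)   return 1 ;;
  esac
}

# Sample strings chosen for illustration only.
for v in "2.2.0.dev0+hadoop2.7" "2.2.0.dev0"; do
  if has_local_version "$v"; then
    # ${v#*+} strips everything up to and including the first "+".
    echo "$v: local version label is '${v#*+}'"
  else
    echo "$v: no local version label"
  fi
done
```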




[GitHub] spark issue #17885: [SPARK-20627][PYSPARK] Drop the hadoop distribution name...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17885
  
    Could you describe the changes you made in the PR description and explain how they address PEP 440? Reading the PR description might help more people understand the impact of this PR. Thanks!



[GitHub] spark issue #17885: [SPARK-20627][PYSPARK] Drop the hadoop distribution name...

Posted by holdenk <gi...@git.apache.org>.

Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/17885
  
    Updated with more explanation of what we changed in the PR description.




[GitHub] spark pull request #17885: [SPARK-20627][PYSPARK] Drop the hadoop distributi...

Posted by holdenk <gi...@git.apache.org>.

Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17885#discussion_r115138020
  
    --- Diff: dev/create-release/release-build.sh ---
    @@ -163,9 +163,9 @@ if [[ "$1" == "package" ]]; then
         export ZINC_PORT=$ZINC_PORT
         echo "Creating distribution: $NAME ($FLAGS)"
     
    -    # Write out the NAME and VERSION to PySpark version info we rewrite the - into a . and SNAPSHOT
    -    # to dev0 to be closer to PEP440. We use the NAME as a "local version".
    -    PYSPARK_VERSION=`echo "$SPARK_VERSION+$NAME" |  sed -r "s/-/./" | sed -r "s/SNAPSHOT/dev0/"`
    +    # Write out the VERSION to PySpark version info we rewrite the - into a . and SNAPSHOT
    +    # to dev0 to be closer to PEP440.
    +    PYSPARK_VERSION=`echo "$SPARK_VERSION" |  sed -r "s/-/./" | sed -r "s/SNAPSHOT/dev0/"`
    --- End diff --
    
    So we currently only package Python for one Hadoop version. If we start doing multiple Hadoop versions for Python we can figure out how to handle that again.
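The effect of the diff can be reproduced outside the release script. The sketch below wraps the old and new pipelines in two hypothetical functions (`old_pyspark_version` and `new_pyspark_version` are names invented here). It assumes a snapshot version of `2.2.0-SNAPSHOT` and a distribution name of `hadoop2.7` purely as sample inputs, and it drops the `-r` flag because neither substitution needs extended regular expressions.

```shell
# Hypothetical wrappers around the two pipelines shown in the diff.
# Both rewrite the first "-" into "." and "SNAPSHOT" into "dev0".

old_pyspark_version() {  # $1 = SPARK_VERSION, $2 = NAME
  echo "$1+$2" | sed -e "s/-/./" -e "s/SNAPSHOT/dev0/"
}

new_pyspark_version() {  # $1 = SPARK_VERSION
  echo "$1" | sed -e "s/-/./" -e "s/SNAPSHOT/dev0/"
}

# Sample inputs, assumed for illustration.
old_pyspark_version "2.2.0-SNAPSHOT" "hadoop2.7"   # prints 2.2.0.dev0+hadoop2.7
new_pyspark_version "2.2.0-SNAPSHOT"               # prints 2.2.0.dev0
```

With the change applied, the version string no longer carries a `+NAME` local version label.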



[GitHub] spark issue #17885: [SPARK-20627][PYSPARK] Drop the hadoop distribution name...

Posted by holdenk <gi...@git.apache.org>.

Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/17885
  
    If there are no other comments I'm going to merge this tomorrow.



[GitHub] spark pull request #17885: [SPARK-20627][PYSPARK] Drop the hadoop distributi...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17885#discussion_r115137816
  
    --- Diff: dev/create-release/release-build.sh ---
    @@ -163,9 +163,9 @@ if [[ "$1" == "package" ]]; then
         export ZINC_PORT=$ZINC_PORT
         echo "Creating distribution: $NAME ($FLAGS)"
     
    -    # Write out the NAME and VERSION to PySpark version info we rewrite the - into a . and SNAPSHOT
    -    # to dev0 to be closer to PEP440. We use the NAME as a "local version".
    -    PYSPARK_VERSION=`echo "$SPARK_VERSION+$NAME" |  sed -r "s/-/./" | sed -r "s/SNAPSHOT/dev0/"`
    +    # Write out the VERSION to PySpark version info we rewrite the - into a . and SNAPSHOT
    +    # to dev0 to be closer to PEP440.
    +    PYSPARK_VERSION=`echo "$SPARK_VERSION" |  sed -r "s/-/./" | sed -r "s/SNAPSHOT/dev0/"`
    --- End diff --
    
    This also affects the `pyspark-*.tgz` artifact name. It seems like this means the same file name will be used for different flavors of the release. If they're identical anyway it's just redundant, but are they? I don't know this part well so might be misunderstanding what this would do.
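The naming concern can be illustrated with a small sketch. The `artifact_name` function and its `pyspark-<version>.tgz` scheme are assumptions made for this example, as are the flavor names; the real release script may build the file name differently.

```shell
# Hypothetical naming scheme, used only to illustrate the concern that
# different build flavors would map to one file name.
artifact_name() {  # $1 = PySpark version string
  echo "pyspark-$1.tgz"
}

# Flavor names are sample values, not taken from the release script.
for name in hadoop2.6 hadoop2.7; do
  echo "with NAME:    $(artifact_name "2.2.0+$name")"
  echo "without NAME: $(artifact_name "2.2.0")"
done
```

Under this assumed scheme, every flavor yields the same name once the `+NAME` suffix is gone, which only matters if more than one flavor packages Python.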



[GitHub] spark pull request #17885: [SPARK-20627][PYSPARK] Drop the hadoop distributi...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17885



[GitHub] spark issue #17885: [SPARK-20627][PYSPARK] Drop the hadoop distribution name...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17885
  
    **[Test build #76535 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76535/testReport)** for PR 17885 at commit [`99414d7`](https://github.com/apache/spark/commit/99414d7ce352d7d4dd32a9ad4eda93c11d360cac).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark issue #17885: [SPARK-20627][PYSPARK] Drop the hadoop distribution name...

Posted by holdenk <gi...@git.apache.org>.

Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/17885
  
    Merged to master, branch-2.2, and branch-2.1.



[GitHub] spark issue #17885: [SPARK-20627][PYSPARK] Drop the hadoop distribution name...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17885
  
    **[Test build #76535 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76535/testReport)** for PR 17885 at commit [`99414d7`](https://github.com/apache/spark/commit/99414d7ce352d7d4dd32a9ad4eda93c11d360cac).



[GitHub] spark issue #17885: [SPARK-20627][PYSPARK] Drop the hadoop distribution name...

Posted by holdenk <gi...@git.apache.org>.

Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/17885
  
    I'll target this for master, branch-2.2, branch-2.1.



[GitHub] spark issue #17885: [SPARK-20627][PYSPARK] Drop the hadoop distribution name...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17885
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76535/



[GitHub] spark issue #17885: [SPARK-20627][PYSPARK] Drop the hadoop distribution name...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17885
  
    Are you referring to https://www.python.org/dev/peps/pep-0440/ ?



[GitHub] spark issue #17885: [SPARK-20627][PYSPARK] Drop the hadoop distribution name...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17885
  
    Merged build finished. Test PASSed.

