Posted to reviews@spark.apache.org by xuanyuanking <gi...@git.apache.org> on 2018/09/09 04:29:06 UTC

[GitHub] spark pull request #22369: [SPARK-25072][DOC] Update migration guide for beh...

GitHub user xuanyuanking opened a pull request:

    https://github.com/apache/spark/pull/22369

    [SPARK-25072][DOC] Update migration guide for behavior change

    ## What changes were proposed in this pull request?
    
    Update the document for the behavior change in PySpark Row creation.
    
    ## How was this patch tested?
    
    Existing UT.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuanyuanking/spark SPARK-25072-DOC

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22369.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22369
    
----
commit d257a38c647b45a9e83a2bdbbd2814f1b3fc5d56
Author: Yuanjian Li <xy...@...>
Date:   2018-09-09T04:26:23Z

    Update doc for SPARK-25072

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22369: [SPARK-25072][DOC] Update migration guide for beh...

Posted by BryanCutler <gi...@git.apache.org>.
Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22369#discussion_r216147674
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1901,6 +1901,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
     ## Upgrading From Spark SQL 2.3.0 to 2.3.1 and above
     
       - As of version 2.3.1 Arrow functionality, including `pandas_udf` and `toPandas()`/`createDataFrame()` with `spark.sql.execution.arrow.enabled` set to `True`, has been marked as experimental. These are still evolving and not currently recommended for use in production.
    +  - In version 2.3.1 and earlier, it is possible for PySpark to create a Row object by providing more value than column number through the customized Row class. Since Spark 2.3.3, Spark will confirm value length is less or equal than column length in PySpark. See [SPARK-25072](https://issues.apache.org/jira/browse/SPARK-25072) for details.
    --- End diff --
    
    Maybe say `..by providing more values than the number of fields through a customized Row class. As of Spark 2.3.3, PySpark will raise a ValueError if the number of values is greater than the number of fields. See...`
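
The check described in this diff can be illustrated with a minimal pure-Python sketch. This is not PySpark's implementation: `FieldTemplate` and `Person` are hypothetical names used only to mimic a customized Row class, and the error message wording is an assumption. It only shows the proposed rule that supplying more values than fields raises a `ValueError`, while supplying the same number or fewer is accepted.

```python
class FieldTemplate(tuple):
    """Hypothetical stand-in for a customized Row class: a tuple of field names."""

    def __new__(cls, *fields):
        return super().__new__(cls, fields)

    def __call__(self, *values):
        # The rule under discussion: reject more values than there are fields.
        if len(values) > len(self):
            raise ValueError(
                "expected at most %d values for fields %s but got %d"
                % (len(self), tuple(self), len(values))
            )
        # Pair each supplied value with its field name.
        return dict(zip(self, values))


Person = FieldTemplate("name", "age")

# Two values for two fields: accepted.
ok = Person("Alice", 11)

# Three values for two fields: rejected with ValueError.
try:
    Person("Alice", 11, "extra")
    error = None
except ValueError as exc:
    error = str(exc)
```

In PySpark itself, the analogous call would presumably look like `Row("name", "age")("Alice", 11, "extra")`; whether that raises depends on the Spark version, as discussed in SPARK-25072.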


---



[GitHub] spark pull request #22369: [SPARK-25072][DOC] Update migration guide for beh...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking closed the pull request at:

    https://github.com/apache/spark/pull/22369


---



[GitHub] spark issue #22369: [SPARK-25072][DOC] Update migration guide for behavior c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22369
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2951/
    Test PASSed.


---



[GitHub] spark issue #22369: [SPARK-25072][DOC] Update migration guide for behavior c...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22369
  
    @xuanyuanking, no need to rush. Let's wait and discuss a bit more before proposing a change.


---



[GitHub] spark issue #22369: [SPARK-25072][DOC] Update migration guide for behavior c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22369
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22369: [SPARK-25072][DOC] Update migration guide for behavior c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22369
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22369: [SPARK-25072][DOC] Update migration guide for behavior c...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/22369
  
    Per the comment in https://github.com/apache/spark/pull/22140#issuecomment-419997180, I think this doc change is no longer needed, so I'm closing this. Thanks @BryanCutler and @HyukjinKwon!


---



[GitHub] spark pull request #22369: [SPARK-25072][DOC] Update migration guide for beh...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22369#discussion_r216189359
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1901,6 +1901,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
     ## Upgrading From Spark SQL 2.3.0 to 2.3.1 and above
     
       - As of version 2.3.1 Arrow functionality, including `pandas_udf` and `toPandas()`/`createDataFrame()` with `spark.sql.execution.arrow.enabled` set to `True`, has been marked as experimental. These are still evolving and not currently recommended for use in production.
    +  - In version 2.3.1 and earlier, it is possible for PySpark to create a Row object by providing more value than column number through the customized Row class. Since Spark 2.3.3, Spark will confirm value length is less or equal than column length in PySpark. See [SPARK-25072](https://issues.apache.org/jira/browse/SPARK-25072) for details.
    --- End diff --
    
    Thanks Bryan, I'll address this after discussion.


---



[GitHub] spark issue #22369: [SPARK-25072][DOC] Update migration guide for behavior c...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22369
  
    **[Test build #95842 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95842/testReport)** for PR 22369 at commit [`d257a38`](https://github.com/apache/spark/commit/d257a38c647b45a9e83a2bdbbd2814f1b3fc5d56).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22369: [SPARK-25072][DOC] Update migration guide for behavior c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22369
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95842/
    Test PASSed.


---



[GitHub] spark issue #22369: [SPARK-25072][DOC] Update migration guide for behavior c...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22369
  
    **[Test build #95842 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95842/testReport)** for PR 22369 at commit [`d257a38`](https://github.com/apache/spark/commit/d257a38c647b45a9e83a2bdbbd2814f1b3fc5d56).


---



[GitHub] spark issue #22369: [SPARK-25072][DOC] Update migration guide for behavior c...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/22369
  
    Got it, thanks @HyukjinKwon.


---
