Posted to commits@spark.apache.org by do...@apache.org on 2021/01/20 03:09:45 UTC

[spark] branch master updated: [SPARK-34162][DOCS][PYSPARK] Add PyArrow compatibility note for Python 3.9

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 7e1651e  [SPARK-34162][DOCS][PYSPARK] Add PyArrow compatibility note for Python 3.9
7e1651e is described below

commit 7e1651e3152c1af59393cb1ca16701fc2500e181
Author: Dongjoon Hyun <dh...@apple.com>
AuthorDate: Tue Jan 19 19:09:14 2021 -0800

    [SPARK-34162][DOCS][PYSPARK] Add PyArrow compatibility note for Python 3.9
    
    ### What changes were proposed in this pull request?
    
    This PR aims to add a note for Apache Arrow project's `PyArrow` compatibility for Python 3.9.
    
    ### Why are the changes needed?
    
    Although the Apache Spark documentation claims `Spark runs on Java 8/11, Scala 2.12, Python 3.6+ and R 3.5+.`,
    Apache Arrow's `PyArrow` is not yet compatible with Python 3.9.x. Without the `PyArrow` library installed, the PySpark UTs passed without any problem, so it is enough to add a note about this limitation together with a link to the compatibility page on the Apache Arrow website:
    - https://arrow.apache.org/docs/python/install.html#python-compatibility
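    The dependency described above can be sketched as a small availability check. This is illustrative only and not part of this commit; the function name is hypothetical:

```python
import importlib.util

def pyarrow_available():
    # Arrow optimization and pandas UDFs in PySpark require the pyarrow
    # package; on Python 3.9 it may not be installable yet (see the Apache
    # Arrow Python compatibility page linked above).
    return importlib.util.find_spec("pyarrow") is not None
```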
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    **BEFORE**
    <img width="804" alt="Screen Shot 2021-01-19 at 1 45 07 PM" src="https://user-images.githubusercontent.com/9700541/105096867-8fbdbe00-5a5c-11eb-88f7-8caae2427583.png">
    
    **AFTER**
    <img width="908" alt="Screen Shot 2021-01-19 at 7 06 41 PM" src="https://user-images.githubusercontent.com/9700541/105121661-85fe7f80-5a89-11eb-8af7-1b37e12c55c1.png">
    
    Closes #31251 from dongjoon-hyun/SPARK-34162.
    
    Authored-by: Dongjoon Hyun <dh...@apple.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 docs/index.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/index.md b/docs/index.md
index c4c2d72..84f760f 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -50,6 +50,7 @@ For the Scala API, Spark {{site.SPARK_VERSION}}
 uses Scala {{site.SCALA_BINARY_VERSION}}. You will need to use a compatible Scala version
 ({{site.SCALA_BINARY_VERSION}}.x).
 
+For Python 3.9, Arrow optimization and pandas UDFs might not work due to the supported Python versions in Apache Arrow. Please refer to the latest [Python Compatibility](https://arrow.apache.org/docs/python/install.html#python-compatibility) page.
 For Java 11, `-Dio.netty.tryReflectionSetAccessible=true` is required additionally for Apache Arrow library. This prevents `java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.(long, int) not available` when Apache Arrow uses Netty internally.
 
 # Running the Examples and Shell
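As a hedged illustration of the Java 11 note in the surrounding doc text (not part of this commit; the application file name is a placeholder), the Netty reflection flag can be passed to both driver and executors via standard Spark configuration properties:

```shell
# Illustrative only: enable Netty's reflective access on Java 11 so Apache
# Arrow can use direct buffers. "app.py" is a placeholder application.
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true" \
  --conf "spark.executor.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true" \
  app.py
```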


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org