Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2020/06/24 08:29:00 UTC

[jira] [Resolved] (SPARK-31341) Spark documentation incorrectly claims 3.8 compatibility

     [ https://issues.apache.org/jira/browse/SPARK-31341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-31341.
----------------------------------
    Resolution: Cannot Reproduce

It's fixed in Spark 3.0.

> Spark documentation incorrectly claims 3.8 compatibility
> --------------------------------------------------------
>
>                 Key: SPARK-31341
>                 URL: https://issues.apache.org/jira/browse/SPARK-31341
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.4.5
>            Reporter: Daniel King
>            Priority: Major
>
> The Spark documentation ([https://spark.apache.org/docs/latest/]) has this text:
> {quote}Spark runs on Java 8, Python 2.7+/3.4+ and R 3.1+. For the Scala API, Spark 2.4.5 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x).
> {quote}
> This suggests that Spark is compatible with Python 3.8, which is not true. For example, in the latest ubuntu:18.04 Docker image:
>
> {code:bash}
> apt-get update
> apt-get install python3.8 python3-pip
> # pip3 targets the default python3 (3.6 on bionic), so also install for 3.8 explicitly:
> pip3 install pyspark
> python3.8 -m pip install pyspark
> python3.8 -c 'import pyspark'
> {code}
> Outputs:
> {code:python}
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/__init__.py", line 51, in <module>
>     from pyspark.context import SparkContext
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/context.py", line 31, in <module>
>     from pyspark import accumulators
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/accumulators.py", line 97, in <module>
>     from pyspark.serializers import read_int, PickleSerializer
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/serializers.py", line 72, in <module>
>     from pyspark import cloudpickle
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/cloudpickle.py", line 145, in <module>
>     _cell_set_template_code = _make_cell_set_template_code()
>   File "/usr/local/lib/python3.8/dist-packages/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
>     return types.CodeType(
> TypeError: an integer is required (got type bytes)
> {code}
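>
> The root cause is the copy of cloudpickle bundled with PySpark 2.4.x: Python 3.8 added a leading posonlyargcount parameter to the types.CodeType constructor, so the 15-argument positional call in pyspark/cloudpickle.py shifts every later argument by one, and co_code (bytes) lands in the slot where flags (an int) is expected. A minimal sketch of that failing pattern (the clone_code helper is invented for illustration; this is not the PySpark source):
> {code:python}
> import types
>
> def clone_code(co):
>     # Pre-3.8 positional signature: CodeType(argcount, kwonlyargcount,
>     # nlocals, stacksize, flags, codestring, constants, names, varnames,
>     # filename, name, firstlineno, lnotab, freevars, cellvars)
>     return types.CodeType(
>         co.co_argcount,
>         co.co_kwonlyargcount,  # on 3.8 this fills the new posonlyargcount slot
>         co.co_nlocals,
>         co.co_stacksize,
>         co.co_flags,
>         co.co_code,            # bytes lands where 3.8 expects flags (an int)
>         co.co_consts,
>         co.co_names,
>         co.co_varnames,
>         co.co_filename,
>         co.co_name,
>         co.co_firstlineno,
>         co.co_lnotab,
>         co.co_freevars,
>         co.co_cellvars,
>     )
>
> # Succeeds on Python 3.7; on 3.8 raises the same
> # "TypeError: an integer is required (got type bytes)" as above.
> clone_code(clone_code.__code__)
> {code}
>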
> I propose the documentation be updated to say "Python 3.4 to 3.7". I also propose that the `setup.py` file for pyspark include:
> {code:python}
>     python_requires=">=3.6,<3.8",
> {code}
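>
> For reference, a minimal sketch of where that constraint would sit in a `setup.py` (the surrounding metadata is abbreviated for illustration; this is not the actual pyspark `setup.py`):
> {code:python}
> from setuptools import setup
>
> setup(
>     name="pyspark",
>     version="2.4.5",
>     # pip records this as Requires-Python metadata and refuses to
>     # install the package under an unsupported interpreter such as 3.8.
>     python_requires=">=3.6,<3.8",
> )
> {code}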



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org