Posted to commits@spark.apache.org by gu...@apache.org on 2023/06/02 10:38:29 UTC
[spark] branch master updated: [SPARK-43949][PYTHON] Upgrade cloudpickle to 2.2.1
This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 085dfeb2bed [SPARK-43949][PYTHON] Upgrade cloudpickle to 2.2.1
085dfeb2bed is described below
commit 085dfeb2bed61f6d43d9b99b299373e797ac8f17
Author: Hyukjin Kwon <gu...@apache.org>
AuthorDate: Fri Jun 2 19:38:13 2023 +0900
[SPARK-43949][PYTHON] Upgrade cloudpickle to 2.2.1
### What changes were proposed in this pull request?
This PR proposes to upgrade Cloudpickle from 2.2.0 to 2.2.1.
### Why are the changes needed?
Cloudpickle 2.2.1 includes a fix (https://github.com/cloudpipe/cloudpickle/pull/495) for a namedtuple issue (https://github.com/cloudpipe/cloudpickle/issues/460). PySpark relies heavily on namedtuple, especially for RDDs, so we should upgrade to pick up the fix.
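For illustration only, here is a minimal sketch of the idea behind the `_class_getnewargs` change in this commit: when a class defines `__module__` in its own `__dict__`, that value is carried in the keyword arguments used to rebuild the class. The helper name `class_getnewargs_sketch` and the module name `my_dynamic_module` are hypothetical, and this is a simplified stand-in, not cloudpickle's actual reconstruction code.

```python
from collections import namedtuple


def class_getnewargs_sketch(cls):
    """Hypothetical, simplified stand-in for the patched helper."""
    type_kwargs = {}
    # Mirror the patched check: carry __module__ only when the class
    # defines it in its own __dict__.
    if "__module__" in cls.__dict__:
        type_kwargs["__module__"] = cls.__module__
    return type(cls), cls.__name__, cls.__bases__, type_kwargs


# A namedtuple class whose module name we set explicitly (assumed name).
Point = namedtuple("Point", ["x", "y"])
Point.__module__ = "my_dynamic_module"

metaclass, name, bases, type_kwargs = class_getnewargs_sketch(Point)

# Rebuild an empty "skeleton" class from the captured arguments, roughly
# as an unpickler might before restoring the remaining class state.
skeleton = metaclass(name, bases, type_kwargs)
print(skeleton.__module__)
```

Under these assumptions, the rebuilt skeleton class reports the original module name rather than the name of the module that happened to call `type(...)`.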
### Does this PR introduce _any_ user-facing change?
Yes, see https://github.com/cloudpipe/cloudpickle/issues/460.
### How was this patch tested?
Relies on cloudpickle's unit tests. Existing test cases should pass too.
Closes #41433 from HyukjinKwon/cloudpickle-upgrade.
Authored-by: Hyukjin Kwon <gu...@apache.org>
Signed-off-by: Hyukjin Kwon <gu...@apache.org>
---
python/pyspark/cloudpickle/__init__.py | 2 +-
python/pyspark/cloudpickle/cloudpickle_fast.py | 4 ++--
python/pyspark/cloudpickle/compat.py | 17 +++++++++++++++--
3 files changed, 18 insertions(+), 5 deletions(-)
diff --git a/python/pyspark/cloudpickle/__init__.py b/python/pyspark/cloudpickle/__init__.py
index efbf1178d43..af35a0a194b 100644
--- a/python/pyspark/cloudpickle/__init__.py
+++ b/python/pyspark/cloudpickle/__init__.py
@@ -5,4 +5,4 @@ from pyspark.cloudpickle.cloudpickle_fast import CloudPickler, dumps, dump # no
# expose their Pickler subclass at top-level under the "Pickler" name.
Pickler = CloudPickler
-__version__ = '2.2.0'
+__version__ = '2.2.1'
diff --git a/python/pyspark/cloudpickle/cloudpickle_fast.py b/python/pyspark/cloudpickle/cloudpickle_fast.py
index 8741dcbdaaa..63aaffa096b 100644
--- a/python/pyspark/cloudpickle/cloudpickle_fast.py
+++ b/python/pyspark/cloudpickle/cloudpickle_fast.py
@@ -111,8 +111,8 @@ load, loads = pickle.load, pickle.loads
def _class_getnewargs(obj):
type_kwargs = {}
- if "__slots__" in obj.__dict__:
- type_kwargs["__slots__"] = obj.__slots__
+ if "__module__" in obj.__dict__:
+ type_kwargs["__module__"] = obj.__module__
__dict__ = obj.__dict__.get('__dict__', None)
if isinstance(__dict__, property):
diff --git a/python/pyspark/cloudpickle/compat.py b/python/pyspark/cloudpickle/compat.py
index 837d0f279ab..5e9b52773d2 100644
--- a/python/pyspark/cloudpickle/compat.py
+++ b/python/pyspark/cloudpickle/compat.py
@@ -1,5 +1,18 @@
import sys
-import pickle # noqa: F401
-from pickle import Pickler # noqa: F401
+if sys.version_info < (3, 8):
+ try:
+ import pickle5 as pickle # noqa: F401
+ from pickle5 import Pickler # noqa: F401
+ except ImportError:
+ import pickle # noqa: F401
+
+ # Use the Python pickler for old CPython versions
+ from pickle import _Pickler as Pickler # noqa: F401
+else:
+ import pickle # noqa: F401
+
+ # Pickler will be the C implementation in CPython and the Python
+ # implementation in PyPy
+ from pickle import Pickler # noqa: F401