Posted to commits@spark.apache.org by gu...@apache.org on 2023/06/02 10:39:46 UTC

[spark] branch branch-3.4 updated: [SPARK-43949][PYTHON] Upgrade cloudpickle to 2.2.1

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
     new 746c906cce7 [SPARK-43949][PYTHON] Upgrade cloudpickle to 2.2.1
746c906cce7 is described below

commit 746c906cce76992ad531a1b6b57e05470085c917
Author: Hyukjin Kwon <gu...@apache.org>
AuthorDate: Fri Jun 2 19:38:13 2023 +0900

    [SPARK-43949][PYTHON] Upgrade cloudpickle to 2.2.1
    
    This PR proposes to upgrade Cloudpickle from 2.2.0 to 2.2.1.
    
    Cloudpickle 2.2.1 includes a fix (https://github.com/cloudpipe/cloudpickle/pull/495) for a namedtuple issue (https://github.com/cloudpipe/cloudpickle/issues/460). PySpark relies heavily on namedtuple, especially for RDDs, so we should upgrade to pick up the fix.
    
    Yes, this is a user-facing change; see https://github.com/cloudpipe/cloudpickle/issues/460.
    
    Testing relies on cloudpickle's unit tests. Existing test cases should pass too.
    
    Closes #41433 from HyukjinKwon/cloudpickle-upgrade.
    
    Authored-by: Hyukjin Kwon <gu...@apache.org>
    Signed-off-by: Hyukjin Kwon <gu...@apache.org>
    (cherry picked from commit 085dfeb2bed61f6d43d9b99b299373e797ac8f17)
    Signed-off-by: Hyukjin Kwon <gu...@apache.org>
---
 python/pyspark/cloudpickle/__init__.py         | 2 +-
 python/pyspark/cloudpickle/cloudpickle_fast.py | 4 ++--
 python/pyspark/cloudpickle/compat.py           | 7 ++++++-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/python/pyspark/cloudpickle/__init__.py b/python/pyspark/cloudpickle/__init__.py
index efbf1178d43..af35a0a194b 100644
--- a/python/pyspark/cloudpickle/__init__.py
+++ b/python/pyspark/cloudpickle/__init__.py
@@ -5,4 +5,4 @@ from pyspark.cloudpickle.cloudpickle_fast import CloudPickler, dumps, dump  # no
 # expose their Pickler subclass at top-level under the  "Pickler" name.
 Pickler = CloudPickler
 
-__version__ = '2.2.0'
+__version__ = '2.2.1'
diff --git a/python/pyspark/cloudpickle/cloudpickle_fast.py b/python/pyspark/cloudpickle/cloudpickle_fast.py
index 8741dcbdaaa..63aaffa096b 100644
--- a/python/pyspark/cloudpickle/cloudpickle_fast.py
+++ b/python/pyspark/cloudpickle/cloudpickle_fast.py
@@ -111,8 +111,8 @@ load, loads = pickle.load, pickle.loads
 
 def _class_getnewargs(obj):
     type_kwargs = {}
-    if "__slots__" in obj.__dict__:
-        type_kwargs["__slots__"] = obj.__slots__
+    if "__module__" in obj.__dict__:
+        type_kwargs["__module__"] = obj.__module__
 
     __dict__ = obj.__dict__.get('__dict__', None)
     if isinstance(__dict__, property):
diff --git a/python/pyspark/cloudpickle/compat.py b/python/pyspark/cloudpickle/compat.py
index afa285f6290..5e9b52773d2 100644
--- a/python/pyspark/cloudpickle/compat.py
+++ b/python/pyspark/cloudpickle/compat.py
@@ -7,7 +7,12 @@ if sys.version_info < (3, 8):
         from pickle5 import Pickler  # noqa: F401
     except ImportError:
         import pickle  # noqa: F401
+
+        # Use the Python pickler for old CPython versions
         from pickle import _Pickler as Pickler  # noqa: F401
 else:
     import pickle  # noqa: F401
-    from _pickle import Pickler  # noqa: F401
+
+    # Pickler will be the C implementation in CPython and the Python
+    # implementation in PyPy
+    from pickle import Pickler  # noqa: F401
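
The compat.py hunk above switches the import from `_pickle` to `pickle`, so the interpreter decides which Pickler implementation is used. The sketch below is a minimal illustration of that behavior using only the standard library. It is not part of the patch, and the helper names `uses_c_pickler` and `roundtrip` are hypothetical.

```python
import io
import pickle

try:
    import _pickle  # C accelerator module; may be absent on some interpreters
except ImportError:
    _pickle = None


def uses_c_pickler():
    """Report whether pickle.Pickler resolves to the C accelerator class."""
    return _pickle is not None and pickle.Pickler is _pickle.Pickler


def roundtrip(obj):
    """Serialize obj with pickle.Pickler and load it back."""
    buffer = io.BytesIO()
    pickle.Pickler(buffer, protocol=pickle.HIGHEST_PROTOCOL).dump(obj)
    return pickle.loads(buffer.getvalue())


# Works the same whichever implementation pickle.Pickler resolves to.
restored = roundtrip({"x": 1, "y": 2})
print(uses_c_pickler(), restored)
```

Importing `Pickler` from `pickle` rather than `_pickle` keeps the import valid on interpreters that do not ship the C accelerator, which matches the comment added in the hunk.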

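The cloudpickle_fast.py hunk above changes `_class_getnewargs` to carry `__module__` in `type_kwargs` instead of `__slots__`. The following is a rough sketch of why passing `__module__` along matters when a class is rebuilt dynamically. It mirrors only the shape of the patched check and is not cloudpickle's actual reconstruction path. The names `class_type_kwargs`, `Record`, and `my_app.models` are hypothetical.

```python
def class_type_kwargs(cls):
    """Collect the namespace entries used to rebuild cls with type()."""
    type_kwargs = {}
    # Same check as the patched hunk: copy __module__ only when the class
    # defines it in its own namespace.
    if "__module__" in cls.__dict__:
        type_kwargs["__module__"] = cls.__module__
    return type_kwargs


class Record(tuple):  # hypothetical stand-in for a dynamically defined class
    pass


# Hypothetical module name, chosen so the effect is easy to see.
Record.__module__ = "my_app.models"

# Rebuild the class from its name, bases, and the collected kwargs.
rebuilt = type(Record.__name__, Record.__bases__, class_type_kwargs(Record))
print(rebuilt.__module__)  # prints "my_app.models"
```

Without the `__module__` entry, `type()` would assign a module name based on where the call happens, so the rebuilt class could report a different module than the original.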
