Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/03/28 01:52:24 UTC

[GitHub] [spark] Yikun commented on a change in pull request #35977: [SPARK-38660][PYTHON] PySpark DeprecationWarning: distutils Version classes are deprecated

Yikun commented on a change in pull request #35977:
URL: https://github.com/apache/spark/pull/35977#discussion_r836004084



##########
File path: python/pyspark/ml/image.py
##########
@@ -28,7 +28,7 @@
 from typing import Any, Dict, List, NoReturn, Optional, cast
 
 import numpy as np
-from distutils.version import LooseVersion
+from packaging.version import Version

Review comment:
       > Is this 3rd party library?
   
    https://pypi.org/project/packaging/ , yes, it is a third-party library. And the standard library does not seem to offer a way to handle this after Python 3.12, when `distutils` is removed.
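A minimal sketch of the migration this diff proposes: replacing the deprecated `distutils.version.LooseVersion` with `packaging.version.Version` (where `packaging` is the third-party dependency under discussion):

```python
# Sketch of the replacement discussed above, assuming the third-party
# `packaging` library is installed (pip install packaging).
from packaging.version import Version

# Comparisons work the same way LooseVersion was typically used in PySpark:
print(Version("3.3.0") >= Version("3.0.0"))  # True
print(Version("2.4.8") < Version("3.0.0"))   # True

# Unlike LooseVersion's loose string-based ordering, Version parses
# PEP 440 versions strictly and orders pre-releases correctly:
print(Version("3.0.0rc1") < Version("3.0.0"))  # True
```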
   
   > Adding a new dep is problematic
   
    Looks like we have the following ways to solve this:
    - Maintain a vendored copy of the `distutils`/`packaging` version code in PySpark, just like [cloudpickle](https://github.com/apache/spark/tree/master/python/pyspark/cloudpickle), or at least a simple version implementation.
    - Introduce `packaging` as a third-party dependency, adding it to setup and the [docs](https://spark.apache.org/docs/latest/api/python/getting_started/install.html#dependencies).
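For the first option, a "simple version implementation" could be as small as a tuple-based comparison helper with no external dependencies. This is only an illustration; `version_tuple` is a hypothetical name, not an existing PySpark API, and it handles plain dotted numeric versions only (no pre-releases or local version segments):

```python
# Hypothetical minimal version helper, sketching the "simple version
# implementation" option mentioned above. Numeric components only.
def version_tuple(v: str) -> tuple:
    """Parse a dotted numeric version string into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))

# Tuples compare element-wise, so ordering is numeric, not lexicographic:
print(version_tuple("3.3.0") > version_tuple("3.2.1"))  # True
print(version_tuple("10.0") > version_tuple("9.9"))     # True
```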
   
    BTW, I don't think it is ideal that we always maintain all third-party deps inside PySpark. Have we considered making the extra third-party installations a required/optional step when installing PySpark (especially via a downloaded distribution)? For example, requiring users to install the deps before PySpark startup, or installing them automatically in [`bin/pyspark`](https://github.com/apache/spark/blob/master/bin/pyspark) (this may require additional network access).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


