Posted to commits@spark.apache.org by gu...@apache.org on 2017/10/24 03:44:52 UTC

spark git commit: [SPARK-22313][PYTHON] Mark/print deprecation warnings as DeprecationWarning for deprecated APIs

Repository: spark
Updated Branches:
  refs/heads/master 884d4f95f -> d9798c834


[SPARK-22313][PYTHON] Mark/print deprecation warnings as DeprecationWarning for deprecated APIs

## What changes were proposed in this pull request?

This PR proposes to mark the existing deprecation warnings as `DeprecationWarning` and to print out warnings for deprecated functions that previously did not emit any.

This could actually be useful for Spark app developers. I use an (old) version of PyCharm, and this IDE can detect the specific `DeprecationWarning` category in some cases:

**Before**

<img src="https://user-images.githubusercontent.com/6477701/31762664-df68d9f8-b4f6-11e7-8773-f0468f70a2cc.png" height="45" />

**After**

<img src="https://user-images.githubusercontent.com/6477701/31762662-de4d6868-b4f6-11e7-98dc-3c8446a0c28a.png" height="70" />

For console usage, `DeprecationWarning` is ignored by the default warning filters (see https://docs.python.org/2/library/warnings.html#warning-categories and https://docs.python.org/3/library/warnings.html#warning-categories):

```
>>> import warnings
>>> filter(lambda f: f[2] == DeprecationWarning, warnings.filters)
[('ignore', <_sre.SRE_Pattern object at 0x10ba58c00>, <type 'exceptions.DeprecationWarning'>, <_sre.SRE_Pattern object at 0x10bb04138>, 0), ('ignore', None, <type 'exceptions.DeprecationWarning'>, None, 0)]
```

so it won't clutter the terminal unless it is enabled on purpose.

If it is intentionally enabled, it shows up as below:

```
>>> import warnings
>>> warnings.simplefilter('always', DeprecationWarning)
>>>
>>> from pyspark.sql import functions
>>> functions.approxCountDistinct("a")
.../spark/python/pyspark/sql/functions.py:232: DeprecationWarning: Deprecated in 2.1, use approx_count_distinct instead.
  "Deprecated in 2.1, use approx_count_distinct instead.", DeprecationWarning)
...
```
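As a side note, a user who wants to see only PySpark's deprecation warnings, rather than enabling every `DeprecationWarning` globally, can scope the filter by module. A minimal sketch using only the standard `warnings` API (the `module` argument is a regex matched against the name of the module that raises the warning):

```
>>> import warnings
>>> # Show DeprecationWarning only when it is raised from pyspark modules.
>>> warnings.filterwarnings("always", category=DeprecationWarning, module="pyspark")
```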

These instances were found by:

```
cd python/pyspark
grep -r "Deprecated" .
grep -r "deprecated" .
grep -r "deprecate" .
```

## How was this patch tested?

Manually tested.
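For what it's worth, this kind of change can also be checked by capturing the warning and asserting on its category. A minimal sketch (not part of this patch) using only the standard library; it assumes a local SparkSession is available so that `Column` creation works:

```
>>> import warnings
>>> from pyspark.sql import SparkSession, functions
>>> spark = SparkSession.builder.master("local[1]").getOrCreate()
>>> with warnings.catch_warnings(record=True) as caught:
...     warnings.simplefilter("always")
...     _ = functions.approxCountDistinct("a")
...     assert issubclass(caught[-1].category, DeprecationWarning)
```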

Author: hyukjinkwon <gu...@gmail.com>

Closes #19535 from HyukjinKwon/deprecated-warning.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d9798c83
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d9798c83
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d9798c83

Branch: refs/heads/master
Commit: d9798c834f3fed060cfd18a8d38c398cb2efcc82
Parents: 884d4f9
Author: hyukjinkwon <gu...@gmail.com>
Authored: Tue Oct 24 12:44:47 2017 +0900
Committer: hyukjinkwon <gu...@gmail.com>
Committed: Tue Oct 24 12:44:47 2017 +0900

----------------------------------------------------------------------
 python/pyspark/ml/util.py              |  8 +++-
 python/pyspark/mllib/classification.py |  2 +-
 python/pyspark/mllib/evaluation.py     |  6 +--
 python/pyspark/mllib/regression.py     |  8 ++--
 python/pyspark/sql/dataframe.py        |  3 ++
 python/pyspark/sql/functions.py        | 18 ++++++++
 python/pyspark/streaming/flume.py      | 14 +++++-
 python/pyspark/streaming/kafka.py      | 72 +++++++++++++++++++++++++----
 8 files changed, 110 insertions(+), 21 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/d9798c83/python/pyspark/ml/util.py
----------------------------------------------------------------------
diff --git a/python/pyspark/ml/util.py b/python/pyspark/ml/util.py
index 6777291..c3c47bd 100644
--- a/python/pyspark/ml/util.py
+++ b/python/pyspark/ml/util.py
@@ -175,7 +175,9 @@ class JavaMLWriter(MLWriter):
 
         .. note:: Deprecated in 2.1 and will be removed in 3.0, use session instead.
         """
-        warnings.warn("Deprecated in 2.1 and will be removed in 3.0, use session instead.")
+        warnings.warn(
+            "Deprecated in 2.1 and will be removed in 3.0, use session instead.",
+            DeprecationWarning)
         self._jwrite.context(sqlContext._ssql_ctx)
         return self
 
@@ -256,7 +258,9 @@ class JavaMLReader(MLReader):
 
         .. note:: Deprecated in 2.1 and will be removed in 3.0, use session instead.
         """
-        warnings.warn("Deprecated in 2.1 and will be removed in 3.0, use session instead.")
+        warnings.warn(
+            "Deprecated in 2.1 and will be removed in 3.0, use session instead.",
+            DeprecationWarning)
         self._jread.context(sqlContext._ssql_ctx)
         return self
 

http://git-wip-us.apache.org/repos/asf/spark/blob/d9798c83/python/pyspark/mllib/classification.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/classification.py b/python/pyspark/mllib/classification.py
index e04eeb2..cce703d 100644
--- a/python/pyspark/mllib/classification.py
+++ b/python/pyspark/mllib/classification.py
@@ -311,7 +311,7 @@ class LogisticRegressionWithSGD(object):
         """
         warnings.warn(
             "Deprecated in 2.0.0. Use ml.classification.LogisticRegression or "
-            "LogisticRegressionWithLBFGS.")
+            "LogisticRegressionWithLBFGS.", DeprecationWarning)
 
         def train(rdd, i):
             return callMLlibFunc("trainLogisticRegressionModelWithSGD", rdd, int(iterations),

http://git-wip-us.apache.org/repos/asf/spark/blob/d9798c83/python/pyspark/mllib/evaluation.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/evaluation.py b/python/pyspark/mllib/evaluation.py
index fc2a0b3..2cd1da3 100644
--- a/python/pyspark/mllib/evaluation.py
+++ b/python/pyspark/mllib/evaluation.py
@@ -234,7 +234,7 @@ class MulticlassMetrics(JavaModelWrapper):
         """
         if label is None:
             # note:: Deprecated in 2.0.0. Use accuracy.
-            warnings.warn("Deprecated in 2.0.0. Use accuracy.")
+            warnings.warn("Deprecated in 2.0.0. Use accuracy.", DeprecationWarning)
             return self.call("precision")
         else:
             return self.call("precision", float(label))
@@ -246,7 +246,7 @@ class MulticlassMetrics(JavaModelWrapper):
         """
         if label is None:
             # note:: Deprecated in 2.0.0. Use accuracy.
-            warnings.warn("Deprecated in 2.0.0. Use accuracy.")
+            warnings.warn("Deprecated in 2.0.0. Use accuracy.", DeprecationWarning)
             return self.call("recall")
         else:
             return self.call("recall", float(label))
@@ -259,7 +259,7 @@ class MulticlassMetrics(JavaModelWrapper):
         if beta is None:
             if label is None:
                 # note:: Deprecated in 2.0.0. Use accuracy.
-                warnings.warn("Deprecated in 2.0.0. Use accuracy.")
+                warnings.warn("Deprecated in 2.0.0. Use accuracy.", DeprecationWarning)
                 return self.call("fMeasure")
             else:
                 return self.call("fMeasure", label)

http://git-wip-us.apache.org/repos/asf/spark/blob/d9798c83/python/pyspark/mllib/regression.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/regression.py b/python/pyspark/mllib/regression.py
index 1b66f5b..ea107d4 100644
--- a/python/pyspark/mllib/regression.py
+++ b/python/pyspark/mllib/regression.py
@@ -278,7 +278,8 @@ class LinearRegressionWithSGD(object):
           A condition which decides iteration termination.
           (default: 0.001)
         """
-        warnings.warn("Deprecated in 2.0.0. Use ml.regression.LinearRegression.")
+        warnings.warn(
+            "Deprecated in 2.0.0. Use ml.regression.LinearRegression.", DeprecationWarning)
 
         def train(rdd, i):
             return callMLlibFunc("trainLinearRegressionModelWithSGD", rdd, int(iterations),
@@ -421,7 +422,8 @@ class LassoWithSGD(object):
         """
         warnings.warn(
             "Deprecated in 2.0.0. Use ml.regression.LinearRegression with elasticNetParam = 1.0. "
-            "Note the default regParam is 0.01 for LassoWithSGD, but is 0.0 for LinearRegression.")
+            "Note the default regParam is 0.01 for LassoWithSGD, but is 0.0 for LinearRegression.",
+            DeprecationWarning)
 
         def train(rdd, i):
             return callMLlibFunc("trainLassoModelWithSGD", rdd, int(iterations), float(step),
@@ -566,7 +568,7 @@ class RidgeRegressionWithSGD(object):
         warnings.warn(
             "Deprecated in 2.0.0. Use ml.regression.LinearRegression with elasticNetParam = 0.0. "
             "Note the default regParam is 0.01 for RidgeRegressionWithSGD, but is 0.0 for "
-            "LinearRegression.")
+            "LinearRegression.", DeprecationWarning)
 
         def train(rdd, i):
             return callMLlibFunc("trainRidgeModelWithSGD", rdd, int(iterations), float(step),

http://git-wip-us.apache.org/repos/asf/spark/blob/d9798c83/python/pyspark/sql/dataframe.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index 38b01f0..c0b574e 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -130,6 +130,8 @@ class DataFrame(object):
 
         .. note:: Deprecated in 2.0, use createOrReplaceTempView instead.
         """
+        warnings.warn(
+            "Deprecated in 2.0, use createOrReplaceTempView instead.", DeprecationWarning)
         self._jdf.createOrReplaceTempView(name)
 
     @since(2.0)
@@ -1308,6 +1310,7 @@ class DataFrame(object):
 
         .. note:: Deprecated in 2.0, use :func:`union` instead.
         """
+        warnings.warn("Deprecated in 2.0, use union instead.", DeprecationWarning)
         return self.union(other)
 
     @since(2.3)

http://git-wip-us.apache.org/repos/asf/spark/blob/d9798c83/python/pyspark/sql/functions.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index 9bc374b..0d40368 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -21,6 +21,7 @@ A collections of builtin functions
 import math
 import sys
 import functools
+import warnings
 
 if sys.version < "3":
     from itertools import imap as map
@@ -44,6 +45,14 @@ def _create_function(name, doc=""):
     return _
 
 
+def _wrap_deprecated_function(func, message):
+    """ Wrap the deprecated function to print out deprecation warnings"""
+    def _(col):
+        warnings.warn(message, DeprecationWarning)
+        return func(col)
+    return functools.wraps(func)(_)
+
+
 def _create_binary_mathfunction(name, doc=""):
     """ Create a binary mathfunction by name"""
     def _(col1, col2):
@@ -207,6 +216,12 @@ _window_functions = {
         """returns the relative rank (i.e. percentile) of rows within a window partition.""",
 }
 
+# Wraps deprecated functions (keys) with the messages (values).
+_functions_deprecated = {
+    'toDegrees': 'Deprecated in 2.1, use degrees instead.',
+    'toRadians': 'Deprecated in 2.1, use radians instead.',
+}
+
 for _name, _doc in _functions.items():
     globals()[_name] = since(1.3)(_create_function(_name, _doc))
 for _name, _doc in _functions_1_4.items():
@@ -219,6 +234,8 @@ for _name, _doc in _functions_1_6.items():
     globals()[_name] = since(1.6)(_create_function(_name, _doc))
 for _name, _doc in _functions_2_1.items():
     globals()[_name] = since(2.1)(_create_function(_name, _doc))
+for _name, _message in _functions_deprecated.items():
+    globals()[_name] = _wrap_deprecated_function(globals()[_name], _message)
 del _name, _doc
 
 
@@ -227,6 +244,7 @@ def approxCountDistinct(col, rsd=None):
     """
     .. note:: Deprecated in 2.1, use :func:`approx_count_distinct` instead.
     """
+    warnings.warn("Deprecated in 2.1, use approx_count_distinct instead.", DeprecationWarning)
     return approx_count_distinct(col, rsd)
 
 

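The `_wrap_deprecated_function` helper above follows the usual `functools.wraps` pattern: the wrapper emits the warning on each call and then delegates, while keeping the wrapped function's name and docstring (so `help()` and Sphinx output stay intact). A standalone sketch of the same pattern, with toy names rather than the patch's actual code:

```
import functools
import warnings


def _wrap_deprecated(func, message):
    """Wrap func so that each call emits a DeprecationWarning first."""
    def _(col):
        warnings.warn(message, DeprecationWarning)
        return func(col)
    return functools.wraps(func)(_)


def degrees(col):
    """Converts an angle measured in radians to degrees."""
    return col  # stand-in body for illustration


toDegrees = _wrap_deprecated(degrees, "Deprecated in 2.1, use degrees instead.")
assert toDegrees.__doc__ == degrees.__doc__  # metadata preserved by functools.wraps
```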
http://git-wip-us.apache.org/repos/asf/spark/blob/d9798c83/python/pyspark/streaming/flume.py
----------------------------------------------------------------------
diff --git a/python/pyspark/streaming/flume.py b/python/pyspark/streaming/flume.py
index 2fed594..5a975d0 100644
--- a/python/pyspark/streaming/flume.py
+++ b/python/pyspark/streaming/flume.py
@@ -54,8 +54,13 @@ class FlumeUtils(object):
         :param bodyDecoder:  A function used to decode body (default is utf8_decoder)
         :return: A DStream object
 
-        .. note:: Deprecated in 2.3.0
+        .. note:: Deprecated in 2.3.0. Flume support is deprecated as of Spark 2.3.0.
+            See SPARK-22142.
         """
+        warnings.warn(
+            "Deprecated in 2.3.0. Flume support is deprecated as of Spark 2.3.0. "
+            "See SPARK-22142.",
+            DeprecationWarning)
         jlevel = ssc._sc._getJavaStorageLevel(storageLevel)
         helper = FlumeUtils._get_helper(ssc._sc)
         jstream = helper.createStream(ssc._jssc, hostname, port, jlevel, enableDecompression)
@@ -82,8 +87,13 @@ class FlumeUtils(object):
         :param bodyDecoder:  A function used to decode body (default is utf8_decoder)
         :return: A DStream object
 
-        .. note:: Deprecated in 2.3.0
+        .. note:: Deprecated in 2.3.0. Flume support is deprecated as of Spark 2.3.0.
+            See SPARK-22142.
         """
+        warnings.warn(
+            "Deprecated in 2.3.0. Flume support is deprecated as of Spark 2.3.0. "
+            "See SPARK-22142.",
+            DeprecationWarning)
         jlevel = ssc._sc._getJavaStorageLevel(storageLevel)
         hosts = []
         ports = []

http://git-wip-us.apache.org/repos/asf/spark/blob/d9798c83/python/pyspark/streaming/kafka.py
----------------------------------------------------------------------
diff --git a/python/pyspark/streaming/kafka.py b/python/pyspark/streaming/kafka.py
index 4af4135..fdb9308 100644
--- a/python/pyspark/streaming/kafka.py
+++ b/python/pyspark/streaming/kafka.py
@@ -15,6 +15,8 @@
 # limitations under the License.
 #
 
+import warnings
+
 from py4j.protocol import Py4JJavaError
 
 from pyspark.rdd import RDD
@@ -56,8 +58,13 @@ class KafkaUtils(object):
         :param valueDecoder:  A function used to decode value (default is utf8_decoder)
         :return: A DStream object
 
-        .. note:: Deprecated in 2.3.0
+        .. note:: Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0.
+            See SPARK-21893.
         """
+        warnings.warn(
+            "Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0. "
+            "See SPARK-21893.",
+            DeprecationWarning)
         if kafkaParams is None:
             kafkaParams = dict()
         kafkaParams.update({
@@ -105,8 +112,13 @@ class KafkaUtils(object):
         :return: A DStream object
 
         .. note:: Experimental
-        .. note:: Deprecated in 2.3.0
+        .. note:: Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0.
+            See SPARK-21893.
         """
+        warnings.warn(
+            "Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0. "
+            "See SPARK-21893.",
+            DeprecationWarning)
         if fromOffsets is None:
             fromOffsets = dict()
         if not isinstance(topics, list):
@@ -159,8 +171,13 @@ class KafkaUtils(object):
         :return: An RDD object
 
         .. note:: Experimental
-        .. note:: Deprecated in 2.3.0
+        .. note:: Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0.
+            See SPARK-21893.
         """
+        warnings.warn(
+            "Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0. "
+            "See SPARK-21893.",
+            DeprecationWarning)
         if leaders is None:
             leaders = dict()
         if not isinstance(kafkaParams, dict):
@@ -229,7 +246,8 @@ class OffsetRange(object):
     """
     Represents a range of offsets from a single Kafka TopicAndPartition.
 
-    .. note:: Deprecated in 2.3.0
+    .. note:: Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0.
+        See SPARK-21893.
     """
 
     def __init__(self, topic, partition, fromOffset, untilOffset):
@@ -240,6 +258,10 @@ class OffsetRange(object):
         :param fromOffset: Inclusive starting offset.
         :param untilOffset: Exclusive ending offset.
         """
+        warnings.warn(
+            "Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0. "
+            "See SPARK-21893.",
+            DeprecationWarning)
         self.topic = topic
         self.partition = partition
         self.fromOffset = fromOffset
@@ -270,7 +292,8 @@ class TopicAndPartition(object):
     """
     Represents a specific topic and partition for Kafka.
 
-    .. note:: Deprecated in 2.3.0
+    .. note:: Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0.
+        See SPARK-21893.
     """
 
     def __init__(self, topic, partition):
@@ -279,6 +302,10 @@ class TopicAndPartition(object):
         :param topic: Kafka topic name.
         :param partition: Kafka partition id.
         """
+        warnings.warn(
+            "Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0. "
+            "See SPARK-21893.",
+            DeprecationWarning)
         self._topic = topic
         self._partition = partition
 
@@ -303,7 +330,8 @@ class Broker(object):
     """
     Represent the host and port info for a Kafka broker.
 
-    .. note:: Deprecated in 2.3.0
+    .. note:: Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0.
+        See SPARK-21893.
     """
 
     def __init__(self, host, port):
@@ -312,6 +340,10 @@ class Broker(object):
         :param host: Broker's hostname.
         :param port: Broker's port.
         """
+        warnings.warn(
+            "Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0. "
+            "See SPARK-21893.",
+            DeprecationWarning)
         self._host = host
         self._port = port
 
@@ -323,10 +355,15 @@ class KafkaRDD(RDD):
     """
     A Python wrapper of KafkaRDD, to provide additional information on normal RDD.
 
-    .. note:: Deprecated in 2.3.0
+    .. note:: Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0.
+        See SPARK-21893.
     """
 
     def __init__(self, jrdd, ctx, jrdd_deserializer):
+        warnings.warn(
+            "Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0. "
+            "See SPARK-21893.",
+            DeprecationWarning)
         RDD.__init__(self, jrdd, ctx, jrdd_deserializer)
 
     def offsetRanges(self):
@@ -345,10 +382,15 @@ class KafkaDStream(DStream):
     """
     A Python wrapper of KafkaDStream
 
-    .. note:: Deprecated in 2.3.0
+    .. note:: Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0.
+        See SPARK-21893.
     """
 
     def __init__(self, jdstream, ssc, jrdd_deserializer):
+        warnings.warn(
+            "Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0. "
+            "See SPARK-21893.",
+            DeprecationWarning)
         DStream.__init__(self, jdstream, ssc, jrdd_deserializer)
 
     def foreachRDD(self, func):
@@ -383,10 +425,15 @@ class KafkaTransformedDStream(TransformedDStream):
     """
     Kafka specific wrapper of TransformedDStream to transform on Kafka RDD.
 
-    .. note:: Deprecated in 2.3.0
+    .. note:: Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0.
+        See SPARK-21893.
     """
 
     def __init__(self, prev, func):
+        warnings.warn(
+            "Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0. "
+            "See SPARK-21893.",
+            DeprecationWarning)
         TransformedDStream.__init__(self, prev, func)
 
     @property
@@ -405,7 +452,8 @@ class KafkaMessageAndMetadata(object):
     """
     Kafka message and metadata information. Including topic, partition, offset and message
 
-    .. note:: Deprecated in 2.3.0
+    .. note:: Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0.
+        See SPARK-21893.
     """
 
     def __init__(self, topic, partition, offset, key, message):
@@ -419,6 +467,10 @@ class KafkaMessageAndMetadata(object):
         :param message: actual message payload of this Kafka message, the return data is
                         undecoded bytearray.
         """
+        warnings.warn(
+            "Deprecated in 2.3.0. Kafka 0.8 support is deprecated as of Spark 2.3.0. "
+            "See SPARK-21893.",
+            DeprecationWarning)
         self.topic = topic
         self.partition = partition
         self.offset = offset

