Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2020/12/10 03:41:00 UTC

[jira] [Updated] (SPARK-33730) Standardize warning types

     [ https://issues.apache.org/jira/browse/SPARK-33730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-33730:
---------------------------------
    Description: 
We should use warnings properly per https://docs.python.org/3/library/warnings.html#warning-categories

In particular, we should use {{FutureWarning}} instead of {{DeprecationWarning}} if we aim to show the warnings by default.
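
For context on the category choice: under Python's default warning filters, a {{DeprecationWarning}} raised from library code is hidden from end users (it is only shown for code triggered directly from {{\_\_main\_\_}}), while a {{FutureWarning}} is always displayed. A minimal sketch of the difference (the function names here are illustrative; the message is the one from pyspark/sql/functions.py):

```python
import warnings

def toDegrees_current():
    # What pyspark/sql/functions.py does today: with the default
    # filters, DeprecationWarning is hidden for warnings raised from
    # library code, so users calling the API typically never see it.
    warnings.warn("Deprecated in 2.1, use degrees instead.",
                  DeprecationWarning, stacklevel=2)

def toDegrees_proposed():
    # The standardization proposed here: FutureWarning is shown by
    # default, so the deprecation actually reaches end users.
    warnings.warn("Deprecated in 2.1, use degrees instead.",
                  FutureWarning, stacklevel=2)
```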

The current warnings are messy, and the choice of warning category is somewhat arbitrary.

Specifically, the following {{warnings.warn}} call sites will have to be fixed:

{code}
pyspark/cloudpickle/cloudpickle.py:    warnings.warn(
pyspark/context.py:                    warnings.warn(
pyspark/context.py:                warnings.warn(
pyspark/ml/classification.py:                warnings.warn("weightCol is ignored, "
pyspark/ml/clustering.py:        warnings.warn("Deprecated in 3.0.0. It will be removed in future versions. Use "
pyspark/mllib/classification.py:        warnings.warn(
pyspark/mllib/feature.py:            warnings.warn("Both withMean and withStd are false. The model does nothing.")
pyspark/mllib/regression.py:        warnings.warn(
pyspark/mllib/regression.py:        warnings.warn(
pyspark/mllib/regression.py:        warnings.warn(
pyspark/rdd.py:        warnings.warn("mapPartitionsWithSplit is deprecated; "
pyspark/rdd.py:        warnings.warn(
pyspark/shell.py:    warnings.warn("Failed to initialize Spark session.")
pyspark/shuffle.py:            warnings.warn("Please install psutil to have better "
pyspark/sql/catalog.py:        warnings.warn(
pyspark/sql/catalog.py:        warnings.warn(
pyspark/sql/column.py:            warnings.warn(
pyspark/sql/column.py:            warnings.warn(
pyspark/sql/context.py:            warnings.warn(
pyspark/sql/context.py:        warnings.warn(
pyspark/sql/context.py:        warnings.warn(
pyspark/sql/context.py:        warnings.warn(
pyspark/sql/context.py:        warnings.warn(
pyspark/sql/dataframe.py:        warnings.warn(
pyspark/sql/dataframe.py:                warnings.warn("to_replace is a dict and value is not None. value will be ignored.")
pyspark/sql/functions.py:    warnings.warn("Deprecated in 2.1, use degrees instead.", DeprecationWarning)
pyspark/sql/functions.py:    warnings.warn("Deprecated in 2.1, use radians instead.", DeprecationWarning)
pyspark/sql/functions.py:    warnings.warn("Deprecated in 2.1, use approx_count_distinct instead.", DeprecationWarning)
pyspark/sql/pandas/conversion.py:                    warnings.warn(msg)
pyspark/sql/pandas/conversion.py:                    warnings.warn(msg)
pyspark/sql/pandas/conversion.py:                    warnings.warn(msg)
pyspark/sql/pandas/conversion.py:                    warnings.warn(msg)
pyspark/sql/pandas/conversion.py:                    warnings.warn(msg)
pyspark/sql/pandas/functions.py:        warnings.warn(
pyspark/sql/pandas/group_ops.py:        warnings.warn(
pyspark/sql/session.py:                warnings.warn("Fall back to non-hive support because failing to access HiveConf, "
{code}

PySpark also emits warnings via {{print}} in some places. We should evaluate whether those should be replaced with {{warnings.warn}} as well.
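
A hypothetical sketch of that second cleanup (the function names are made up; only the pattern matters, using the psutil message from pyspark/shuffle.py):

```python
import warnings

def check_psutil_printing():
    # Today: a plain print() cannot be filtered, silenced, or turned
    # into an error by the user.
    print("WARN: Please install psutil to have better support")

def check_psutil_warning():
    # Proposed: route it through the warnings machinery so users can
    # control it with warnings.filterwarnings() or the -W flag.
    warnings.warn("Please install psutil to have better support",
                  UserWarning, stacklevel=2)
```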

  was:
We should use warnings properly per https://docs.python.org/3/library/warnings.html#warning-categories

In particular, we should use {{FutureWarning}} instead of {{DeprecationWarning}} if we aim to show the warnings by default.

Current warnings are a bit messy and somewhat arbitrary.


> Standardize warning types
> -------------------------
>
>                 Key: SPARK-33730
>                 URL: https://issues.apache.org/jira/browse/SPARK-33730
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 3.1.0
>            Reporter: Hyukjin Kwon
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org