Posted to issues@spark.apache.org by "Davies Liu (JIRA)" <ji...@apache.org> on 2016/04/22 18:20:12 UTC
[jira] [Resolved] (SPARK-13266) Python DataFrameReader converts None to "None" instead of null
[ https://issues.apache.org/jira/browse/SPARK-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Davies Liu resolved SPARK-13266.
--------------------------------
Resolution: Fixed
Fix Version/s: 2.0.0
Issue resolved by pull request 12494
[https://github.com/apache/spark/pull/12494]
> Python DataFrameReader converts None to "None" instead of null
> --------------------------------------------------------------
>
> Key: SPARK-13266
> URL: https://issues.apache.org/jira/browse/SPARK-13266
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 1.6.0
> Environment: Linux standalone but probably applies to all
> Reporter: mathieu longtin
> Labels: easyfix, patch
> Fix For: 2.0.0
>
>
> If you do something like this:
> {code:none}
> tsv_loader = sqlContext.read.format('com.databricks.spark.csv')
> tsv_loader.options(quote=None, escape=None)
> {code}
> The loader receives the string "None" for the _quote_ and _escape_ options. It should receive a _null_ instead.
> An easy fix is to correct the _to_str_ function near the top of *python/pyspark/sql/readwriter.py*. Here's the patch:
> {code:none}
> diff --git a/python/pyspark/sql/readwriter.py b/python/pyspark/sql/readwriter.py
> index a3d7eca..ba18d13 100644
> --- a/python/pyspark/sql/readwriter.py
> +++ b/python/pyspark/sql/readwriter.py
> @@ -33,10 +33,12 @@ __all__ = ["DataFrameReader", "DataFrameWriter"]
>  def to_str(value):
>      """
> -    A wrapper over str(), but convert bool values to lower case string
> +    A wrapper over str(), but convert bool values to lower case string, and keep None
>      """
>      if isinstance(value, bool):
>          return str(value).lower()
> +    elif value is None:
> +        return value
>      else:
>          return str(value)
> {code}
> This has been tested and works great.
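
For readers who want to check the patched behaviour without a Spark installation, the following is a minimal standalone sketch of the patched to_str function from the diff above. The options dict and the converted variable are hypothetical and exist only for illustration; they are not part of readwriter.py, and the sketch does not show how PySpark forwards the values to the JVM.

```python
def to_str(value):
    """
    A wrapper over str(), but convert bool values to lower case string, and keep None
    """
    if isinstance(value, bool):
        return str(value).lower()
    elif value is None:
        # Pass None through unchanged rather than turning it into the string "None".
        return value
    else:
        return str(value)


# Hypothetical option values, mirroring the quote=None, escape=None call in the report.
options = {"quote": None, "escape": None, "header": True, "delimiter": "\t"}

# Convert every option value the way the patched helper would.
converted = {key: to_str(val) for key, val in options.items()}
print(converted)
```

Without the elif branch, str(None) would yield the four-character string "None", which is the behaviour described in the report.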