Posted to issues@spark.apache.org by "Sathis Kumar (JIRA)" <ji...@apache.org> on 2016/03/14 14:17:33 UTC
[jira] [Commented] (SPARK-13730) Nulls in dataframes getting converted to 0 with spark 2.0 SNAPSHOT
[ https://issues.apache.org/jira/browse/SPARK-13730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193266#comment-15193266 ]
Sathis Kumar commented on SPARK-13730:
--------------------------------------
This issue is a duplicate of SPARK-12323.
> Nulls in dataframes getting converted to 0 with spark 2.0 SNAPSHOT
> ------------------------------------------------------------------
>
> Key: SPARK-13730
> URL: https://issues.apache.org/jira/browse/SPARK-13730
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 2.0.0
> Reporter: Franklyn Dsouza
> Priority: Critical
>
> Basically I'm putting nulls into a non-nullable LongType column and applying a transformation to that column; the result is a column with the nulls converted to 0. I have not tested this in Scala.
> Here's an example:
> {code}
> from pyspark.sql import types, functions as F
>
> sql_schema = types.StructType([
>     types.StructField("a", types.LongType(), True),
>     types.StructField("b", types.StringType(), True),
> ])
>
> # sqlCtx is the SQLContext available in the PySpark shell
> df = sqlCtx.createDataFrame([
>     (1, "one"),
>     (None, "two"),
> ], sql_schema)
>
> # Everything is fine here
> df.collect()  # [Row(a=1, b=u'one'), Row(a=None, b=u'two')]
>
> # Identity UDF declared with a LongType return type
> def assert_not_null(val):
>     return val
>
> udf = F.udf(assert_not_null, types.LongType())
>
> df = df.withColumnRenamed('a', "_tmp_col")
> df = df.withColumn('a', udf(df._tmp_col))
> df = df.drop("_tmp_col")
>
> # None gets converted to 0
> df.collect()  # [Row(b=u'one', a=1), Row(b=u'two', a=0)]
> {code}
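A possible workaround, not part of the original report and only a sketch: guard the UDF call with F.when/otherwise so that null inputs never go through the LongType UDF, and keep an explicit long NULL instead.

{code}
from pyspark.sql import functions as F

# Workaround sketch (assumes the df, udf and _tmp_col column from the example above):
# apply the UDF only to non-null values; null inputs keep an explicit long NULL.
df = df.withColumn(
    'a',
    F.when(df._tmp_col.isNull(), F.lit(None).cast('long'))
     .otherwise(udf(df._tmp_col))
)
{code}

This only sidesteps the symptom for this particular example; the None-to-0 conversion in the Python UDF path itself is what SPARK-12323 tracks.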
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org