Posted to issues@spark.apache.org by "Sathis Kumar (JIRA)" <ji...@apache.org> on 2016/03/14 14:17:33 UTC

[jira] [Commented] (SPARK-13730) Nulls in dataframes getting converted to 0 with spark 2.0 SNAPSHOT

    [ https://issues.apache.org/jira/browse/SPARK-13730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193266#comment-15193266 ] 

Sathis Kumar commented on SPARK-13730:
--------------------------------------

This issue is a duplicate of SPARK-12323.

> Nulls in dataframes getting converted to 0 with spark 2.0 SNAPSHOT
> ------------------------------------------------------------------
>
>                 Key: SPARK-13730
>                 URL: https://issues.apache.org/jira/browse/SPARK-13730
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 2.0.0
>            Reporter: Franklyn Dsouza
>            Priority: Critical
>
> Basically, I'm putting nulls into a non-nullable LongType column and applying a transformation to that column; the result is a column with the nulls converted to 0. I have not tested this in Scala.
> Here's an example:
> {code}
> from pyspark.sql import DataFrame, types, functions as F
> sql_schema = types.StructType([
>   types.StructField("a", types.LongType(), True),
>   types.StructField("b", types.StringType(), True),
> ])
> df = sqlCtx.createDataFrame([
>     (1, "one"),
>     (None, "two"),
> ], sql_schema)
> # Everything is fine here
> df.collect() # [Row(a=1, b=u'one'), Row(a=None, b=u'two')]
> def assert_not_null(val):
>     return val
> udf = F.udf(assert_not_null, types.LongType())
> df = df.withColumnRenamed('a', "_tmp_col")
> df = df.withColumn('a', udf(df._tmp_col))
> df = df.drop("_tmp_col")
> # None gets converted to 0
> df.collect() # [Row(b=u'one', a=1), Row(b=u'two', a=0)]
> {code}


