Posted to issues@spark.apache.org by "Maxim Gekk (Jira)" <ji...@apache.org> on 2020/07/30 18:23:00 UTC

[jira] [Created] (SPARK-32501) Inconsistent NULL conversions to strings

Maxim Gekk created SPARK-32501:
----------------------------------

             Summary: Inconsistent NULL conversions to strings 
                 Key: SPARK-32501
                 URL: https://issues.apache.org/jira/browse/SPARK-32501
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.1.0
            Reporter: Maxim Gekk


1. It is impossible to distinguish an empty string from null in the output of `.show()`, for instance:
{code:scala}
scala> Seq(Seq(""), Seq(null)).toDF().show
+-----+
|value|
+-----+
|   []|
|   []|
+-----+
{code}
2. Inconsistent NULL conversions for top-level values and nested columns, for instance:
{code:scala}
scala> sql("select named_struct('c', null), null").show
+---------------------+----+
|named_struct(c, NULL)|NULL|
+---------------------+----+
|                   []|null|
+---------------------+----+
{code}
3. `.show()` converts values differently from the conversion to Hive strings, so its output differs from that of `spark-sql` (and of the SQL tests):
{code:sql}
spark-sql> select named_struct('c', null) as struct;
{"c":null}
{code}
{code:scala}
scala> sql("select named_struct('c', null) as struct").show
+------+
|struct|
+------+
|    []|
+------+
{code}
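
To illustrate what a consistent conversion could look like, here is a minimal sketch in plain Scala. It does not use Spark and is not Spark's implementation; `NullAwareRender` and `render` are hypothetical names, and the chosen output format (quoted strings, `[...]` for sequences, `{...}` for named fields) is only an assumption made for the example. The point is that null gets the same text at the top level and inside nested values, and that an empty string stays visible.

```scala
// Hypothetical helper for illustration only; not part of Spark.
object NullAwareRender {
  // Renders a value so that null and "" remain distinguishable
  // at every nesting level.
  def render(value: Any): String = value match {
    case null       => "null"              // same text at top level and when nested
    case s: String  => "\"" + s + "\""     // quoting keeps the empty string visible
    case xs: Seq[_] => xs.map(render).mkString("[", ", ", "]")
    case m: Map[_, _] =>                   // stands in for a struct with named fields
      m.map { case (k, v) => s"$k: ${render(v)}" }.mkString("{", ", ", "}")
    case other      => other.toString
  }

  def main(args: Array[String]): Unit = {
    println(render(Seq("")))            // [""]
    println(render(Seq(null)))          // [null]
    println(render(null))               // null
    println(render(Map("c" -> null)))   // {c: null}
  }
}
```

With a scheme like this, the two rows in item 1 would print differently, and the nested null in items 2 and 3 would be rendered the same way as the top-level one.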



--
This message was sent by Atlassian Jira
(v8.3.4#803005)