Posted to issues@spark.apache.org by "Nicholas Chammas (JIRA)" <ji...@apache.org> on 2016/11/04 19:57:58 UTC

[jira] [Created] (SPARK-18277) na.fill() and friends should work on struct fields

Nicholas Chammas created SPARK-18277:
----------------------------------------

             Summary: na.fill() and friends should work on struct fields
                 Key: SPARK-18277
                 URL: https://issues.apache.org/jira/browse/SPARK-18277
             Project: Spark
          Issue Type: Improvement
          Components: SQL
            Reporter: Nicholas Chammas
            Priority: Minor


It appears that {{na.fill()}} and friends only operate on top-level columns, so you cannot use them to quickly modify struct fields.

For example:

{code}
>>> from pyspark.sql import Row
>>> df = spark.createDataFrame([Row(a=Row(b='yeah yeah'), c='alright'), Row(a=Row(b=None), c=None)])
>>> df.printSchema()
root
 |-- a: struct (nullable = true)
 |    |-- b: string (nullable = true)
 |-- c: string (nullable = true)

>>> df.show()
+-----------+-------+
|          a|      c|
+-----------+-------+
|[yeah yeah]|alright|
|     [null]|   null|
+-----------+-------+

>>> df.na.fill('').show()
+-----------+-------+
|          a|      c|
+-----------+-------+
|[yeah yeah]|alright|
|     [null]|       |
+-----------+-------+
{code}

{{c}} got filled in, but {{a.b}} didn't.

I don't know if it's "appropriate", but it would be nice if {{fill()}} and friends worked automatically on struct fields.

As things are today, it appears that you have to manually unpack and rebuild structs to do things like fill in missing field values.


