Posted to issues@spark.apache.org by "Weichen Xu (Jira)" <ji...@apache.org> on 2021/09/08 01:37:00 UTC

[jira] [Resolved] (SPARK-36642) Add df.withMetadata: a syntax sugar to update the metadata of a dataframe

     [ https://issues.apache.org/jira/browse/SPARK-36642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weichen Xu resolved SPARK-36642.
--------------------------------
    Fix Version/s: 3.3.0
       Resolution: Fixed

Issue resolved by pull request 33853
[https://github.com/apache/spark/pull/33853]

> Add df.withMetadata: a syntax sugar to update the metadata of a dataframe
> -------------------------------------------------------------------------
>
>                 Key: SPARK-36642
>                 URL: https://issues.apache.org/jira/browse/SPARK-36642
>             Project: Spark
>          Issue Type: Story
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Liang Zhang
>            Assignee: Liang Zhang
>            Priority: Major
>             Fix For: 3.3.0
>
>
> To make it easy to use and modify semantic annotations, we want a shorter API for updating column metadata in a dataframe.
> Currently we have
> {code:python}
> df.withColumn("col1", col("col1").alias("col1", metadata=metadata))
> {code}
> to update the metadata without changing the column name, but this is too verbose. We want a syntax sugar API
> {code:python}
> df.withMetadata("col1", metadata=metadata)
> {code}
> to achieve the same functionality.
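> As a minimal, runnable PySpark sketch of both forms (the sample data and the "semantic_type" annotation key are illustrative only, not part of the proposal):
> {code:python}
> from pyspark.sql import SparkSession
> from pyspark.sql.functions import col
>
> spark = SparkSession.builder.getOrCreate()
> df = spark.createDataFrame([(1, "a"), (2, "b")], ["col1", "col2"])
> metadata = {"semantic_type": "categorical"}  # hypothetical annotation
>
> # Existing, verbose form: re-alias the column under its own name.
> df1 = df.withColumn("col1", col("col1").alias("col1", metadata=metadata))
>
> # Proposed shorthand (Spark 3.3+):
> df2 = df.withMetadata("col1", metadata)
>
> # Both attach the same metadata to the column's schema field.
> assert df1.schema["col1"].metadata == df2.schema["col1"].metadata
> {code}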
> A bit of background on how often the metadata gets updated: we are working on inferring semantic data types for use in AutoML, and we store the resulting semantic annotation in the column metadata. In many cases we will ask the user to update the metadata, either to correct a wrong inference or to add an annotation manually where the inference is weak.
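> For example, correcting a wrong inference could look like this (a sketch continuing the snippet above; the annotation key is again hypothetical):
> {code:python}
> # Read the current metadata, fix the annotation, and write it back.
> fixed = dict(df2.schema["col1"].metadata)  # exposed as a plain dict in PySpark
> fixed["semantic_type"] = "numeric"         # override the inferred value
> df3 = df2.withMetadata("col1", fixed)
> {code}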



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org