Posted to issues@spark.apache.org by "Erik Parmann (Jira)" <ji...@apache.org> on 2022/03/15 10:12:00 UTC

[jira] [Created] (SPARK-38554) Dataframe API: withColumn "comment" parameter

Erik Parmann created SPARK-38554:
------------------------------------

             Summary: Dataframe API: withColumn "comment" parameter
                 Key: SPARK-38554
                 URL: https://issues.apache.org/jira/browse/SPARK-38554
             Project: Spark
          Issue Type: Wish
          Components: PySpark, Spark Core
    Affects Versions: 3.2.1
            Reporter: Erik Parmann


I often find that the right time to document a column in a dataframe is when I create it. It would be nice if withColumn took an optional comment parameter, so that one could write, for example:

 
{code:python}
df = df.withColumn("tax", F.col("salary")*F.col("tax_percentage"), comment="The amount of tax paid in dollars.")
{code}
Something similar is possible with alias on the column expression, but as far as I know the equivalent is the much clunkier:
{code:python}
df = df.withColumn("tax", (F.col("salary")*F.col("tax_percentage")).alias("tax", metadata={"comment": "The amount of tax paid in dollars."}))
{code}
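Until something like the requested parameter exists, the alias-based workaround can be wrapped in a small helper. The following is only a sketch: the name with_column_comment and the demo function are hypothetical and not part of PySpark, and it assumes that Column.alias accepts a metadata keyword and that the comment is stored under the "comment" metadata key.

```python
def with_column_comment(df, name, col, comment):
    """Add column `name` to `df` and attach `comment` as column metadata.

    Hypothetical helper, not part of the PySpark API. It relies on
    Column.alias(name, metadata=...) to carry the metadata through
    withColumn.
    """
    return df.withColumn(name, col.alias(name, metadata={"comment": comment}))


def demo():
    # Imported inside the function so the helper above can be defined
    # without a Spark installation.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame([(1000.0, 0.25)], ["salary", "tax_percentage"])
    df = with_column_comment(
        df,
        "tax",
        F.col("salary") * F.col("tax_percentage"),
        "The amount of tax paid in dollars.",
    )
    # The comment ends up in the metadata of the schema field.
    print(df.schema["tax"].metadata)
    spark.stop()
```

Calling demo() on a machine with PySpark installed prints the metadata attached to the new column. The call site is about as compact as the proposed comment parameter, which is the point of the request.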


