Posted to issues@spark.apache.org by "Hu Fuwang (Jira)" <ji...@apache.org> on 2019/10/28 05:12:00 UTC

[jira] [Resolved] (SPARK-29615) Add insertInto method with byName parameter in DataFrameWriter

     [ https://issues.apache.org/jira/browse/SPARK-29615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hu Fuwang resolved SPARK-29615.
-------------------------------
    Resolution: Not A Problem

> Add insertInto method with byName parameter in DataFrameWriter
> --------------------------------------------------------------
>
>                 Key: SPARK-29615
>                 URL: https://issues.apache.org/jira/browse/SPARK-29615
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Hu Fuwang
>            Priority: Major
>
> Currently, insertion through the DataFrameWriter.insertInto method ignores column names and uses position-based resolution. Because DataFrameWriter has only one public insertInto method, Spark users may not check the method's description and may assume that Spark matches columns by name. In that case, the wrong column may be used as the partition column, which can cause problems (e.g., a huge number of files/folders may be created in the Hive table tmp location).
> We propose adding a new insertInto method to DataFrameWriter with a byName parameter that lets Spark users specify whether to match columns by name.
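
To illustrate the difference described above, here is a minimal sketch in plain Python that models position-based and name-based column resolution. It is not Spark code: the function names, the sample table layout, and the sample row are all hypothetical, and the name-based function only approximates what a byName option might do.

```python
# Pure-Python model of the two resolution strategies; this is not Spark code.
# All names (resolve_by_position, resolve_by_name, the sample columns) are hypothetical.

def resolve_by_position(table_columns, df_columns, row):
    """Assign the i-th DataFrame value to the i-th table column, ignoring names."""
    if len(table_columns) != len(df_columns):
        raise ValueError("column count mismatch")
    return dict(zip(table_columns, row))


def resolve_by_name(table_columns, df_columns, row):
    """Assign each DataFrame value to the table column with the same name."""
    if set(table_columns) != set(df_columns):
        raise ValueError("column names do not match")
    by_name = dict(zip(df_columns, row))
    return {col: by_name[col] for col in table_columns}


# Assumed layout: "country" is the partition column and comes last in the table.
table_columns = ["id", "event_time", "country"]
# The DataFrame has the same column names, but in a different order.
df_columns = ["id", "country", "event_time"]
row = (1, "US", "2019-10-28 05:12:00")

positional = resolve_by_position(table_columns, df_columns, row)
named = resolve_by_name(table_columns, df_columns, row)

# With positional resolution the timestamp ends up in the partition column.
print(positional["country"])
# With name-based resolution the partition column receives the country code.
print(named["country"])
```

In this model, positional resolution would create one partition per distinct timestamp, which is the kind of file/folder explosion the description warns about.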



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
