Posted to issues@spark.apache.org by "Dongjoon Hyun (Jira)" <ji...@apache.org> on 2020/03/16 22:54:10 UTC

[jira] [Updated] (SPARK-27669) Refactor DataFrameWriter to resolve datasources in a command

     [ https://issues.apache.org/jira/browse/SPARK-27669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-27669:
----------------------------------
    Affects Version/s:     (was: 3.0.0)
                       3.1.0

> Refactor DataFrameWriter to resolve datasources in a command
> ------------------------------------------------------------
>
>                 Key: SPARK-27669
>                 URL: https://issues.apache.org/jira/browse/SPARK-27669
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Eric Liang
>            Priority: Major
>
> Currently, DataFrameWriter.save() does a large amount of ad-hoc analysis (e.g., loading data source classes, validating options, and so on) before executing the command.
> This code runs outside the scope of any SQL execution, so Spark does not track it (e.g., it does not appear in the Spark UI), and df.write operations cannot be manipulated by custom Catalyst rules prior to execution.
> These issues can be largely resolved by creating a command that represents df.write.save/saveAsTable().
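
To illustrate the proposal above, here is a minimal toy sketch in plain Python. It is not Spark code and does not reflect Spark's internals: SaveCommand, Execution, KNOWN_SOURCES, and csv_to_parquet are hypothetical names invented for this example. It only shows the general shape of the idea, assuming the write is first captured as an unresolved command, so that custom rules can rewrite it and source resolution happens inside a tracked execution.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Dict, List

# Hypothetical registry standing in for "loading data source classes".
KNOWN_SOURCES = {"parquet", "json"}


@dataclass(frozen=True)
class SaveCommand:
    """Toy command node: records what the caller asked for, still unresolved."""
    source: str
    path: str
    options: Dict[str, str] = field(default_factory=dict)


# A rule takes a command and returns a (possibly rewritten) command.
Rule = Callable[[SaveCommand], SaveCommand]


class Execution:
    """Toy stand-in for a tracked execution; events mimic what a UI could show."""

    def __init__(self, rules: List[Rule]):
        self.rules = list(rules)
        self.events: List[str] = []

    def run(self, cmd: SaveCommand) -> SaveCommand:
        self.events.append("start")
        # Custom rules see the command before any resolution takes place.
        for rule in self.rules:
            cmd = rule(cmd)
        # Resolution and validation happen inside the tracked execution,
        # so a failure here is recorded rather than lost.
        if cmd.source not in KNOWN_SOURCES:
            self.events.append("failed")
            raise ValueError(f"unknown data source: {cmd.source}")
        self.events.append("resolved")
        self.events.append("end")
        return cmd


def csv_to_parquet(cmd: SaveCommand) -> SaveCommand:
    """Hypothetical custom rule: redirect csv writes to parquet."""
    return replace(cmd, source="parquet") if cmd.source == "csv" else cmd


if __name__ == "__main__":
    execution = Execution([csv_to_parquet])
    final = execution.run(SaveCommand("csv", "/tmp/out"))
    print(final.source, execution.events)
```

In this sketch, the same SaveCommand fails when no rule is registered, and the failure still shows up in the recorded events, which is the tracking benefit the description refers to.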



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
