Posted to commits@spark.apache.org by we...@apache.org on 2017/03/16 00:20:56 UTC
spark git commit: [SPARK-19948] Document that saveAsTable uses catalog as source of truth for table existence.

Repository: spark
Updated Branches:
  refs/heads/master 7d734a658 -> 339b237dc


[SPARK-19948] Document that saveAsTable uses catalog as source of truth for table existence.

It is quirky behaviour that saveAsTable to e.g. a JDBC source with SaveMode other
than Overwrite will nevertheless overwrite the table in the external source,
if that table was not a catalog table.

Author: Juliusz Sompolski <ju...@databricks.com>

Closes #17289 from juliuszsompolski/saveAsTableDoc.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/339b237d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/339b237d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/339b237d

Branch: refs/heads/master
Commit: 339b237dc18d4367b0735236b4b8be2901fcad79
Parents: 7d734a6
Author: Juliusz Sompolski <ju...@databricks.com>
Authored: Thu Mar 16 08:20:47 2017 +0800
Committer: Wenchen Fan <we...@databricks.com>
Committed: Thu Mar 16 08:20:47 2017 +0800

----------------------------------------------------------------------
 .../src/main/scala/org/apache/spark/sql/DataFrameWriter.scala   | 5 +++++
 1 file changed, 5 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/339b237d/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
index deaa800..3e975ef 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
@@ -337,6 +337,11 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
    *    +---+---+
    * }}}
    *
+   * In this method, save mode is used to determine the behavior if the data source table exists in
+   * Spark catalog. We will always overwrite the underlying data of data source (e.g. a table in
+   * JDBC data source) if the table doesn't exist in Spark catalog, and will always append to the
+   * underlying data of data source if the table already exists.
+   *
    * When the DataFrame is created from a non-partitioned `HadoopFsRelation` with a single input
    * path, and the data source provider can be mapped to an existing Hive builtin SerDe (i.e. ORC
    * and Parquet), the table is persisted in a Hive compatible format, which means other systems
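
Note: the rule added in the doc comment above can be sketched as a small decision
function. This is only a toy model in plain Scala, not Spark's implementation. The
names SaveAsTableModel, Mode, Effect and effect are hypothetical, chosen for
illustration. The branch for a table missing from the catalog follows the doc comment.
The branch for a table present in the catalog assumes the usual SaveMode meanings
(Append, Overwrite, ErrorIfExists, Ignore), which the doc comment summarises only
loosely.

```scala
// Toy model of the saveAsTable rule described in the doc comment.
// All names here are hypothetical; this is not Spark code.
object SaveAsTableModel {
  // Stand-ins for Spark's save modes.
  sealed trait Mode
  case object Append extends Mode
  case object Overwrite extends Mode
  case object ErrorIfExists extends Mode
  case object Ignore extends Mode

  // What happens to the underlying data of the external source (e.g. a JDBC table).
  sealed trait Effect
  case object OverwriteSource extends Effect
  case object AppendToSource extends Effect
  case object Fail extends Effect
  case object NoOp extends Effect

  // The catalog, not the external source, decides whether the table "exists".
  def effect(mode: Mode, inCatalog: Boolean): Effect =
    if (!inCatalog) {
      // Per the doc comment: the save mode is not consulted, data is overwritten.
      OverwriteSource
    } else mode match {
      // Assumption: usual SaveMode meanings once the catalog knows the table.
      case Append        => AppendToSource
      case Overwrite     => OverwriteSource
      case ErrorIfExists => Fail
      case Ignore        => NoOp
    }
}
```

Under this model, effect(Append, inCatalog = false) yields OverwriteSource, which is
the quirk the commit message describes: a JDBC table that Spark's catalog does not
know about is overwritten even though the caller asked for Append.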

