Posted to commits@spark.apache.org by rx...@apache.org on 2016/12/07 03:03:26 UTC

spark git commit: Update Spark documentation to provide information on how to create External Table

Repository: spark
Updated Branches:
  refs/heads/master 539bb3cf9 -> 01c7c6b88


Update Spark documentation to provide information on how to create External Table

## What changes were proposed in this pull request?
Currently, `saveAsTable` does not provide an API to save a `DataFrame` as an external table. However, the same result can be achieved by setting an option on the `DataFrameWriter` before calling `saveAsTable`: the key is the String `"path"` and the value is another String giving the location of the external table.
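
For example, a minimal sketch of this usage (the SparkSession, DataFrame, table name, and external path below are illustrative, not part of this change):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical session and data; any DataFrame works the same way.
val spark = SparkSession.builder()
  .appName("ExternalTableExample")
  .enableHiveSupport()
  .getOrCreate()

val df: DataFrame = spark.read.json("examples/src/main/resources/people.json")

// Providing a "path" option before saveAsTable makes the resulting table
// external: dropping it later removes only the metastore entry, not the files.
df.write
  .option("path", "/some/external/location")  // illustrative location
  .saveAsTable("people_external")             // illustrative table name
```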

## How was this patch tested?
The updated documentation was reviewed for formatting and content after pushing the change to the branch.
![updated documentation](https://cloud.githubusercontent.com/assets/15376052/20953147/4cfcf308-bc57-11e6-807c-e21fb774a760.PNG)

Author: c-sahuja <sa...@cloudera.com>

Closes #16185 from c-sahuja/createExternalTable.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/01c7c6b8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/01c7c6b8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/01c7c6b8

Branch: refs/heads/master
Commit: 01c7c6b884244ac1a57e332c3aea669488ad9dc0
Parents: 539bb3c
Author: c-sahuja <sa...@cloudera.com>
Authored: Tue Dec 6 19:03:23 2016 -0800
Committer: Reynold Xin <rx...@databricks.com>
Committed: Tue Dec 6 19:03:23 2016 -0800

----------------------------------------------------------------------
 docs/sql-programming-guide.md | 5 +++++
 1 file changed, 5 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/01c7c6b8/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index e59c327..6287e2b 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -526,6 +526,11 @@ By default `saveAsTable` will create a "managed table", meaning that the locatio
 be controlled by the metastore. Managed tables will also have their data deleted automatically
 when a table is dropped.
 
+Currently, `saveAsTable` does not expose an API for creating an "external table" from a `DataFrame`.
+However, this can be achieved by providing a `path` option to the `DataFrameWriter`, with `path` as the key
+and the location of the external table as its value (a String), when saving the table with `saveAsTable`.
+When an external table is dropped, only its metadata is removed.
+
 ## Parquet Files
 
 [Parquet](http://parquet.io) is a columnar format that is supported by many other data processing systems.


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org