Posted to commits@spark.apache.org by yh...@apache.org on 2015/12/10 03:09:42 UTC
spark git commit: [SPARK-11678][SQL][DOCS] Document basePath in the programming guide.

Repository: spark
Updated Branches:
  refs/heads/master 8770bd121 -> ac8cdf1cd


[SPARK-11678][SQL][DOCS] Document basePath in the programming guide.

This PR adds documentation for `basePath`, a new parameter used by `HadoopFsRelation`.

The compiled doc is shown below.
![image](https://cloud.githubusercontent.com/assets/2072857/11673132/1ba01192-9dcb-11e5-98d9-ac0b4e92e98c.png)

JIRA: https://issues.apache.org/jira/browse/SPARK-11678

Author: Yin Huai <yh...@databricks.com>

Closes #10211 from yhuai/basePathDoc.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ac8cdf1c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ac8cdf1c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ac8cdf1c

Branch: refs/heads/master
Commit: ac8cdf1cdc148bd21290ecf4d4f9874f8c87cc14
Parents: 8770bd1
Author: Yin Huai <yh...@databricks.com>
Authored: Wed Dec 9 18:09:36 2015 -0800
Committer: Yin Huai <yh...@databricks.com>
Committed: Wed Dec 9 18:09:36 2015 -0800

----------------------------------------------------------------------
 docs/sql-programming-guide.md | 7 +++++++
 1 file changed, 7 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/ac8cdf1c/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index 9f87acc..3f9a831 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1233,6 +1233,13 @@ infer the data types of the partitioning columns. For these use cases, the autom
 can be configured by `spark.sql.sources.partitionColumnTypeInference.enabled`, which is default to
 `true`. When type inference is disabled, string type will be used for the partitioning columns.
 
+Starting from Spark 1.6.0, partition discovery only finds partitions under the given paths
+by default. For the above example, if users pass `path/to/table/gender=male` to either 
+`SQLContext.read.parquet` or `SQLContext.read.load`, `gender` will not be considered as a
+partitioning column. If users need to specify the base path that partition discovery
+should start with, they can set `basePath` in the data source options. For example,
+when `path/to/table/gender=male` is the path of the data and
+users set `basePath` to `path/to/table/`, `gender` will be a partitioning column.
 
 ### Schema Merging
 

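The rule described in the added paragraph can be illustrated with a small, self-contained sketch. This is a toy model, not Spark's implementation: the function name `discovered_partitions` and its parameters are hypothetical, and `data_path` is assumed to be the directory that directly holds the data files. The sketch only shows which `key=value` directory segments end up as partitioning columns, depending on where discovery starts.

```python
from typing import Dict, Optional


def discovered_partitions(data_path: str, base_path: Optional[str] = None) -> Dict[str, str]:
    """Toy model (hypothetical helper): partition columns seen for data_path.

    Only `key=value` directory segments located below the starting point
    of discovery are treated as partitioning columns.
    """
    # Without a base path, discovery starts at the data path itself,
    # so directories above it (such as gender=male) are never inspected.
    root = (base_path if base_path is not None else data_path).rstrip("/")
    path = data_path.rstrip("/")
    if path != root and not path.startswith(root + "/"):
        raise ValueError("data_path must be located under base_path")

    # Everything between the base path and the data path is examined.
    relative = path[len(root):].strip("/")
    columns: Dict[str, str] = {}
    for segment in relative.split("/"):
        if "=" in segment:
            name, value = segment.split("=", 1)
            columns[name] = value
    return columns


# Default behaviour: gender is not a partitioning column.
print(discovered_partitions("path/to/table/gender=male"))
# With basePath set to the table root, gender becomes a partitioning column.
print(discovered_partitions("path/to/table/gender=male", "path/to/table/"))
```

In Spark itself, the equivalent call would presumably look like `sqlContext.read.option("basePath", "path/to/table/").parquet("path/to/table/gender=male")`, using the paths from the example in the guide.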
