You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@spark.apache.org by ma...@apache.org on 2014/12/30 21:18:57 UTC

spark git commit: [SPARK-4930][SQL][DOCS]Update SQL programming guide, CACHE TABLE is eager

Repository: spark
Updated Branches:
  refs/heads/master f7a41a0e7 -> 2deac748b


[SPARK-4930][SQL][DOCS]Update SQL programming guide, CACHE TABLE is eager

`CACHE TABLE tbl` is now __eager__ by default not __lazy__

Author: luogankun <lu...@gmail.com>

Closes #3773 from luogankun/SPARK-4930 and squashes the following commits:

cc17b7d [luogankun] [SPARK-4930][SQL][DOCS]Update SQL programming guide, add CACHE [LAZY] TABLE [AS SELECT] ...
bffe0e8 [luogankun] [SPARK-4930][SQL][DOCS]Update SQL programming guide, CACHE TABLE tbl is eager


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2deac748
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2deac748
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2deac748

Branch: refs/heads/master
Commit: 2deac748b4e1245c2cb9bd43ad87c80d6d130a83
Parents: f7a41a0
Author: luogankun <lu...@gmail.com>
Authored: Tue Dec 30 12:18:55 2014 -0800
Committer: Michael Armbrust <mi...@databricks.com>
Committed: Tue Dec 30 12:18:55 2014 -0800

----------------------------------------------------------------------
 docs/sql-programming-guide.md | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/2deac748/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index 1b5fde9..729045b 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1007,12 +1007,11 @@ let user control table caching explicitly:
     CACHE TABLE logs_last_month;
     UNCACHE TABLE logs_last_month;
 
-**NOTE:** `CACHE TABLE tbl` is lazy, similar to `.cache` on an RDD. This command only marks `tbl` to ensure that
-partitions are cached when calculated but doesn't actually cache it until a query that touches `tbl` is executed.
-To force the table to be cached, you may simply count the table immediately after executing `CACHE TABLE`:
+**NOTE:** `CACHE TABLE tbl` is now __eager__ by default not __lazy__. Don’t need to trigger cache materialization manually anymore.
 
-    CACHE TABLE logs_last_month;
-    SELECT COUNT(1) FROM logs_last_month;
+Spark SQL newly introduced a statement to let user control table caching whether or not lazy since Spark 1.2.0:
+
+	CACHE [LAZY] TABLE [AS SELECT] ...
 
 Several caching related features are not supported yet:
 


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org