Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/10/09 16:17:22 UTC

[GitHub] [spark] rdblue commented on a change in pull request #26071: [SPARK-29412][SQL] refine the document of v2 session catalog config

rdblue commented on a change in pull request #26071: [SPARK-29412][SQL] refine the document of v2 session catalog config
URL: https://github.com/apache/spark/pull/26071#discussion_r333107075
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
 ##########
 @@ -1976,11 +1977,19 @@ object SQLConf {
     .stringConf
     .createOptional
 
-  val V2_SESSION_CATALOG = buildConf("spark.sql.catalog.session")
-      .doc("A catalog implementation that will be used in place of the Spark built-in session " +
-        "catalog for v2 operations. The implementation may extend `CatalogExtension` to be " +
-        "passed the Spark built-in session catalog, so that it may delegate calls to the " +
-        "built-in session catalog.")
+  val V2_SESSION_CATALOG_IMPLEMENTATION =
+    buildConf(s"spark.sql.catalog.${CatalogManager.SESSION_CATALOG_NAME}")
+      .doc("A catalog implementation that will be used in place of the Spark Catalog for v2 " +
+        "operations (e.g. create table using a v2 source, alter a v2 table). The Spark Catalog " +
+        "is the current catalog by default, and supports all kinds of catalog operations like " +
+        "CREATE TABLE USING v1/v2 source, VIEW/FUNCTION related operations, etc. This config is " +
+        "used to extend the Spark Catalog and inject custom logic into v2 operations, while other " +
+        "operations still go through the Spark Catalog. The catalog implementation specified " +
+        "by this config should extend `CatalogExtension` to be passed the Spark Catalog, " +
+        "so that it can delegate calls to the Spark Catalog. Otherwise, the implementation " +
+        "should figure out a way to access the Spark Catalog or its underlying meta-store " +
+        "by itself. It's important to make the implementation share the underlying meta-store " +
+        "of the Spark Catalog and act as an extension, instead of a separate catalog.")
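
For context, a user would point this config at a catalog class when building a session. A minimal sketch (the class name `com.example.MyV2SessionCatalog` is hypothetical; `spark_catalog` is the value of `CatalogManager.SESSION_CATALOG_NAME` interpolated into the key above):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical class name; the implementation must expose Spark's v2 catalog
// API, ideally by extending CatalogExtension as the doc string above describes.
val spark = SparkSession.builder()
  .appName("v2-session-catalog-example")
  .master("local[*]")
  .config("spark.sql.catalog.spark_catalog", "com.example.MyV2SessionCatalog")
  .getOrCreate()
```

With this set, v2 operations against the default catalog would be routed to the configured class, while other operations keep going through the built-in session catalog.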
 
 Review comment:
   I think you've identified the right things to point out:
   
   * This controls the implementation that is a v2 interface to the built-in v1 catalog
   * This catalog and the built-in v1 catalog share an identifier namespace and must be consistent
   * To delegate to the default implementation, use `CatalogExtension`
   
   I'd change the wording to be a bit shorter, though. This doesn't need to explain what the catalog interface does, or what the built-in catalog can do. How about this?
   
   > A catalog implementation that will be used as the v2 interface to Spark's built-in v1 catalog, spark_catalog. This catalog shares its identifier namespace with the v1 Spark catalog and must be consistent with it; e.g., if a table can be loaded by the v1 catalog, this catalog must also return the table metadata. To delegate operations to the built-in catalog, implementations can extend `CatalogExtension`.
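
To illustrate the `CatalogExtension` route mentioned above, here is a minimal sketch assuming the Spark 3.x connector catalog API (`DelegatingCatalogExtension` wires up the delegation to the built-in catalog; the class `LoggingSessionCatalog` is made up, and the exact package and method signatures may differ from the snapshot this PR targets):

```scala
import java.util

import org.apache.spark.sql.connector.catalog.{DelegatingCatalogExtension, Identifier, Table}
import org.apache.spark.sql.connector.expressions.Transform
import org.apache.spark.sql.types.StructType

// Intercepts v2 CREATE TABLE calls, then delegates to the built-in session
// catalog, so both share the same identifier namespace and metastore.
class LoggingSessionCatalog extends DelegatingCatalogExtension {
  override def createTable(
      ident: Identifier,
      schema: StructType,
      partitions: Array[Transform],
      properties: util.Map[String, String]): Table = {
    println(s"creating table ${(ident.namespace :+ ident.name).mkString(".")}")
    super.createTable(ident, schema, partitions, properties)
  }
}
```

Setting `spark.sql.catalog.spark_catalog` to this class's fully-qualified name would then route v2 table creation through it, while every other call falls through to the built-in catalog.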

