You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/11/28 02:43:09 UTC

[GitHub] [spark] hvanhovell commented on a diff in pull request #38807: [SPARK-41270][CONNECT] Add Catalog tableExists and databaseExists in Connect proto

hvanhovell commented on code in PR #38807:
URL: https://github.com/apache/spark/pull/38807#discussion_r1033075780


##########
connector/connect/src/main/protobuf/spark/connect/base.proto:
##########
@@ -193,5 +233,8 @@ service SparkConnectService {
 
   // Analyzes a query and returns a [[AnalyzeResponse]] containing metadata about the query.
   rpc AnalyzePlan(AnalyzePlanRequest) returns (AnalyzePlanResponse) {}
+
+  // Execute a catalog operation and returns a [[CatalogResponse]] which contains results if there is any
+  rpc Catalog(CatalogRequest) returns (CatalogResponse) {}

Review Comment:
   Typically the name of an RPC is/contains a verb, you are executing something hence the verb. It tends to be a smell if it isn't. In this particular case I understand you are trying to model all catalog operations through this single RPC. I guess that is possible, but I wonder whether is worth the amount of clutter it introduces in the CatalogRequest/CatalogResponse methods.
   
   Another question I have would be how we are doing on efforts to create an information schema. If we have concrete plans and they are ready soon we should consider using that instead.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org