Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/11/22 15:04:41 UTC

[GitHub] [spark] hvanhovell commented on a diff in pull request #38742: [SPARK-41216][CONNECT][PYTHON] Make AnalyzePlan support multiple analysis tasks And implement isLocal/isStreaming/printSchema/semanticHash/sameSemantics/inputFiles

hvanhovell commented on code in PR #38742:
URL: https://github.com/apache/spark/pull/38742#discussion_r1029446204


##########
connector/connect/src/main/protobuf/spark/connect/base.proto:
##########
@@ -100,18 +70,138 @@ message AnalyzePlanRequest {
   // logging purposes and will not be interpreted by the server.
   optional string client_type = 4;
 
-  // (Optional) Get the explain string of the plan.
-  Explain explain = 5;
+  repeated AnalysisTask tasks = 5;
+
+  message AnalysisTask {
+    oneof task {
+      // Get the schema of the plan.
+      Schema schema = 1;
+
+      // Whether the plan can be evaluated locally (without Spark executors).
+      IsLocal is_local = 2;
+
+      // Whether the plan is streaming.
+      IsStreaming is_streaming = 3;
+
+      // Get the explain string of the plan.
+      Explain explain = 4;
+
+      // Get the tree string of the schema.
+      TreeString tree_string = 5;
+
+      // Get the input files.
+      InputFiles input_files = 6;
+
+      // Get the semantic hash of the plan.
+      SemanticHash semantic_hash = 7;

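For orientation, here is a minimal, hypothetical sketch of how a client could fill the new repeated `tasks` field to request several analyses in a single AnalyzePlanRequest. The generated module name (`base_pb2`) and the surrounding client plumbing are assumptions for illustration, not part of this PR; required request fields such as the plan itself are omitted.

    # Hypothetical sketch only: assumes the messages above are compiled to
    # Python protobuf bindings; the module name `base_pb2` is an assumption,
    # and the plan/session fields of the request are omitted for brevity.
    import base_pb2 as pb

    request = pb.AnalyzePlanRequest()

    # One request can now carry several analysis tasks at once.
    schema_task = request.tasks.add()
    schema_task.schema.SetInParent()           # ask for the schema

    streaming_task = request.tasks.add()
    streaming_task.is_streaming.SetInParent()  # ask whether the plan is streaming

    explain_task = request.tasks.add()
    explain_task.explain.SetInParent()         # ask for the explain string
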
Review Comment:
   Do we really want to expose this in Connect? The problem is hash stability: the same client can connect to different Spark versions and get different hashes for the same plan.
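
As a concrete illustration of the concern above (a hedged sketch using the existing non-Connect PySpark API, not this PR's code path): DataFrame.semanticHash() returns a value derived from Spark's internal analyzed plan, so servers running different Spark versions may report different numbers for the same query.

    # Sketch with classic PySpark (not Spark Connect): semanticHash() is
    # computed from the analyzed logical plan, an internal representation
    # that can change between Spark versions, so the value is not a stable
    # cross-version identifier for "the same plan".
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.range(10).filter("id > 5")

    print(df.semanticHash())  # version-dependent; do not persist or compare
                              # across clusters running different Spark versions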



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org