You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "beliefer (via GitHub)" <gi...@apache.org> on 2023/03/23 02:36:31 UTC

[GitHub] [spark] beliefer opened a new pull request, #40528: [WIP][SPARK-42584][CONNECT] Improve output of Column.explain

beliefer opened a new pull request, #40528:
URL: https://github.com/apache/spark/pull/40528

   ### What changes were proposed in this pull request?
   Currently, connect display the structure of the proto in both the regular and extended version of explain. We should display a more compact sql-a-like string for the regular version.
   
   
   ### Why are the changes needed?
   Improve output of `Column.explain` so as display the same as the explain API of sql expression.
   
   ### Does this PR introduce _any_ user-facing change?
   'No'.
   New feature.
   
   
   ### How was this patch tested?
   New test cases.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on a diff in pull request #40528: [SPARK-42584][CONNECT] Improve output of Column.explain

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40528:
URL: https://github.com/apache/spark/pull/40528#discussion_r1145613769


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala:
##########
@@ -1213,11 +1213,8 @@ class Column private[sql] (private[sql] val expr: proto.Expression) extends Logg
    */
   def explain(extended: Boolean): Unit = {
     // scalastyle:off println
-    if (extended) {
-      println(expr)
-    } else {
-      println(toString)
-    }
+    val builder = new ProtoExpressionSQLBuilder(expr, extended)
+    println(builder.build())

Review Comment:
   @hvanhovell It looks like `extended` won't be useful anymore here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-42584][CONNECT] Improve output of Column.explain [spark]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #40528:
URL: https://github.com/apache/spark/pull/40528#issuecomment-1758724597

   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #40528: [WIP][SPARK-42584][CONNECT] Improve output of Column.explain

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40528:
URL: https://github.com/apache/spark/pull/40528#issuecomment-1480509383

   This PR has not been implemented yet.
   @hvanhovell Could you take a look? Does this one satisfy your expected.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #40528: [SPARK-42584][CONNECT] Improve output of Column.explain

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40528:
URL: https://github.com/apache/spark/pull/40528#issuecomment-1611319877

   ping @hvanhovell Could you take a look?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #40528: [SPARK-42584][CONNECT] Improve output of Column.explain

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40528:
URL: https://github.com/apache/spark/pull/40528#issuecomment-1484985770

   ping @hvanhovell 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #40528: [SPARK-42584][CONNECT] Improve output of Column.explain

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40528:
URL: https://github.com/apache/spark/pull/40528#issuecomment-1591192310

   @hvanhovell Could you take a review ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #40528: [SPARK-42584][CONNECT] Improve output of Column.explain

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40528:
URL: https://github.com/apache/spark/pull/40528#issuecomment-1517382928

   ping @hvanhovell Could you have time to take a look?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-42584][CONNECT] Improve output of Column.explain [spark]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed pull request #40528: [SPARK-42584][CONNECT] Improve output of Column.explain
URL: https://github.com/apache/spark/pull/40528


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #40528: [SPARK-42584][CONNECT] Improve output of Column.explain

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40528:
URL: https://github.com/apache/spark/pull/40528#issuecomment-1558462353

   @hvanhovell Could you take a look again?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #40528: [SPARK-42584][CONNECT] Improve output of Column.explain

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #40528:
URL: https://github.com/apache/spark/pull/40528#discussion_r1250433691


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/expressions/ProtoExpressionSQLBuilder.scala:
##########
@@ -0,0 +1,213 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.expressions
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.connect.common.{DataTypeProtoConverter, InvalidPlanInput, LiteralValueProtoConverter}
+
+/**
+ * The builder to generate SQL from proto expressions.
+ */
+class ProtoExpressionSQLBuilder(expr: proto.Expression, extended: Boolean) {
+
+  def build(): String = generate(expr)
+
+  private def generate(expr: proto.Expression): String = {
+    expr.getExprTypeCase match {
+      case proto.Expression.ExprTypeCase.LITERAL => visitLiteral(expr.getLiteral)
+      case proto.Expression.ExprTypeCase.UNRESOLVED_ATTRIBUTE =>
+        visitUnresolvedAttribute(expr.getUnresolvedAttribute)
+      case proto.Expression.ExprTypeCase.UNRESOLVED_FUNCTION =>
+        visitUnresolvedFunction(expr.getUnresolvedFunction)
+      case proto.Expression.ExprTypeCase.UNRESOLVED_EXTRACT_VALUE =>
+        visitUnresolvedExtractValue(expr.getUnresolvedExtractValue)
+      case proto.Expression.ExprTypeCase.UPDATE_FIELDS =>
+        visitUpdateFields(expr.getUpdateFields)
+      case proto.Expression.ExprTypeCase.ALIAS => visitAlias(expr.getAlias)
+      case proto.Expression.ExprTypeCase.CAST =>
+        val cast = expr.getCast
+        cast.getCastToTypeCase match {
+          case proto.Expression.Cast.CastToTypeCase.TYPE =>
+            visitCast(expr.getCast)
+          case _ => visitUnexpectedExpr(expr)
+        }
+      case proto.Expression.ExprTypeCase.SORT_ORDER => visitSortOrder(expr.getSortOrder)
+      case _ =>
+        throw InvalidPlanInput(
+          s"Expression with ID: ${expr.getExprTypeCase.getNumber} is not supported")
+    }
+  }
+
+  private def visitLiteral(literal: proto.Expression.Literal): String = {
+    val value = LiteralValueProtoConverter.toCatalystValue(literal)
+    literal.getLiteralTypeCase match {
+      case proto.Expression.Literal.LiteralTypeCase.STRING =>
+        s"'$value'"
+      case _ =>
+        s"$value"
+    }
+  }
+
+  private def visitUnresolvedAttribute(attr: proto.Expression.UnresolvedAttribute): String =
+    attr.getUnparsedIdentifier
+
+  private def visitUnresolvedFunction(fun: proto.Expression.UnresolvedFunction): String = {
+    val functionName = fun.getFunctionName
+    functionName match {
+      case "=" | "<=>" | "<" | "<=" | ">" | ">=" =>
+        visitBinaryComparison(
+          functionName,
+          generate(fun.getArguments(0)),
+          generate(fun.getArguments(1)))
+      case "+" | "-" | "*" | "/" | "%" | "&" | "|" | "^" =>
+        visitBinaryArithmetic(
+          functionName,
+          generate(fun.getArguments(0)),
+          generate(fun.getArguments(1)))
+      case "and" | "or" =>
+        visitOr(functionName, generate(fun.getArguments(0)), generate(fun.getArguments(1)))
+      case "negative" =>
+        visitUnaryArithmetic("-", generate(fun.getArguments(0)))
+      case "!" =>
+        visitUnaryArithmetic("NOT", generate(fun.getArguments(0)))
+      case "when" =>
+        visitCaseWhen(fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "isNull" =>
+        visitIsNull(generate(fun.getArguments(0)))
+      case "isNotNull" =>
+        visitIsNotNull(generate(fun.getArguments(0)))
+      case "in" =>
+        visitJoin(fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "like" =>
+        visitLike(generate(fun.getArguments(0)), generate(fun.getArguments(1)))
+      case "isNaN" =>
+        visitCommonFunction("isnan", fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "rlike" =>
+        visitCommonFunction("RLIKE", fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "substr" =>
+        visitCommonFunction("substring", fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "ilike" | "contains" | "startswith" | "endswith" =>
+        visitCommonFunction(functionName, fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      // TODO: Support more functions
+    }
+  }
+
+  private def visitBinaryComparison(name: String, l: String, r: String): String =
+    s"($l $name $r)"

Review Comment:
   I think this is too much for cosmetic changes, and even with this it can't address all the cases, right? e.g., unresolved column such as `a AS b`. In non Spark Connect, it will be shown `b` IIRC.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on a diff in pull request #40528: [WIP][SPARK-42584][CONNECT] Improve output of Column.explain

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40528:
URL: https://github.com/apache/spark/pull/40528#discussion_r1145613769


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala:
##########
@@ -1213,11 +1213,8 @@ class Column private[sql] (private[sql] val expr: proto.Expression) extends Logg
    */
   def explain(extended: Boolean): Unit = {
     // scalastyle:off println
-    if (extended) {
-      println(expr)
-    } else {
-      println(toString)
-    }
+    val builder = new ProtoExpressionSQLBuilder(expr, extended)
+    println(builder.build())

Review Comment:
   @hvanhovell It looks like `extended` won't be useful anymore.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on a diff in pull request #40528: [SPARK-42584][CONNECT] Improve output of Column.explain

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40528:
URL: https://github.com/apache/spark/pull/40528#discussion_r1250596220


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/expressions/ProtoExpressionSQLBuilder.scala:
##########
@@ -0,0 +1,213 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.expressions
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.connect.common.{DataTypeProtoConverter, InvalidPlanInput, LiteralValueProtoConverter}
+
+/**
+ * The builder to generate SQL from proto expressions.
+ */
+class ProtoExpressionSQLBuilder(expr: proto.Expression, extended: Boolean) {
+
+  def build(): String = generate(expr)
+
+  private def generate(expr: proto.Expression): String = {
+    expr.getExprTypeCase match {
+      case proto.Expression.ExprTypeCase.LITERAL => visitLiteral(expr.getLiteral)
+      case proto.Expression.ExprTypeCase.UNRESOLVED_ATTRIBUTE =>
+        visitUnresolvedAttribute(expr.getUnresolvedAttribute)
+      case proto.Expression.ExprTypeCase.UNRESOLVED_FUNCTION =>
+        visitUnresolvedFunction(expr.getUnresolvedFunction)
+      case proto.Expression.ExprTypeCase.UNRESOLVED_EXTRACT_VALUE =>
+        visitUnresolvedExtractValue(expr.getUnresolvedExtractValue)
+      case proto.Expression.ExprTypeCase.UPDATE_FIELDS =>
+        visitUpdateFields(expr.getUpdateFields)
+      case proto.Expression.ExprTypeCase.ALIAS => visitAlias(expr.getAlias)
+      case proto.Expression.ExprTypeCase.CAST =>
+        val cast = expr.getCast
+        cast.getCastToTypeCase match {
+          case proto.Expression.Cast.CastToTypeCase.TYPE =>
+            visitCast(expr.getCast)
+          case _ => visitUnexpectedExpr(expr)
+        }
+      case proto.Expression.ExprTypeCase.SORT_ORDER => visitSortOrder(expr.getSortOrder)
+      case _ =>
+        throw InvalidPlanInput(
+          s"Expression with ID: ${expr.getExprTypeCase.getNumber} is not supported")
+    }
+  }
+
+  private def visitLiteral(literal: proto.Expression.Literal): String = {
+    val value = LiteralValueProtoConverter.toCatalystValue(literal)
+    literal.getLiteralTypeCase match {
+      case proto.Expression.Literal.LiteralTypeCase.STRING =>
+        s"'$value'"
+      case _ =>
+        s"$value"
+    }
+  }
+
+  private def visitUnresolvedAttribute(attr: proto.Expression.UnresolvedAttribute): String =
+    attr.getUnparsedIdentifier
+
+  private def visitUnresolvedFunction(fun: proto.Expression.UnresolvedFunction): String = {
+    val functionName = fun.getFunctionName
+    functionName match {
+      case "=" | "<=>" | "<" | "<=" | ">" | ">=" =>
+        visitBinaryComparison(
+          functionName,
+          generate(fun.getArguments(0)),
+          generate(fun.getArguments(1)))
+      case "+" | "-" | "*" | "/" | "%" | "&" | "|" | "^" =>
+        visitBinaryArithmetic(
+          functionName,
+          generate(fun.getArguments(0)),
+          generate(fun.getArguments(1)))
+      case "and" | "or" =>
+        visitOr(functionName, generate(fun.getArguments(0)), generate(fun.getArguments(1)))
+      case "negative" =>
+        visitUnaryArithmetic("-", generate(fun.getArguments(0)))
+      case "!" =>
+        visitUnaryArithmetic("NOT", generate(fun.getArguments(0)))
+      case "when" =>
+        visitCaseWhen(fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "isNull" =>
+        visitIsNull(generate(fun.getArguments(0)))
+      case "isNotNull" =>
+        visitIsNotNull(generate(fun.getArguments(0)))
+      case "in" =>
+        visitJoin(fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "like" =>
+        visitLike(generate(fun.getArguments(0)), generate(fun.getArguments(1)))
+      case "isNaN" =>
+        visitCommonFunction("isnan", fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "rlike" =>
+        visitCommonFunction("RLIKE", fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "substr" =>
+        visitCommonFunction("substring", fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "ilike" | "contains" | "startswith" | "endswith" =>
+        visitCommonFunction(functionName, fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      // TODO: Support more functions
+    }
+  }
+
+  private def visitBinaryComparison(name: String, l: String, r: String): String =
+    s"($l $name $r)"

Review Comment:
   Yeah. we can add each case on the way. This implementation based on the discussion https://github.com/apache/spark/pull/40467#discussion_r1144196759



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on a diff in pull request #40528: [SPARK-42584][CONNECT] Improve output of Column.explain

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40528:
URL: https://github.com/apache/spark/pull/40528#discussion_r1250598446


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/expressions/ProtoExpressionSQLBuilder.scala:
##########
@@ -0,0 +1,213 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.expressions
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.connect.common.{DataTypeProtoConverter, InvalidPlanInput, LiteralValueProtoConverter}
+
+/**
+ * The builder to generate SQL from proto expressions.
+ */
+class ProtoExpressionSQLBuilder(expr: proto.Expression, extended: Boolean) {
+
+  def build(): String = generate(expr)
+
+  private def generate(expr: proto.Expression): String = {
+    expr.getExprTypeCase match {
+      case proto.Expression.ExprTypeCase.LITERAL => visitLiteral(expr.getLiteral)
+      case proto.Expression.ExprTypeCase.UNRESOLVED_ATTRIBUTE =>
+        visitUnresolvedAttribute(expr.getUnresolvedAttribute)
+      case proto.Expression.ExprTypeCase.UNRESOLVED_FUNCTION =>
+        visitUnresolvedFunction(expr.getUnresolvedFunction)
+      case proto.Expression.ExprTypeCase.UNRESOLVED_EXTRACT_VALUE =>
+        visitUnresolvedExtractValue(expr.getUnresolvedExtractValue)
+      case proto.Expression.ExprTypeCase.UPDATE_FIELDS =>
+        visitUpdateFields(expr.getUpdateFields)
+      case proto.Expression.ExprTypeCase.ALIAS => visitAlias(expr.getAlias)
+      case proto.Expression.ExprTypeCase.CAST =>
+        val cast = expr.getCast
+        cast.getCastToTypeCase match {
+          case proto.Expression.Cast.CastToTypeCase.TYPE =>
+            visitCast(expr.getCast)
+          case _ => visitUnexpectedExpr(expr)
+        }
+      case proto.Expression.ExprTypeCase.SORT_ORDER => visitSortOrder(expr.getSortOrder)
+      case _ =>
+        throw InvalidPlanInput(
+          s"Expression with ID: ${expr.getExprTypeCase.getNumber} is not supported")
+    }
+  }
+
+  private def visitLiteral(literal: proto.Expression.Literal): String = {
+    val value = LiteralValueProtoConverter.toCatalystValue(literal)
+    literal.getLiteralTypeCase match {
+      case proto.Expression.Literal.LiteralTypeCase.STRING =>
+        s"'$value'"
+      case _ =>
+        s"$value"
+    }
+  }
+
+  private def visitUnresolvedAttribute(attr: proto.Expression.UnresolvedAttribute): String =
+    attr.getUnparsedIdentifier
+
+  private def visitUnresolvedFunction(fun: proto.Expression.UnresolvedFunction): String = {
+    val functionName = fun.getFunctionName
+    functionName match {
+      case "=" | "<=>" | "<" | "<=" | ">" | ">=" =>
+        visitBinaryComparison(
+          functionName,
+          generate(fun.getArguments(0)),
+          generate(fun.getArguments(1)))
+      case "+" | "-" | "*" | "/" | "%" | "&" | "|" | "^" =>
+        visitBinaryArithmetic(
+          functionName,
+          generate(fun.getArguments(0)),
+          generate(fun.getArguments(1)))
+      case "and" | "or" =>
+        visitOr(functionName, generate(fun.getArguments(0)), generate(fun.getArguments(1)))
+      case "negative" =>
+        visitUnaryArithmetic("-", generate(fun.getArguments(0)))
+      case "!" =>
+        visitUnaryArithmetic("NOT", generate(fun.getArguments(0)))
+      case "when" =>
+        visitCaseWhen(fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "isNull" =>
+        visitIsNull(generate(fun.getArguments(0)))
+      case "isNotNull" =>
+        visitIsNotNull(generate(fun.getArguments(0)))
+      case "in" =>
+        visitJoin(fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "like" =>
+        visitLike(generate(fun.getArguments(0)), generate(fun.getArguments(1)))
+      case "isNaN" =>
+        visitCommonFunction("isnan", fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "rlike" =>
+        visitCommonFunction("RLIKE", fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "substr" =>
+        visitCommonFunction("substring", fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      case "ilike" | "contains" | "startswith" | "endswith" =>
+        visitCommonFunction(functionName, fun.getArgumentsList.asScala.toSeq.map(generate(_)))
+      // TODO: Support more functions
+    }
+  }
+
+  private def visitBinaryComparison(name: String, l: String, r: String): String =
+    s"($l $name $r)"

Review Comment:
   The current implementation doesn't guarantee the coverage for all the cases. But we can add them in following PRs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org