You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/12/07 03:21:22 UTC

[GitHub] [spark] zhengruifeng opened a new pull request, #38956: [DO_NOT_MERGE] Implement Column.{when, otherwise} and Function when with UnresolvedFunction

zhengruifeng opened a new pull request, #38956:
URL: https://github.com/apache/spark/pull/38956

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'core/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #38956: [DO_NOT_MERGE] Implement Column.{when, otherwise} and Function when with UnresolvedFunction

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #38956:
URL: https://github.com/apache/spark/pull/38956#discussion_r1041716023


##########
python/pyspark/sql/connect/column.py:
##########
@@ -129,6 +140,53 @@ def name(self) -> str:
         ...
 
 
+class CaseWhen(Expression):
+    def __init__(
+        self, branches: Sequence[Tuple[Expression, Expression]], else_value: Optional[Expression]
+    ):
+
+        assert isinstance(branches, list)
+        for branch in branches:
+            assert (
+                isinstance(branch, tuple)
+                and len(branch) == 2
+                and all(isinstance(expr, Expression) for expr in branch)
+            )
+        self._branches = branches
+
+        if else_value is not None:
+            assert isinstance(else_value, Expression)
+
+        self._else_value = else_value
+
+    def to_plan(self, session: "SparkConnectClient") -> "proto.Expression":
+        # fun = proto.Expression()
+        # fun.unresolved_function.function_name = "when"
+        # for condition, value in self._branches:
+        #     fun.unresolved_function.arguments.extend(condition.to_plan(session))
+        #     fun.unresolved_function.arguments.extend(value.to_plan(session))
+        # if self._else_value is not None:
+        #     fun.unresolved_function.arguments.extend(self._else_value.to_plan(session))
+        # return fun
+
+        args: Sequence[Expression] = []
+        for condition, value in self._branches:
+            args.append(condition)
+            args.append(value)
+
+        if self._else_value is not None:
+            args.append(self._else_value)
+
+        unresolved_function = UnresolvedFunction(name="when", args=args)
+
+        return unresolved_function.to_plan(session)

Review Comment:
   both `cdf.select(CF.when(cdf.a == 0, 1.0))` and `cdf.select(CF.when(cdf.a == 0, 1.0).otherwise(2.0))` fail with this error:
   
   ```
   grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
           status = StatusCode.UNKNOWN
           details = "Invalid arguments for function when."
           debug_error_string = "{"created":"@1670381875.743425000","description":"Error received from peer ipv6:[::1]:15002","file":"src/core/lib/surface/call.cc","file_line":1064,"grpc_message":"Invalid arguments for function when.","grpc_status":2}"
   ```
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #38956: [DO_NOT_MERGE] Implement Column.{when, otherwise} and Function when with UnresolvedFunction

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #38956:
URL: https://github.com/apache/spark/pull/38956#discussion_r1041717834


##########
python/pyspark/sql/connect/column.py:
##########
@@ -129,6 +140,53 @@ def name(self) -> str:
         ...
 
 
+class CaseWhen(Expression):
+    def __init__(
+        self, branches: Sequence[Tuple[Expression, Expression]], else_value: Optional[Expression]
+    ):
+
+        assert isinstance(branches, list)
+        for branch in branches:
+            assert (
+                isinstance(branch, tuple)
+                and len(branch) == 2
+                and all(isinstance(expr, Expression) for expr in branch)
+            )
+        self._branches = branches
+
+        if else_value is not None:
+            assert isinstance(else_value, Expression)
+
+        self._else_value = else_value
+
+    def to_plan(self, session: "SparkConnectClient") -> "proto.Expression":
+        # fun = proto.Expression()
+        # fun.unresolved_function.function_name = "when"
+        # for condition, value in self._branches:
+        #     fun.unresolved_function.arguments.extend(condition.to_plan(session))
+        #     fun.unresolved_function.arguments.extend(value.to_plan(session))
+        # if self._else_value is not None:
+        #     fun.unresolved_function.arguments.extend(self._else_value.to_plan(session))
+        # return fun
+
+        args: Sequence[Expression] = []
+        for condition, value in self._branches:
+            args.append(condition)
+            args.append(value)
+
+        if self._else_value is not None:
+            args.append(self._else_value)
+
+        unresolved_function = UnresolvedFunction(name="when", args=args)
+
+        return unresolved_function.to_plan(session)

Review Comment:
   let me add another constructor



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] grundprinzip commented on a diff in pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
grundprinzip commented on code in PR #38956:
URL: https://github.com/apache/spark/pull/38956#discussion_r1047174786


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala:
##########
@@ -161,6 +161,13 @@ case class CaseWhen(
     elseValue: Option[Expression] = None)
   extends ComplexTypeMergingExpression with ConditionalExpression {
 
+  def this(branches: Seq[Expression]) = this(

Review Comment:
   Does this need an additional test?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #38956: [DO_NOT_MERGE] Implement Column.{when, otherwise} and Function when with UnresolvedFunction

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #38956:
URL: https://github.com/apache/spark/pull/38956#discussion_r1041716422


##########
python/pyspark/sql/connect/column.py:
##########
@@ -129,6 +140,53 @@ def name(self) -> str:
         ...
 
 
+class CaseWhen(Expression):
+    def __init__(
+        self, branches: Sequence[Tuple[Expression, Expression]], else_value: Optional[Expression]
+    ):
+
+        assert isinstance(branches, list)
+        for branch in branches:
+            assert (
+                isinstance(branch, tuple)
+                and len(branch) == 2
+                and all(isinstance(expr, Expression) for expr in branch)
+            )
+        self._branches = branches
+
+        if else_value is not None:
+            assert isinstance(else_value, Expression)
+
+        self._else_value = else_value
+
+    def to_plan(self, session: "SparkConnectClient") -> "proto.Expression":
+        # fun = proto.Expression()
+        # fun.unresolved_function.function_name = "when"
+        # for condition, value in self._branches:
+        #     fun.unresolved_function.arguments.extend(condition.to_plan(session))
+        #     fun.unresolved_function.arguments.extend(value.to_plan(session))
+        # if self._else_value is not None:
+        #     fun.unresolved_function.arguments.extend(self._else_value.to_plan(session))
+        # return fun
+
+        args: Sequence[Expression] = []
+        for condition, value in self._branches:
+            args.append(condition)
+            args.append(value)
+
+        if self._else_value is not None:
+            args.append(self._else_value)
+
+        unresolved_function = UnresolvedFunction(name="when", args=args)
+
+        return unresolved_function.to_plan(session)

Review Comment:
   @hvanhovell  @amaliujia 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #38956: [DO_NOT_MERGE] Implement Column.{when, otherwise} and Function when with UnresolvedFunction

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #38956:
URL: https://github.com/apache/spark/pull/38956#issuecomment-1346281724

   @hvanhovell I add the new constructor, but it kept failing in the invocation of constructor. I manually check the constructor and it seems fine. Do you have any ideas about it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
hvanhovell commented on code in PR #38956:
URL: https://github.com/apache/spark/pull/38956#discussion_r1047952066


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala:
##########
@@ -161,6 +161,13 @@ case class CaseWhen(
     elseValue: Option[Expression] = None)
   extends ComplexTypeMergingExpression with ConditionalExpression {
 
+  def this(branches: Seq[Expression]) = this(

Review Comment:
   Can we use the `CaseWhen.createFromParser`? You will need to register an `ExpressionBuilder` for this. The upside of this is that it is less messy in the constructor, and that there is ample testing for this from the parser.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #38956:
URL: https://github.com/apache/spark/pull/38956#discussion_r1047962638


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala:
##########
@@ -161,6 +161,13 @@ case class CaseWhen(
     elseValue: Option[Expression] = None)
   extends ComplexTypeMergingExpression with ConditionalExpression {
 
+  def this(branches: Seq[Expression]) = this(

Review Comment:
   oh, I mean we still use an unresolved function, and we can call `CaseWhen.createFromParser` in connect planner



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #38956: [DO_NOT_MERGE] Implement Column.{when, otherwise} and Function when with UnresolvedFunction

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #38956:
URL: https://github.com/apache/spark/pull/38956#discussion_r1041722230


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala:
##########
@@ -161,6 +161,15 @@ case class CaseWhen(
     elseValue: Option[Expression] = None)
   extends ComplexTypeMergingExpression with ConditionalExpression {
 
+  def this(branches: Seq[Expression]) = this(

Review Comment:
   still fail with:
   ```
   grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
           status = StatusCode.UNKNOWN
           details = "[FAILED_FUNCTION_CALL] Failed preparing of the function `when` for call. Please, double check function's arguments."
           debug_error_string = "{"created":"@1670384303.828721000","description":"Error received from peer ipv6:[::1]:15002","file":"src/core/lib/surface/call.cc","file_line":1064,"grpc_message":"[FAILED_FUNCTION_CALL] Failed preparing of the function `when` for call. Please, double check function's arguments.","grpc_status":2}"
   >
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #38956: [DO_NOT_MERGE] Implement Column.{when, otherwise} and Function when with UnresolvedFunction

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #38956:
URL: https://github.com/apache/spark/pull/38956#issuecomment-1340714677

   it keep failing in the invocation of constructor, I print the constructor as well as expressions in the error message, and they looks fine
   
   ```
   grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
           status = StatusCode.UNKNOWN
           details = "[FAILED_FUNCTION_CALL] Failed preparing of the function `when_1, constructor:public org_apache_spark_sql_catalyst_expressions_CaseWhen(scala_collection_Seq), exprs:class org_apache_spark_sql_catalyst_expressions_EqualTo:(cast(a#6 as bigint) = 0) , class org_apache_spark_sql_catalyst_expressions_Literal:1_0 , class org_apache_spark_sql_catalyst_expressions_Literal:2_0, error:null` for call. Please, double check function's arguments."
           debug_error_string = "{"created":"@1670408082.289473000","description":"Error received from peer ipv6:[::1]:15002","file":"src/core/lib/surface/call.cc","file_line":1064,"grpc_message":"[FAILED_FUNCTION_CALL] Failed preparing of the function `when_1, constructor:public org_apache_spark_sql_catalyst_expressions_CaseWhen(scala_collection_Seq), exprs:class org_apache_spark_sql_catalyst_expressions_EqualTo:(cast(a#6 as bigint) = 0) , class org_apache_spark_sql_catalyst_expressions_Literal:1_0 , class org_apache_spark_sql_catalyst_expressions_Literal:2_0, error:null` for call. Please, double check function's arguments.","grpc_status":2}"
   ```
   
   I also check in the shell:
   ```
   scala> import org.apache.spark.sql.catalyst.expressions._
   import org.apache.spark.sql.catalyst.expressions._
   
   scala> val casewhen = CaseWhen(Seq.empty, None)
   casewhen: org.apache.spark.sql.catalyst.expressions.CaseWhen = CASE END
   
   scala> val varargCtor = casewhen.getClass.getConstructors().find(_.getParameterTypes.toSeq == Seq(classOf[Seq[_]]))
   varargCtor: Option[java.lang.reflect.Constructor[_]] = Some(public org.apache.spark.sql.catalyst.expressions.CaseWhen(scala.collection.Seq))
   
   scala> val expressions = Seq((col("a").cast("long") === 0).expr, lit(1.0).expr, lit(2.0).expr)
   expressions: Seq[org.apache.spark.sql.catalyst.expressions.Expression] = List((cast('a as bigint) = 0), 1.0, 2.0)
   
   scala> varargCtor.get.newInstance(expressions)
   res10: Any = CASE WHEN (cast('a as bigint) = 0) THEN 1.0 ELSE 2.0 END
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #38956:
URL: https://github.com/apache/spark/pull/38956#issuecomment-1352461291

   cc @HyukjinKwon @cloud-fan @amaliujia 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
hvanhovell commented on code in PR #38956:
URL: https://github.com/apache/spark/pull/38956#discussion_r1047961655


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala:
##########
@@ -161,6 +161,13 @@ case class CaseWhen(
     elseValue: Option[Expression] = None)
   extends ComplexTypeMergingExpression with ConditionalExpression {
 
+  def this(branches: Seq[Expression]) = this(

Review Comment:
   I think it is better to keep the connect planner as generic as possible. If we can use an unresolved function then it is better.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #38956:
URL: https://github.com/apache/spark/pull/38956#discussion_r1047967339


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala:
##########
@@ -386,10 +386,9 @@ object CaseWhen {
    *                 position are branch values.
    */
   def createFromParser(branches: Seq[Expression]): CaseWhen = {
-    val cases = branches.grouped(2).flatMap {
-      case cond :: value :: Nil => Some((cond, value))
-      case value :: Nil => None
-    }.toArray.toSeq  // force materialization to make the seq serializable
+    val cases = branches.grouped(2).flatMap { g =>

Review Comment:
   `createFromParser` was never used
   
   make this change, since it causes `MatchError`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #38956:
URL: https://github.com/apache/spark/pull/38956#issuecomment-1348240765

   also cc @HyukjinKwon @cloud-fan @grundprinzip @amaliujia @xinrong-meng 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #38956:
URL: https://github.com/apache/spark/pull/38956#discussion_r1047955950


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala:
##########
@@ -161,6 +161,13 @@ case class CaseWhen(
     elseValue: Option[Expression] = None)
   extends ComplexTypeMergingExpression with ConditionalExpression {
 
+  def this(branches: Seq[Expression]) = this(

Review Comment:
   @hvanhovell  what about we create CaseWhen in the [connect planner ](https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala#L535-L554), then we don't need to change sql side at all.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng closed pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
zhengruifeng closed pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`
URL: https://github.com/apache/spark/pull/38956


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #38956:
URL: https://github.com/apache/spark/pull/38956#discussion_r1047949470


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala:
##########
@@ -161,6 +161,13 @@ case class CaseWhen(
     elseValue: Option[Expression] = None)
   extends ComplexTypeMergingExpression with ConditionalExpression {
 
+  def this(branches: Seq[Expression]) = this(

Review Comment:
   let me see



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #38956:
URL: https://github.com/apache/spark/pull/38956#issuecomment-1354186229

   merged into master, thanks for the reviews


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on pull request #38956: [DO_NOT_MERGE] Implement Column.{when, otherwise} and Function when with UnresolvedFunction

Posted by GitBox <gi...@apache.org>.
hvanhovell commented on PR #38956:
URL: https://github.com/apache/spark/pull/38956#issuecomment-1346930086

   @zhengruifeng let's check how the constructor resolution works in the FunctionRegistry. Alternatively we can use `ExpressionBuilder` for `when`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #38956: [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function `when` with `UnresolvedFunction`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #38956:
URL: https://github.com/apache/spark/pull/38956#issuecomment-1348219137

   Fixed, the root cause is the previous constructor copied from https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L388-L395  failed due to `MatchError` ...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org