You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/12/06 10:13:34 UTC

[GitHub] [spark] zhengruifeng opened a new pull request, #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

zhengruifeng opened a new pull request, #38935:
URL: https://github.com/apache/spark/pull/38935

   ### What changes were proposed in this pull request?
   Implement `Column.{when, otherwise}` and Function `when`
   
   
   ### Why are the changes needed?
   For API coverage
   
   
   ### Does this PR introduce _any_ user-facing change?
   yes
   
   
   ### How was this patch tested?
   added UT


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amaliujia commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
amaliujia commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1041613292


##########
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -173,4 +174,18 @@ message Expression {
     // (Optional) Alias metadata expressed as a JSON map.
     optional string metadata = 3;
   }
+
+  // Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
+  message CaseWhen {

Review Comment:
   If we use this way then have to document it clear for even/odd assumption on the server side.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
hvanhovell commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1041616815


##########
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -173,4 +174,18 @@ message Expression {
     // (Optional) Alias metadata expressed as a JSON map.
     optional string metadata = 3;
   }
+
+  // Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
+  message CaseWhen {

Review Comment:
   Yeah we do. I am just trying to see if we can keep the number of expressions in the proto to a bare minimum. In this case you can get away with it because it sort of makes sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
hvanhovell commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1041599340


##########
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -173,4 +174,18 @@ message Expression {
     // (Optional) Alias metadata expressed as a JSON map.
     optional string metadata = 3;
   }
+
+  // Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
+  message CaseWhen {

Review Comment:
   Why not make this an unresolved_function? All it takes is that we add a constructor to CaseWhen.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amaliujia commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
amaliujia commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1041391719


##########
connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##########
@@ -587,6 +588,14 @@ class SparkConnectPlanner(session: SparkSession) {
       DataTypeProtoConverter.toCatalystType(cast.getCastToType))
   }
 
+  private def transformCaseWhen(casewhen: proto.Expression.CaseWhen): Expression = {
+    CaseWhen(
+      branches = casewhen.getBranchesList.asScala
+        .map(b => (transformExpression(b.getCondition), transformExpression(b.getValue))),

Review Comment:
   [error] /home/runner/work/spark/spark/connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:594:13: type mismatch;
   [error]  found   : scala.collection.mutable.Buffer[(org.apache.spark.sql.catalyst.expressions.Expression, org.apache.spark.sql.catalyst.expressions.Expression)]
   [error]  required: Seq[(org.apache.spark.sql.catalyst.expressions.Expression, org.apache.spark.sql.catalyst.expressions.Expression)]
   [error]         .map(b => (transformExpression(b.getCondition), transformExpression(b.getValue))),



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amaliujia commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
amaliujia commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1041392021


##########
connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##########
@@ -587,6 +588,14 @@ class SparkConnectPlanner(session: SparkSession) {
       DataTypeProtoConverter.toCatalystType(cast.getCastToType))
   }
 
+  private def transformCaseWhen(casewhen: proto.Expression.CaseWhen): Expression = {
+    CaseWhen(
+      branches = casewhen.getBranchesList.asScala
+        .map(b => (transformExpression(b.getCondition), transformExpression(b.getValue))),

Review Comment:
   guessing you need a `.toSeq`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amaliujia commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
amaliujia commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1041608123


##########
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -173,4 +174,18 @@ message Expression {
     // (Optional) Alias metadata expressed as a JSON map.
     optional string metadata = 3;
   }
+
+  // Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
+  message CaseWhen {

Review Comment:
   hmm for `unresolved_function`, how to deal with the optional `else`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1041682838


##########
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -173,4 +174,18 @@ message Expression {
     // (Optional) Alias metadata expressed as a JSON map.
     optional string metadata = 3;
   }
+
+  // Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
+  message CaseWhen {

Review Comment:
   got it, let me have a try, thanks



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1041674284


##########
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -173,4 +174,18 @@ message Expression {
     // (Optional) Alias metadata expressed as a JSON map.
     optional string metadata = 3;
   }
+
+  // Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
+  message CaseWhen {

Review Comment:
   I am not sure whether unresolved_function  `when`  can support `otherwise`, let me take a deeper look



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1043123962


##########
connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##########
@@ -587,6 +588,14 @@ class SparkConnectPlanner(session: SparkSession) {
       DataTypeProtoConverter.toCatalystType(cast.getCastToType))
   }
 
+  private def transformCaseWhen(casewhen: proto.Expression.CaseWhen): Expression = {
+    CaseWhen(
+      branches = casewhen.getBranchesList.asScala
+        .map(b => (transformExpression(b.getCondition), transformExpression(b.getValue))),

Review Comment:
   thanks



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
hvanhovell commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1041599792


##########
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -173,4 +174,18 @@ message Expression {
     // (Optional) Alias metadata expressed as a JSON map.
     optional string metadata = 3;
   }
+
+  // Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
+  message CaseWhen {

Review Comment:
   Alternatively you can rewrite this into a series of ifs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
hvanhovell commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1041609765


##########
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -173,4 +174,18 @@ message Expression {
     // (Optional) Alias metadata expressed as a JSON map.
     optional string metadata = 3;
   }
+
+  // Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
+  message CaseWhen {

Review Comment:
   If the the number of expressions is uneven, the last expression is the else expression. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
hvanhovell commented on code in PR #38935:
URL: https://github.com/apache/spark/pull/38935#discussion_r1041675179


##########
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -173,4 +174,18 @@ message Expression {
     // (Optional) Alias metadata expressed as a JSON map.
     optional string metadata = 3;
   }
+
+  // Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
+  message CaseWhen {

Review Comment:
   Otherwise is the `elseValue` in `CaseWhen`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng closed pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
zhengruifeng closed pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`
URL: https://github.com/apache/spark/pull/38935


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #38935: [SPARK-41319][CONNECT][PYTHON] Implement `Column.{when, otherwise}` and Function `when`

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #38935:
URL: https://github.com/apache/spark/pull/38935#issuecomment-1350224852

   close this one in favor of https://github.com/apache/spark/pull/38956


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org