You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/07 12:45:57 UTC

[GitHub] [spark] steven-aerts opened a new pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

steven-aerts opened a new pull request #35426:
URL: https://github.com/apache/spark/pull/35426


   The check in array_sort to check for orderable items can be removed.  As
   it is possible to pass a lambda to make the items orderable:
   
      Seq((Array[Map[String, Int]](Map("a" -> 1), Map()), "x")).toDF("a", "b")
        .selectExpr("array_sort(a, (x,y) -> cardinality(x) - cardinality(y))")
   
   And for the cases where it is relevant the check is never hit as the
   LessThan operator has a similar check which is evaluated first:
   
   > due to data type mismatch: LessThan does not support ordering on type map<string,int>
   
   ### What changes were proposed in this pull request?
   
   Remove a check which prevents you to sort an array with non-orderable items, with a lambda which makes everything orderable.
   
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   
   Extra test was added.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] steven-aerts commented on a change in pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
steven-aerts commented on a change in pull request #35426:
URL: https://github.com/apache/spark/pull/35426#discussion_r801304946



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>

Review comment:
       This is not working, as a map is (currently) not orderable.
   With and without this patch this will fail with the following error:
   ```
   > Seq((Array[Map[String, Int]](Map("a" -> 1), Map()), "x")).toDF("a", "b").selectExpr("array_sort(a)")
   org.apache.spark.sql.AnalysisException: cannot resolve '(namedlambdavariable() < namedlambdavariable())' due to data type mismatch: LessThan does not support ordering on type map<string,int>; line 1 pos 0;
   ```
   Because this error message is emitted before the code flow can reach the logic which is removed in this patch.
   So as far as I can see, the removed logic can currently only be hit in cases where it should work, hence the removal.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #35426:
URL: https://github.com/apache/spark/pull/35426#discussion_r801184125



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>

Review comment:
       Hm, what about the case like: `array_sort((x, y) -> x == y)` where `x` and `y` are map?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] steven-aerts commented on pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
steven-aerts commented on pull request #35426:
URL: https://github.com/apache/spark/pull/35426#issuecomment-1033844679


   Hi @cloud-fan,
   when `array_sort` does not have the comparator, it falls back to the `ArraySort.defaultComparator` which in itself is an expression containing a `LessThan` which does the order ability check, and generate a message with or without this patch.  See the [above comment](https://github.com/apache/spark/pull/35426#discussion_r801304946) for more details.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] steven-aerts commented on a change in pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
steven-aerts commented on a change in pull request #35426:
URL: https://github.com/apache/spark/pull/35426#discussion_r800750011



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
+          case ArrayType(_, _) =>

Review comment:
       Hi @srowen ,
   currently this error message is never shown in the cases where it is relevant.  As the error message on `LessThan` (see above) takes precedence.
   
   I did some experiments but never got it right.  I ran into two problems:
   * `checkInputDataTypes` is always called before the `checkInputDataTypes` on the `array_sort` or the lambda function.
   * extending `LambdaFunction` is not possible because it is a `case class` and the `higherOrderFunctions` can only work with `LambdaFunction` objects.
   
   If somebody has an idea on how to show the error before `checkInputDataTypes` of `LessThan` is hit I am happy to try it out.
   
   Steven
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #35426:
URL: https://github.com/apache/spark/pull/35426


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #35426:
URL: https://github.com/apache/spark/pull/35426#discussion_r801184125



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>

Review comment:
       Hm, what about the case like: `array_sort((x, y) => x == y)` where `x` and `y` are map?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] steven-aerts commented on pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
steven-aerts commented on pull request #35426:
URL: https://github.com/apache/spark/pull/35426#issuecomment-1040361745


   Rebased, thanks @srowen for looking into this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #35426:
URL: https://github.com/apache/spark/pull/35426


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #35426:
URL: https://github.com/apache/spark/pull/35426#issuecomment-1040584503


   thanks, merging to master!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #35426:
URL: https://github.com/apache/spark/pull/35426#issuecomment-1031569175


   Maybe @cloud-fan or @HyukjinKwon 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #35426:
URL: https://github.com/apache/spark/pull/35426#issuecomment-1033809173


   This makes sense to me. `array_sort` takes a comparator parameter, and this comparator expression should do the work to check order-ability. But `array_sort` still need to check order-ability if comparator parameter is not given.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] steven-aerts commented on a change in pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
steven-aerts commented on a change in pull request #35426:
URL: https://github.com/apache/spark/pull/35426#discussion_r804596191



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>

Review comment:
       Added extra comment at defaultComparator to explain.

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
+          case ArrayType(_, _) =>

Review comment:
       Added extra comment at defaultComparator to explain.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #35426:
URL: https://github.com/apache/spark/pull/35426#discussion_r801184125



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>

Review comment:
       Hm, what about the case like: `array_sort((x, y) -> x > y)` where `x` and `y` are map?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] steven-aerts commented on a change in pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
steven-aerts commented on a change in pull request #35426:
URL: https://github.com/apache/spark/pull/35426#discussion_r802548282



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>

Review comment:
       Your interpretation is correct.  `LessThan` fails for lambda's like this when x and y are structs.
   `array_sort` does not accept `x > y` because it is not an integer function, but that is a detail here I think.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] steven-aerts commented on pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
steven-aerts commented on pull request #35426:
URL: https://github.com/apache/spark/pull/35426#issuecomment-1040361745


   Rebased, thanks @srowen for looking into this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] viirya commented on a change in pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #35426:
URL: https://github.com/apache/spark/pull/35426#discussion_r801365062



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>

Review comment:
       Doesn't the above cast fail because there is `LessThan` used in the default comparator? @HyukjinKwon's case provides a lambda function. As you remove the check, the lambda will be executed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #35426:
URL: https://github.com/apache/spark/pull/35426#issuecomment-1034504598


   OK makes sense to me. Can you add some code comments to explain it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #35426:
URL: https://github.com/apache/spark/pull/35426#issuecomment-1040584503


   thanks, merging to master!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] viirya commented on a change in pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #35426:
URL: https://github.com/apache/spark/pull/35426#discussion_r801365062



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>

Review comment:
       Doesn't the above cast fail because there is `LessThan` used in the default comparator? @HyukjinKwon's case provides a lambda function. As you remove the check, the lambda will be executed to compare the maps.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #35426:
URL: https://github.com/apache/spark/pull/35426#issuecomment-1031618297


   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #35426:
URL: https://github.com/apache/spark/pull/35426#issuecomment-1040341786


   Hm, this is failing, but I don't think it can be related: ` SPARK-37555: spark-sql should pass last unclosed comment to backend *** FAILED *** (2 minutes, 10 seconds)` I tried retriggering the job but not sure that's sufficient. Eh, @steven-aerts can you rebase and push or something to kick it from your end? not entirely sure how this works


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #35426:
URL: https://github.com/apache/spark/pull/35426#issuecomment-1040341786


   Hm, this is failing, but I don't think it can be related: ` SPARK-37555: spark-sql should pass last unclosed comment to backend *** FAILED *** (2 minutes, 10 seconds)` I tried retriggering the job but not sure that's sufficient. Eh, @steven-aerts can you rebase and push or something to kick it from your end? not entirely sure how this works


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srowen commented on a change in pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #35426:
URL: https://github.com/apache/spark/pull/35426#discussion_r800736584



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
+          case ArrayType(_, _) =>

Review comment:
       I don't know this code well, but is it possible to check it's orderable only if there is no lambda? or check if the lambda's result is orderable if present instead? It does make sense to not fail this case but just wondering if we can preserve a better error message




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] steven-aerts commented on a change in pull request #35426: [SPARK-38130][SQL] Remove array_sort orderable entries check

Posted by GitBox <gi...@apache.org>.
steven-aerts commented on a change in pull request #35426:
URL: https://github.com/apache/spark/pull/35426#discussion_r802556994



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
##########
@@ -388,18 +388,13 @@ case class ArraySort(
     checkArgumentDataTypes() match {
       case TypeCheckResult.TypeCheckSuccess =>
         argument.dataType match {
-          case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>

Review comment:
       Your interpretation is correct.  `LessThan` fails for lambda's like this when x and y are structs.
   `array_sort` does not accept `x > y` because it is not an integer function, but that is a detail here I think.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org