Posted to commits@spark.apache.org by gu...@apache.org on 2020/03/06 01:38:26 UTC

[spark] branch branch-3.0 updated: [SPARK-31036][SQL] Use stringArgs in Expression.toString to respect hidden parameters

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 5220a1c  [SPARK-31036][SQL] Use stringArgs in Expression.toString to respect hidden parameters
5220a1c is described below

commit 5220a1c8756a8ddcf015001c7ab5d8fa02ab8692
Author: HyukjinKwon <gu...@apache.org>
AuthorDate: Fri Mar 6 10:33:20 2020 +0900

    [SPARK-31036][SQL] Use stringArgs in Expression.toString to respect hidden parameters
    
    ### What changes were proposed in this pull request?
    
    This PR proposes to respect hidden parameters by using `stringArgs` in `Expression.toString`. With this change, the string representation is rendered properly in cases such as `NonSQLExpression`.
    
    ### Why are the changes needed?
    
    To respect "hidden" arguments in the string representation.
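
    As a rough sketch (hypothetical mini-classes, not Spark's real `Expression` hierarchy), the difference is which iterator feeds the string form: `productIterator` exposes every constructor field, while an overridable `stringArgs` lets a subclass drop hidden ones:

    ```scala
    // Hypothetical, simplified mirror of the pattern in this PR: the
    // string form is assembled from stringArgs instead of productIterator,
    // so subclasses can hide internal parameters.
    trait FakeExpr extends Product {
      // Subclasses override this to drop hidden constructor fields.
      def stringArgs: Iterator[Any] = productIterator

      // After this PR, flatArguments iterates stringArgs (it previously
      // iterated productIterator directly).
      def flatArguments: Iterator[Any] = stringArgs.flatMap {
        case t: Iterable[_] => t
        case single => single :: Nil
      }

      override def toString: String =
        s"${getClass.getSimpleName.toLowerCase}(${flatArguments.mkString(", ")})"
    }

    // No override: the internal flag leaks into toString.
    case class Leaky(argument: String, flag: Boolean) extends FakeExpr

    // Override: only the two user-visible arguments are rendered.
    case class Exists(argument: String, function: String, flag: Boolean) extends FakeExpr {
      override def stringArgs: Iterator[Any] = super.stringArgs.take(2)
    }
    ```

    Here `Leaky("id", flag = false)` renders as `leaky(id, false)`, while `Exists("array(id)", "f", flag = false)` renders as `exists(array(id), f)`, mirroring the before/after outputs in the examples below. (A case class does not generate its synthetic `toString` when it inherits a concrete one from a trait, which is why the trait's override takes effect.)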
    
    ### Does this PR introduce any user-facing change?
    
    Yes. For example, on top of https://github.com/apache/spark/pull/27657,
    
    ```scala
    val identify = udf((input: Seq[Int]) => input)
    spark.range(10).select(identify(array("id"))).show()
    ```
    
    shows the hidden parameter `useStringTypeWhenEmpty` (the trailing `false`):
    
    ```
    +---------------------+
    |UDF(array(id, false))|
    +---------------------+
    |                  [0]|
    |                  [1]|
    ...
    ```
    
    whereas:
    
    ```scala
    spark.range(10).select(array("id")).show()
    ```
    
    ```
    +---------+
    |array(id)|
    +---------+
    |      [0]|
    |      [1]|
    ...
    ```
    
    ### How was this patch tested?
    
    Manually tested as below:
    
    ```scala
    val identify = udf((input: Boolean) => input)
    spark.range(10).select(identify(exists(array(col("id")), _ % 2 === 0))).show()
    ```
    
    Before:
    
    ```
    +-------------------------------------------------------------------------------------+
    |UDF(exists(array(id), lambdafunction(((lambda 'x % 2) = 0), lambda 'x, false), true))|
    +-------------------------------------------------------------------------------------+
    |                                                                                 true|
    |                                                                                false|
    |                                                                                 true|
    ...
    ```
    
    After:
    
    ```
    +-------------------------------------------------------------------------------+
    |UDF(exists(array(id), lambdafunction(((lambda 'x % 2) = 0), lambda 'x, false)))|
    +-------------------------------------------------------------------------------+
    |                                                                           true|
    |                                                                          false|
    |                                                                           true|
    ...
    ```
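
    The `ArrayExists` hunk in the patch works by truncating the argument iterator just before the hidden flag. As a standalone sketch with hypothetical stand-in values (the real `stringArgs` would yield the argument, the function, and `followThreeValuedLogic`):

    ```scala
    // Hypothetical stand-in for ArrayExists's arguments; only the last
    // element plays the role of the hidden followThreeValuedLogic flag.
    val allArgs = List("array(id)", "lambdafunction(...)", false)

    // Before: all three arguments reach the string form.
    val before = allArgs.iterator.mkString("exists(", ", ", ")")

    // After: take(2) truncates the iterator ahead of the hidden flag.
    val after = allArgs.iterator.take(2).mkString("exists(", ", ", ")")

    println(before) // exists(array(id), lambdafunction(...), false)
    println(after)  // exists(array(id), lambdafunction(...))
    ```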
    
    Closes #27788 from HyukjinKwon/arguments-str-repr.
    
    Authored-by: HyukjinKwon <gu...@apache.org>
    Signed-off-by: HyukjinKwon <gu...@apache.org>
    (cherry picked from commit fc12165f48b2e1dfe04116ddaa6ff6e8650a18fb)
    Signed-off-by: HyukjinKwon <gu...@apache.org>
---
 .../scala/org/apache/spark/sql/catalyst/expressions/Expression.scala    | 2 +-
 .../apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala    | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
index 4632957..1599321 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
@@ -260,7 +260,7 @@ abstract class Expression extends TreeNode[Expression] {
    */
   def prettyName: String = nodeName.toLowerCase(Locale.ROOT)
 
-  protected def flatArguments: Iterator[Any] = productIterator.flatMap {
+  protected def flatArguments: Iterator[Any] = stringArgs.flatMap {
     case t: Iterable[_] => t
     case single => single :: Nil
   }
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
index 9dd4263..e91bd0c 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
@@ -533,6 +533,8 @@ case class ArrayExists(
       SQLConf.get.getConf(SQLConf.LEGACY_ARRAY_EXISTS_FOLLOWS_THREE_VALUED_LOGIC))
   }
 
+  override def stringArgs: Iterator[Any] = super.stringArgs.take(2)
+
   override def nullable: Boolean =
     if (followThreeValuedLogic) {
       super.nullable || function.nullable

