
Posted to reviews@spark.apache.org by mgaido91 <gi...@git.apache.org> on 2018/08/03 15:21:14 UTC

[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

GitHub user mgaido91 opened a pull request:

    https://github.com/apache/spark/pull/21986

    [SPARK-23937][SQL] Add map_filter SQL function

    ## What changes were proposed in this pull request?
    
    This PR adds the higher-order function `map_filter`, which filters the entries of a map and returns a new map containing only the entries that satisfy the filter function.
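    
    As a rough illustration of the intended semantics only (plain Scala collections, not the Catalyst `MapFilter` expression this PR adds), the SQL example `map_filter(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v)` behaves like the sketch below. The object name `MapFilterSemantics` and the helper `mapFilter` are made up for this example.
    
    ```scala
    object MapFilterSemantics {
      // Keep only the entries for which the predicate holds, as map_filter does.
      def mapFilter[K, V](m: Map[K, V])(p: (K, V) => Boolean): Map[K, V] =
        m.filter { case (k, v) => p(k, v) }
    
      def main(args: Array[String]): Unit = {
        val input = Map(1 -> 0, 2 -> 2, 3 -> -1)
        // The entry 2 -> 2 is dropped because 2 > 2 is false.
        val result = mapFilter(input)((k, v) => k > v)
        // result == Map(1 -> 0, 3 -> -1)
        println(result)
      }
    }
    ```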
    
    ## How was this patch tested?
    
    Added unit tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-23937

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21986.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21986
    
----
commit 3f88e2a927c22f4fc509b8ca96027ef381f7fe84
Author: Marco Gaido <ma...@...>
Date:   2018-08-03T15:16:11Z

    [SPARK-23937][SQL] Add map_filter SQL function

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by ueshin <gi...@git.apache.org>.

Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207965203
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -205,29 +230,82 @@ case class ArrayTransform(
         (elementVar, indexVar)
       }
     
    -  override def eval(input: InternalRow): Any = {
    -    val arr = this.input.eval(input).asInstanceOf[ArrayData]
    -    if (arr == null) {
    -      null
    -    } else {
    -      val f = functionForEval
    -      val result = new GenericArrayData(new Array[Any](arr.numElements))
    -      var i = 0
    -      while (i < arr.numElements) {
    -        elementVar.value.set(arr.get(i, elementVar.dataType))
    -        if (indexVar.isDefined) {
    -          indexVar.get.value.set(i)
    -        }
    -        result.update(i, f.eval(input))
    -        i += 1
    +  override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
    +    val arr = inputValue.asInstanceOf[ArrayData]
    +    val f = functionForEval
    +    val result = new GenericArrayData(new Array[Any](arr.numElements))
    +    var i = 0
    +    while (i < arr.numElements) {
    +      elementVar.value.set(arr.get(i, elementVar.dataType))
    +      if (indexVar.isDefined) {
    +        indexVar.get.value.set(i)
           }
    -      result
    +      result.update(i, f.eval(inputRow))
    +      i += 1
         }
    +    result
       }
     
       override def prettyName: String = "transform"
     }
     
    +/**
    + * Filters entries in a map using the provided function.
    + */
    +@ExpressionDescription(
    +usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
    +examples = """
    +    Examples:
    +      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
    +       [1 -> 0, 3 -> -1]
    +  """,
    +since = "2.4.0")
    +case class MapFilter(
    +    input: Expression,
    +    function: Expression)
    +  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
    +
    +  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
    +    case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
    +    case _ =>
    +      val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
    +      (kType, vType, vContainsNull)
    +  }
    --- End diff --
    
    Sorry, I meant something like:
    
    ```scala
    object MapBasedUnaryHigherOrderFunction {
    
      def keyValueArgumentType(dt: DataType): (DataType, DataType, Boolean) = {
        dt match {
          case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
          case _ =>
            val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
            (kType, vType, vContainsNull)
        }
      }
    }
    
    ...
    
    case class MapFilter( ... ) {
      ...
      @transient val (keyType, valueType, valueContainsNull) =
        MapBasedUnaryHigherOrderFunction.keyValueArgumentType(input.dataType)
      ...
    }
    ```



---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by ueshin <gi...@git.apache.org>.

Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207703208
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -210,3 +221,66 @@ case class ArrayTransform(
     
       override def prettyName: String = "transform"
     }
    +
    +/**
    + * Filters entries in a map using the provided function.
    + */
    +@ExpressionDescription(
    +usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
    +examples = """
    +    Examples:
    +      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
    +       [1 -> 0, 3 -> -1]
    +  """,
    +since = "2.4.0")
    +case class MapFilter(
    +    input: Expression,
    +    function: Expression)
    +  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
    +
    +  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
    +    case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
    +    case _ =>
    +      val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
    +      (kType, vType, vContainsNull)
    +  }
    +
    +  @transient lazy val (keyVar, valueVar) = {
    +    val args = function.asInstanceOf[LambdaFunction].arguments
    +    (args.head.asInstanceOf[NamedLambdaVariable], args.tail.head.asInstanceOf[NamedLambdaVariable])
    +  }
    +
    +  override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapFilter = {
    +    function match {
    +      case LambdaFunction(_, _, _) =>
    +        copy(function = f(function, (keyType, false) :: (valueType, valueContainsNull) :: Nil))
    +    }
    +  }
    +
    +  override def nullable: Boolean = input.nullable
    +
    +  override def eval(input: InternalRow): Any = {
    +    val m = this.input.eval(input).asInstanceOf[MapData]
    +    if (m == null) {
    +      null
    +    } else {
    +      val retKeys = new mutable.ListBuffer[Any]
    +      val retValues = new mutable.ListBuffer[Any]
    --- End diff --
    
    I'm just curious: is `ListBuffer` better than `ArrayBuffer` here? If so, should we rewrite `ArrayFilter` as well?


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94272/testReport)** for PR 21986 at commit [`9bbaa3b`](https://github.com/apache/spark/commit/9bbaa3b18493fe5e77652b7f39bdc5a6732771bb).
     * This patch passes all tests.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207921448
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -205,29 +230,85 @@ case class ArrayTransform(
         (elementVar, indexVar)
       }
     
    -  override def eval(input: InternalRow): Any = {
    -    val arr = this.input.eval(input).asInstanceOf[ArrayData]
    -    if (arr == null) {
    -      null
    -    } else {
    -      val f = functionForEval
    -      val result = new GenericArrayData(new Array[Any](arr.numElements))
    -      var i = 0
    -      while (i < arr.numElements) {
    -        elementVar.value.set(arr.get(i, elementVar.dataType))
    -        if (indexVar.isDefined) {
    -          indexVar.get.value.set(i)
    -        }
    -        result.update(i, f.eval(input))
    -        i += 1
    +  override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
    +    val arr = inputValue.asInstanceOf[ArrayData]
    +    val f = functionForEval
    +    val result = new GenericArrayData(new Array[Any](arr.numElements))
    +    var i = 0
    +    while (i < arr.numElements) {
    +      elementVar.value.set(arr.get(i, elementVar.dataType))
    +      if (indexVar.isDefined) {
    +        indexVar.get.value.set(i)
           }
    -      result
    +      result.update(i, f.eval(inputRow))
    +      i += 1
         }
    +    result
       }
     
       override def prettyName: String = "transform"
     }
     
    +/**
    + * Filters entries in a map using the provided function.
    + */
    +@ExpressionDescription(
    +usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
    +examples = """
    +    Examples:
    +      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
    +       [1 -> 0, 3 -> -1]
    +  """,
    +since = "2.4.0")
    +case class MapFilter(
    +    input: Expression,
    +    function: Expression)
    +  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
    +
    +  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
    +    case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
    +    case _ =>
    +      val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
    +      (kType, vType, vContainsNull)
    +  }
    +
    +  @transient lazy val (keyVar, valueVar) = {
    +    val args = function.asInstanceOf[LambdaFunction].arguments
    +    (args.head.asInstanceOf[NamedLambdaVariable], args.tail.head.asInstanceOf[NamedLambdaVariable])
    +  }
    +
    +  override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapFilter = {
    +    function match {
    +      case LambdaFunction(_, _, _) =>
    --- End diff --
    
    Right, I am removing it, thanks.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94272/testReport)** for PR 21986 at commit [`9bbaa3b`](https://github.com/apache/spark/commit/9bbaa3b18493fe5e77652b7f39bdc5a6732771bb).


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94280 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94280/testReport)** for PR 21986 at commit [`37e221c`](https://github.com/apache/spark/commit/37e221c2eb79ec43e61ed2b4a61f206100eaeb42).


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207816072
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -210,3 +221,66 @@ case class ArrayTransform(
     
       override def prettyName: String = "transform"
     }
    +
    +/**
    + * Filters entries in a map using the provided function.
    + */
    +@ExpressionDescription(
    +usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
    +examples = """
    +    Examples:
    +      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
    +       [1 -> 0, 3 -> -1]
    +  """,
    +since = "2.4.0")
    +case class MapFilter(
    +    input: Expression,
    +    function: Expression)
    +  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
    +
    +  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
    +    case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
    +    case _ =>
    +      val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
    +      (kType, vType, vContainsNull)
    +  }
    +
    +  @transient lazy val (keyVar, valueVar) = {
    +    val args = function.asInstanceOf[LambdaFunction].arguments
    +    (args.head.asInstanceOf[NamedLambdaVariable], args.tail.head.asInstanceOf[NamedLambdaVariable])
    +  }
    +
    +  override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapFilter = {
    +    function match {
    +      case LambdaFunction(_, _, _) =>
    +        copy(function = f(function, (keyType, false) :: (valueType, valueContainsNull) :: Nil))
    +    }
    +  }
    +
    +  override def nullable: Boolean = input.nullable
    +
    +  override def eval(input: InternalRow): Any = {
    +    val m = this.input.eval(input).asInstanceOf[MapData]
    +    if (m == null) {
    +      null
    +    } else {
    +      val retKeys = new mutable.ListBuffer[Any]
    +      val retValues = new mutable.ListBuffer[Any]
    --- End diff --
    
    But I just checked that in `ArrayFilter` you initialized the buffer with the number of incoming elements. So I think there is no difference in terms of performance: since that number is an upper bound on the number of output elements, we can be sure no copy is performed.
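    
    A minimal sketch of that pre-sizing idea, assuming nothing about the actual `ArrayFilter` code: the buffer is created with the input length as its initial capacity, so appending never has to grow the backing array. `PreSizedBuffer` and `filterPreSized` are hypothetical names used only here.
    
    ```scala
    import scala.collection.mutable.ArrayBuffer
    
    object PreSizedBuffer {
      // Filter `xs`, collecting survivors in a buffer whose initial capacity is
      // the input size, which is an upper bound on the output size.
      def filterPreSized(xs: Array[Int])(p: Int => Boolean): Array[Int] = {
        val buf = new ArrayBuffer[Int](xs.length)
        var i = 0
        while (i < xs.length) {
          if (p(xs(i))) buf += xs(i)
          i += 1
        }
        buf.toArray
      }
    }
    ```
    
    The trade-off is that the buffer may reserve more space than the filtered output ends up needing.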


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by ueshin <gi...@git.apache.org>.

Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207702742
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HigherOrderFunctionsSuite.scala ---
    @@ -94,4 +94,53 @@ class HigherOrderFunctionsSuite extends SparkFunSuite with ExpressionEvalHelper
         checkEvaluation(transform(aai, array => Cast(transform(array, plusIndex), StringType)),
           Seq("[1, 3, 5]", null, "[4, 6]"))
       }
    +
    +  test("MapFilter") {
    +    def mapFilter(expr: Expression, f: (Expression, Expression) => Expression): Expression = {
    +      val mt = expr.dataType.asInstanceOf[MapType]
    +      MapFilter(expr, createLambda(mt.keyType, false, mt.valueType, mt.valueContainsNull, f))
    +    }
    +    val mii0 = Literal.create(Map(1 -> 0, 2 -> 10, 3 -> -1),
    +      MapType(IntegerType, IntegerType, valueContainsNull = false))
    +    val mii1 = Literal.create(Map(1 -> null, 2 -> 10, 3 -> null),
    +      MapType(IntegerType, IntegerType, valueContainsNull = true))
    +    val miin = Literal.create(null, MapType(IntegerType, IntegerType, valueContainsNull = false))
    +
    +    val kGreaterThanV: (Expression, Expression) => Expression = (k, v) => k > v
    +
    +    checkEvaluation(mapFilter(mii0, kGreaterThanV), Map(1 -> 0, 3 -> -1))
    +    checkEvaluation(mapFilter(mii1, kGreaterThanV), Map())
    +    checkEvaluation(mapFilter(miin, kGreaterThanV), null)
    +
    +    val valueNull: (Expression, Expression) => Expression = (_, v) => v.isNull
    --- End diff --
    
    nit: `valueIsNull`?


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Build finished. Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94289/
    Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94141 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94141/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes `
      * `trait ArrayBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
      * `trait MapBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
      * `case class MapFilter(`


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1911/
    Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    retest this please


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1904/
    Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94170 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94170/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by ueshin <gi...@git.apache.org>.

Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207923151
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -205,29 +230,82 @@ case class ArrayTransform(
         (elementVar, indexVar)
       }
     
    -  override def eval(input: InternalRow): Any = {
    -    val arr = this.input.eval(input).asInstanceOf[ArrayData]
    -    if (arr == null) {
    -      null
    -    } else {
    -      val f = functionForEval
    -      val result = new GenericArrayData(new Array[Any](arr.numElements))
    -      var i = 0
    -      while (i < arr.numElements) {
    -        elementVar.value.set(arr.get(i, elementVar.dataType))
    -        if (indexVar.isDefined) {
    -          indexVar.get.value.set(i)
    -        }
    -        result.update(i, f.eval(input))
    -        i += 1
    +  override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
    +    val arr = inputValue.asInstanceOf[ArrayData]
    +    val f = functionForEval
    +    val result = new GenericArrayData(new Array[Any](arr.numElements))
    +    var i = 0
    +    while (i < arr.numElements) {
    +      elementVar.value.set(arr.get(i, elementVar.dataType))
    +      if (indexVar.isDefined) {
    +        indexVar.get.value.set(i)
           }
    -      result
    +      result.update(i, f.eval(inputRow))
    +      i += 1
         }
    +    result
       }
     
       override def prettyName: String = "transform"
     }
     
    +/**
    + * Filters entries in a map using the provided function.
    + */
    +@ExpressionDescription(
    +usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
    +examples = """
    +    Examples:
    +      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
    +       [1 -> 0, 3 -> -1]
    +  """,
    +since = "2.4.0")
    +case class MapFilter(
    +    input: Expression,
    +    function: Expression)
    +  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
    +
    +  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
    +    case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
    +    case _ =>
    +      val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
    +      (kType, vType, vContainsNull)
    +  }
    --- End diff --
    
    How about extracting this to `object MapBasedUnaryHigherOrderFunction`, like the array-based one? We'll need it in the other map-based functions.
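    
    To make the suggestion concrete without depending on Spark, here is a self-contained sketch of the pattern. Everything in it is a stand-in: `KeyValueTypeHelper`, the tiny `DataType` hierarchy, and `defaultMapType` are hypothetical and only mimic the shape of the real Catalyst types.
    
    ```scala
    object KeyValueTypeHelper {
      // Hypothetical stand-ins for Spark's DataType hierarchy, for this sketch only.
      sealed trait DataType
      case object IntegerType extends DataType
      case object StringType extends DataType
      case class MapType(keyType: DataType, valueType: DataType, valueContainsNull: Boolean)
        extends DataType
    
      // Stand-in default used when the input is not a map type.
      val defaultMapType: MapType = MapType(StringType, StringType, valueContainsNull = true)
    
      // Shared helper: each map-based function can call this instead of
      // repeating the same pattern match in its own body.
      def keyValueArgumentType(dt: DataType): (DataType, DataType, Boolean) = dt match {
        case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
        case _ =>
          val MapType(kType, vType, vContainsNull) = defaultMapType
          (kType, vType, vContainsNull)
      }
    }
    ```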


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94141 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94141/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207813180
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -210,3 +221,66 @@ case class ArrayTransform(
     
       override def prettyName: String = "transform"
     }
    +
    +/**
    + * Filters entries in a map using the provided function.
    + */
    +@ExpressionDescription(
    +usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
    +examples = """
    +    Examples:
    +      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
    +       [1 -> 0, 3 -> -1]
    +  """,
    +since = "2.4.0")
    +case class MapFilter(
    +    input: Expression,
    +    function: Expression)
    +  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
    +
    +  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
    +    case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
    +    case _ =>
    +      val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
    +      (kType, vType, vContainsNull)
    +  }
    +
    +  @transient lazy val (keyVar, valueVar) = {
    +    val args = function.asInstanceOf[LambdaFunction].arguments
    +    (args.head.asInstanceOf[NamedLambdaVariable], args.tail.head.asInstanceOf[NamedLambdaVariable])
    +  }
    +
    +  override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapFilter = {
    +    function match {
    +      case LambdaFunction(_, _, _) =>
    +        copy(function = f(function, (keyType, false) :: (valueType, valueContainsNull) :: Nil))
    +    }
    +  }
    +
    +  override def nullable: Boolean = input.nullable
    +
    +  override def eval(input: InternalRow): Any = {
    +    val m = this.input.eval(input).asInstanceOf[MapData]
    +    if (m == null) {
    +      null
    +    } else {
    +      val retKeys = new mutable.ListBuffer[Any]
    +      val retValues = new mutable.ListBuffer[Any]
    --- End diff --
    
    I think it is better, since here we are always appending (and then creating an array from the buffer). Appending a value is always O(1) for `ListBuffer`, while for `ArrayBuffer` it is O(1) if the length of the underlying allocated array is bigger than the current number of elements plus one, and O(n) otherwise (since it has to create a new array and copy the old one). As the initial length of the underlying array in `ArrayBuffer` is 16, for outputs with more than 16 elements `ListBuffer` saves at least one copy.
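    
    The append-then-convert approach described above can be sketched as follows. This is a simplified illustration over plain `Int` arrays, not the `MapFilter` implementation; `ListBufferAppend` and `filterEntries` are hypothetical names.
    
    ```scala
    import scala.collection.mutable.ListBuffer
    
    object ListBufferAppend {
      // Append matching keys and values to two ListBuffers (no backing array
      // to resize), then materialize both as arrays at the end.
      def filterEntries(keys: Array[Int], values: Array[Int])(
          p: (Int, Int) => Boolean): (Array[Int], Array[Int]) = {
        require(keys.length == values.length, "keys and values must have the same length")
        val retKeys = new ListBuffer[Int]
        val retValues = new ListBuffer[Int]
        var i = 0
        while (i < keys.length) {
          if (p(keys(i), values(i))) {
            retKeys += keys(i)
            retValues += values(i)
          }
          i += 1
        }
        (retKeys.toArray, retValues.toArray)
      }
    }
    ```
    
    Each append allocates a list cell, so this trades the occasional array copy for per-element allocation.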


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21986


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by ueshin <gi...@git.apache.org>.

Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    LGTM.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94291 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94291/testReport)** for PR 21986 at commit [`9c25ae6`](https://github.com/apache/spark/commit/9c25ae66b7fd0e3d5f11e3e097af32ef72a55e76).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1839/
    Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94199/
    Test FAILed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94170/
    Test FAILed.


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by ueshin <gi...@git.apache.org>.

Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207921840
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -123,7 +125,10 @@ trait HigherOrderFunction extends Expression {
       }
     }
     
    -trait ArrayBasedHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
    +/**
    + * Trait for functions having as input one argument and one function.
    + */
    +trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
    --- End diff --
    
    cc @hvanhovell for the naming?


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94199 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94199/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1849/
    Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94272/
    Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94273 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94273/testReport)** for PR 21986 at commit [`37e221c`](https://github.com/apache/spark/commit/37e221c2eb79ec43e61ed2b4a61f206100eaeb42).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Build finished. Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1842/
    Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by ueshin <gi...@git.apache.org>.

Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Thanks! merging to master.


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r208181612
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -205,29 +230,82 @@ case class ArrayTransform(
         (elementVar, indexVar)
       }
     
    -  override def eval(input: InternalRow): Any = {
    -    val arr = this.input.eval(input).asInstanceOf[ArrayData]
    -    if (arr == null) {
    -      null
    -    } else {
    -      val f = functionForEval
    -      val result = new GenericArrayData(new Array[Any](arr.numElements))
    -      var i = 0
    -      while (i < arr.numElements) {
    -        elementVar.value.set(arr.get(i, elementVar.dataType))
    -        if (indexVar.isDefined) {
    -          indexVar.get.value.set(i)
    -        }
    -        result.update(i, f.eval(input))
    -        i += 1
    +  override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
    +    val arr = inputValue.asInstanceOf[ArrayData]
    +    val f = functionForEval
    +    val result = new GenericArrayData(new Array[Any](arr.numElements))
    +    var i = 0
    +    while (i < arr.numElements) {
    +      elementVar.value.set(arr.get(i, elementVar.dataType))
    +      if (indexVar.isDefined) {
    +        indexVar.get.value.set(i)
           }
    -      result
    +      result.update(i, f.eval(inputRow))
    +      i += 1
         }
    +    result
       }
     
       override def prettyName: String = "transform"
     }
     
    +/**
    + * Filters entries in a map using the provided function.
    + */
    +@ExpressionDescription(
    +usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
    +examples = """
    +    Examples:
    +      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
    +       [1 -> 0, 3 -> -1]
    +  """,
    +since = "2.4.0")
    +case class MapFilter(
    +    input: Expression,
    +    function: Expression)
    +  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
    +
    +  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
    +    case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
    +    case _ =>
    +      val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
    +      (kType, vType, vContainsNull)
    +  }
    --- End diff --
    
    Oh, sorry, I hadn't read your comment carefully; now I see what you meant. Yes, I agree with unifying them in a helper object. I am updating accordingly. Thanks.


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207954320
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -123,7 +125,10 @@ trait HigherOrderFunction extends Expression {
       }
     }
     
    -trait ArrayBasedHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
    +/**
    + * Trait for functions having as input one argument and one function.
    + */
    +trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
    --- End diff --
    
    We use the term `Unary` a lot, and this usage differs from the others. The name should convey a HigherOrderFunction that uses only a single (lambda) function, right? The only thing I can come up with is `SingleHigherOrderFunction`. `Simple` would probably also be fine.
    



---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207808648
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -123,7 +125,10 @@ trait HigherOrderFunction extends Expression {
       }
     }
     
    -trait ArrayBasedHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
    +/**
    + * Trait for functions having as input one argument and one function.
    + */
    +trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
    --- End diff --
    
    I called it `Unary` because it takes one input and one function. Honestly, I can't think of a better name without becoming very verbose. If you have a better suggestion, I am happy to follow it. I will add the `nullSafeEval`, thanks!


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94280/
    Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94363/
    Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by xuanyuanking <gi...@git.apache.org>.

Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207924294
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -205,29 +230,82 @@ case class ArrayTransform(
         (elementVar, indexVar)
       }
     
    -  override def eval(input: InternalRow): Any = {
    -    val arr = this.input.eval(input).asInstanceOf[ArrayData]
    -    if (arr == null) {
    -      null
    -    } else {
    -      val f = functionForEval
    -      val result = new GenericArrayData(new Array[Any](arr.numElements))
    -      var i = 0
    -      while (i < arr.numElements) {
    -        elementVar.value.set(arr.get(i, elementVar.dataType))
    -        if (indexVar.isDefined) {
    -          indexVar.get.value.set(i)
    -        }
    -        result.update(i, f.eval(input))
    -        i += 1
    +  override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
    +    val arr = inputValue.asInstanceOf[ArrayData]
    +    val f = functionForEval
    +    val result = new GenericArrayData(new Array[Any](arr.numElements))
    +    var i = 0
    +    while (i < arr.numElements) {
    +      elementVar.value.set(arr.get(i, elementVar.dataType))
    +      if (indexVar.isDefined) {
    +        indexVar.get.value.set(i)
           }
    -      result
    +      result.update(i, f.eval(inputRow))
    +      i += 1
         }
    +    result
       }
     
       override def prettyName: String = "transform"
     }
     
    +/**
    + * Filters entries in a map using the provided function.
    + */
    +@ExpressionDescription(
    +usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
    +examples = """
    +    Examples:
    +      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
    +       [1 -> 0, 3 -> -1]
    +  """,
    +since = "2.4.0")
    +case class MapFilter(
    +    input: Expression,
    +    function: Expression)
    +  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
    +
    +  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
    --- End diff --
    
    Maybe this should be a function in the `MapBasedUnaryHigherOrderFunction` object, so we can use it in other map-based higher-order functions, just like `ArrayBasedHigherOrderFunction.elementArgumentType`.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test FAILed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    retest this please


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by ueshin <gi...@git.apache.org>.

Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207702649
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -123,7 +125,10 @@ trait HigherOrderFunction extends Expression {
       }
     }
     
    -trait ArrayBasedHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
    +/**
    + * Trait for functions having as input one argument and one function.
    + */
    +trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
    --- End diff --
    
    Btw, how about defining `nullSafeEval` for `input` in this trait like `UnaryExpression`? (`nullInputSafeEval`?)
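
The suggestion above can be sketched roughly as follows: the trait performs the null check on its input once, and subclasses implement only the non-null case. This is a minimal, self-contained illustration and not the code from the PR. `Row`, `Expr`, `Column`, `SingleInputFunction`, and `Transform` are hypothetical stand-ins for Spark's `InternalRow`, `Expression`, and the higher-order function classes.

```scala
// Toy model: `Row` and `Expr` stand in for Spark's InternalRow and Expression.
object NullSafeEvalSketch {
  type Row = Map[String, Any]

  trait Expr {
    def eval(row: Row): Any
  }

  // Reads a column from the row; a missing column evaluates to null.
  final case class Column(name: String) extends Expr {
    def eval(row: Row): Any = row.getOrElse(name, null)
  }

  // Hypothetical trait: one input argument plus one lambda-like function.
  trait SingleInputFunction extends Expr {
    def input: Expr

    // The null check lives here once, instead of in every subclass.
    final def eval(row: Row): Any = {
      val value = input.eval(row)
      if (value == null) null else nullSafeEval(row, value)
    }

    // Subclasses only handle the non-null case.
    protected def nullSafeEval(row: Row, inputValue: Any): Any
  }

  // Example subclass: applies `f` to each element of a sequence input.
  final case class Transform(input: Expr, f: Any => Any) extends SingleInputFunction {
    protected def nullSafeEval(row: Row, inputValue: Any): Any =
      inputValue.asInstanceOf[Seq[Any]].map(f)
  }
}
```

With this shape, a subclass such as `Transform` never sees a null input, which mirrors the `eval` to `nullSafeEval` change visible in the quoted `ArrayTransform` diff elsewhere in this thread.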


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94273/
    Test FAILed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94289 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94289/testReport)** for PR 21986 at commit [`b58a1de`](https://github.com/apache/spark/commit/b58a1dec715a26aa8bd53efa102342afff44a896).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by mn-mikke <gi...@git.apache.org>.

Github user mn-mikke commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207908454
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -205,29 +230,85 @@ case class ArrayTransform(
         (elementVar, indexVar)
       }
     
    -  override def eval(input: InternalRow): Any = {
    -    val arr = this.input.eval(input).asInstanceOf[ArrayData]
    -    if (arr == null) {
    -      null
    -    } else {
    -      val f = functionForEval
    -      val result = new GenericArrayData(new Array[Any](arr.numElements))
    -      var i = 0
    -      while (i < arr.numElements) {
    -        elementVar.value.set(arr.get(i, elementVar.dataType))
    -        if (indexVar.isDefined) {
    -          indexVar.get.value.set(i)
    -        }
    -        result.update(i, f.eval(input))
    -        i += 1
    +  override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
    +    val arr = inputValue.asInstanceOf[ArrayData]
    +    val f = functionForEval
    +    val result = new GenericArrayData(new Array[Any](arr.numElements))
    +    var i = 0
    +    while (i < arr.numElements) {
    +      elementVar.value.set(arr.get(i, elementVar.dataType))
    +      if (indexVar.isDefined) {
    +        indexVar.get.value.set(i)
           }
    -      result
    +      result.update(i, f.eval(inputRow))
    +      i += 1
         }
    +    result
       }
     
       override def prettyName: String = "transform"
     }
     
    +/**
    + * Filters entries in a map using the provided function.
    + */
    +@ExpressionDescription(
    +usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
    +examples = """
    +    Examples:
    +      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
    +       [1 -> 0, 3 -> -1]
    +  """,
    +since = "2.4.0")
    +case class MapFilter(
    +    input: Expression,
    +    function: Expression)
    +  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
    +
    +  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
    +    case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
    +    case _ =>
    +      val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
    +      (kType, vType, vContainsNull)
    +  }
    +
    +  @transient lazy val (keyVar, valueVar) = {
    +    val args = function.asInstanceOf[LambdaFunction].arguments
    +    (args.head.asInstanceOf[NamedLambdaVariable], args.tail.head.asInstanceOf[NamedLambdaVariable])
    +  }
    +
    +  override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapFilter = {
    +    function match {
    +      case LambdaFunction(_, _, _) =>
    --- End diff --
    
    Is this pattern matching necessary? If so, shouldn't `ArrayFilter` use it as well?
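
For readers following the quoted diff, the documented behaviour of `map_filter` (keep only the entries for which the lambda returns true) can be modelled on plain Scala collections. This is only an illustration of the semantics under simplifying assumptions: `mapFilter` is a hypothetical helper, it does not touch Catalyst expressions or lambda binding, and it ignores null handling.

```scala
object MapFilterSketch {
  // Keeps only the entries for which the predicate over (key, value) holds.
  def mapFilter[K, V](m: Map[K, V])(p: (K, V) => Boolean): Map[K, V] =
    m.filter { case (k, v) => p(k, v) }
}
```

Applied to the map from the `ExpressionDescription` example, `MapFilterSketch.mapFilter(Map(1 -> 0, 2 -> 2, 3 -> -1))(_ > _)` keeps the entries `1 -> 0` and `3 -> -1`, matching the result shown in the diff.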


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94273 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94273/testReport)** for PR 21986 at commit [`37e221c`](https://github.com/apache/spark/commit/37e221c2eb79ec43e61ed2b4a61f206100eaeb42).


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94291 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94291/testReport)** for PR 21986 at commit [`9c25ae6`](https://github.com/apache/spark/commit/9c25ae66b7fd0e3d5f11e3e097af32ef72a55e76).


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94371/
    Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    cc @ueshin 


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1851/
    Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test FAILed.


---


[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94371/testReport)** for PR 21986 at commit [`af79644`](https://github.com/apache/spark/commit/af79644cb4687b6acb9a10548f05aef980f1882a).


---


[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by ueshin <gi...@git.apache.org>.

Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r208170769
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -205,29 +230,82 @@ case class ArrayTransform(
         (elementVar, indexVar)
       }
     
    -  override def eval(input: InternalRow): Any = {
    -    val arr = this.input.eval(input).asInstanceOf[ArrayData]
    -    if (arr == null) {
    -      null
    -    } else {
    -      val f = functionForEval
    -      val result = new GenericArrayData(new Array[Any](arr.numElements))
    -      var i = 0
    -      while (i < arr.numElements) {
    -        elementVar.value.set(arr.get(i, elementVar.dataType))
    -        if (indexVar.isDefined) {
    -          indexVar.get.value.set(i)
    -        }
    -        result.update(i, f.eval(input))
    -        i += 1
    +  override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
    +    val arr = inputValue.asInstanceOf[ArrayData]
    +    val f = functionForEval
    +    val result = new GenericArrayData(new Array[Any](arr.numElements))
    +    var i = 0
    +    while (i < arr.numElements) {
    +      elementVar.value.set(arr.get(i, elementVar.dataType))
    +      if (indexVar.isDefined) {
    +        indexVar.get.value.set(i)
           }
    -      result
    +      result.update(i, f.eval(inputRow))
    +      i += 1
         }
    +    result
       }
     
       override def prettyName: String = "transform"
     }
     
    +/**
    + * Filters entries in a map using the provided function.
    + */
    +@ExpressionDescription(
    +usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
    +examples = """
    +    Examples:
    +      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
    +       [1 -> 0, 3 -> -1]
    +  """,
    +since = "2.4.0")
    +case class MapFilter(
    +    input: Expression,
    +    function: Expression)
    +  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
    +
    +  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
    +    case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
    +    case _ =>
    +      val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
    +      (kType, vType, vContainsNull)
    +  }
    --- End diff --
    
    How about:
    
    1. rename `ArrayBasedHigherOrderFunction` object to `HigherOrderFunction`
    2. rename `elementArgumentType` method to `arrayElementArgumentType`
    3. move `keyValueArgumentType` to `HigherOrderFunction` object and rename to `mapKeyValueArgumentType`
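
A rough sketch of what the suggested layout could look like: a single `HigherOrderFunction` companion-style object holding both helpers. The data types below are simplified stand-ins written for this illustration, not Spark's catalyst classes, and the `NullType` fallbacks are an assumption modelled on the `defaultConcreteType` pattern visible in the diff.

```scala
// Simplified stand-in types for illustration only; NOT Spark's catalyst classes.
sealed trait DataType
case object NullType extends DataType
case object IntegerType extends DataType
case object StringType extends DataType
final case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataType
final case class MapType(keyType: DataType, valueType: DataType, valueContainsNull: Boolean)
  extends DataType

// One object holding the utility methods for both array-based and map-based functions.
object HigherOrderFunction {
  // Assumed fallbacks for inputs that are not an array or a map.
  private val defaultArray = ArrayType(NullType, containsNull = true)
  private val defaultMap = MapType(NullType, NullType, valueContainsNull = true)

  // Element type and nullability of the lambda argument for an array input.
  def arrayElementArgumentType(dt: DataType): (DataType, Boolean) = dt match {
    case ArrayType(elementType, containsNull) => (elementType, containsNull)
    case _ => (defaultArray.elementType, defaultArray.containsNull)
  }

  // Key type, value type and value nullability of the lambda arguments for a map input.
  def mapKeyValueArgumentType(dt: DataType): (DataType, DataType, Boolean) = dt match {
    case MapType(keyType, valueType, valueContainsNull) =>
      (keyType, valueType, valueContainsNull)
    case _ => (defaultMap.keyType, defaultMap.valueType, defaultMap.valueContainsNull)
  }
}
```

With helpers like these, a map-based function would not need its own pattern match on `input.dataType`; it could destructure the result of `mapKeyValueArgumentType` directly.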


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94141/


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94199 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94199/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes `
      * `trait ArrayBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
      * `trait MapBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
      * `case class MapFilter(`


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1772/


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test FAILed.


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test FAILed.


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1908/


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94280 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94280/testReport)** for PR 21986 at commit [`37e221c`](https://github.com/apache/spark/commit/37e221c2eb79ec43e61ed2b4a61f206100eaeb42).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    retest this please


---

[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by ueshin <gi...@git.apache.org>.

Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r208169432
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -205,29 +230,82 @@ case class ArrayTransform(
         (elementVar, indexVar)
       }
     
    -  override def eval(input: InternalRow): Any = {
    -    val arr = this.input.eval(input).asInstanceOf[ArrayData]
    -    if (arr == null) {
    -      null
    -    } else {
    -      val f = functionForEval
    -      val result = new GenericArrayData(new Array[Any](arr.numElements))
    -      var i = 0
    -      while (i < arr.numElements) {
    -        elementVar.value.set(arr.get(i, elementVar.dataType))
    -        if (indexVar.isDefined) {
    -          indexVar.get.value.set(i)
    -        }
    -        result.update(i, f.eval(input))
    -        i += 1
    +  override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
    +    val arr = inputValue.asInstanceOf[ArrayData]
    +    val f = functionForEval
    +    val result = new GenericArrayData(new Array[Any](arr.numElements))
    +    var i = 0
    +    while (i < arr.numElements) {
    +      elementVar.value.set(arr.get(i, elementVar.dataType))
    +      if (indexVar.isDefined) {
    +        indexVar.get.value.set(i)
           }
    -      result
    +      result.update(i, f.eval(inputRow))
    +      i += 1
         }
    +    result
       }
     
       override def prettyName: String = "transform"
     }
     
    +/**
    + * Filters entries in a map using the provided function.
    + */
    +@ExpressionDescription(
    +usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
    +examples = """
    +    Examples:
    +      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
    +       [1 -> 0, 3 -> -1]
    +  """,
    +since = "2.4.0")
    +case class MapFilter(
    +    input: Expression,
    +    function: Expression)
    +  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
    +
    +  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
    +    case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
    +    case _ =>
    +      val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
    +      (kType, vType, vContainsNull)
    +  }
    --- End diff --
    
    Hmm, is there something wrong with introducing an object to hold the util methods?


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94367 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94367/testReport)** for PR 21986 at commit [`af79644`](https://github.com/apache/spark/commit/af79644cb4687b6acb9a10548f05aef980f1882a).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94371/testReport)** for PR 21986 at commit [`af79644`](https://github.com/apache/spark/commit/af79644cb4687b6acb9a10548f05aef980f1882a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    retest this please


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test FAILed.


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94363 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94363/testReport)** for PR 21986 at commit [`1823fb2`](https://github.com/apache/spark/commit/1823fb279b1e5ed7b55d6e27ede27982ce94d922).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `trait SimpleHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes `
      * `trait ArrayBasedSimpleHigherOrderFunction extends SimpleHigherOrderFunction `
      * `trait MapBasedSimpleHigherOrderFunction extends SimpleHigherOrderFunction `


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94367 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94367/testReport)** for PR 21986 at commit [`af79644`](https://github.com/apache/spark/commit/af79644cb4687b6acb9a10548f05aef980f1882a).


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94367/


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94291/


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---

[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by ueshin <gi...@git.apache.org>.

Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21986#discussion_r207702606
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
    @@ -123,7 +125,10 @@ trait HigherOrderFunction extends Expression {
       }
     }
     
    -trait ArrayBasedHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
    +/**
    + * Trait for functions having as input one argument and one function.
    + */
    +trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
    --- End diff --
    
    I like this trait but I'm not sure whether we can say `"Unary"HigherOrderFunction` for this.
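
To make the trait's contract concrete, here is a minimal sketch of "one argument and one function" with a shared null-safe wrapper. Everything in it is a hypothetical stand-in built on plain Scala collections, not Spark's `Expression` API: `Option` models a SQL NULL input, and the name `SimpleHigherOrderFunction` is borrowed from a later build report elsewhere in this thread.

```scala
// Hypothetical, simplified stand-ins; not Spark's Expression API.
trait SimpleHigherOrderFunction[In, Out] {
  // The single data argument; None models a SQL NULL input.
  def input: Option[In]

  // Implemented by concrete functions; only called for non-null input.
  protected def nullSafeEval(value: In): Out

  // Null in, null out: the wrapper every implementation shares.
  final def eval(): Option[Out] = input.map(nullSafeEval)
}

// Keeps only the map entries for which the (key, value) predicate holds.
final case class MapFilter[K, V](input: Option[Map[K, V]], function: (K, V) => Boolean)
    extends SimpleHigherOrderFunction[Map[K, V], Map[K, V]] {
  protected def nullSafeEval(value: Map[K, V]): Map[K, V] =
    value.filter { case (k, v) => function(k, v) }
}
```

Using the example from the `MapFilter` docstring in the diff, `MapFilter(Some(Map(1 -> 0, 2 -> 2, 3 -> -1)), (k: Int, v: Int) => k > v).eval()` keeps the entries `1 -> 0` and `3 -> -1`, while a `None` input evaluates to `None` without invoking the predicate.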


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1762/


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94289 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94289/testReport)** for PR 21986 at commit [`b58a1de`](https://github.com/apache/spark/commit/b58a1dec715a26aa8bd53efa102342afff44a896).


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94170 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94170/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes `
      * `trait ArrayBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
      * `trait MapBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
      * `case class MapFilter(`


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1791/


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1838/


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    **[Test build #94363 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94363/testReport)** for PR 21986 at commit [`1823fb2`](https://github.com/apache/spark/commit/1823fb279b1e5ed7b55d6e27ede27982ce94d922).


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test FAILed.


---

[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21986
  
    Merged build finished. Test PASSed.


---
