Posted to reviews@spark.apache.org by mgaido91 <gi...@git.apache.org> on 2018/08/03 15:21:14 UTC
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
GitHub user mgaido91 opened a pull request:
https://github.com/apache/spark/pull/21986
[SPARK-23937][SQL] Add map_filter SQL function
## What changes were proposed in this pull request?
The PR adds the higher-order function `map_filter`, which filters the entries of a map and returns a new map containing only the entries that satisfy the filter function.
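As a rough illustration of these semantics (not the Catalyst implementation in this PR), the behaviour can be modelled on a plain Scala `Map`. The names `MapFilterSketch` and `mapFilter` are made up for this sketch; the input mirrors the SQL example in the expression description.

```scala
// Plain-Scala model of map_filter semantics; an illustrative sketch only.
object MapFilterSketch {
  // Returns a new map holding only the entries for which the predicate is true.
  def mapFilter[K, V](m: Map[K, V])(p: (K, V) => Boolean): Map[K, V] =
    m.filter { case (k, v) => p(k, v) }

  def main(args: Array[String]): Unit = {
    val input = Map(1 -> 0, 2 -> 2, 3 -> -1)
    // Keep entries whose key is greater than their value.
    val result = mapFilter(input)((k, v) => k > v)
    assert(result == Map(1 -> 0, 3 -> -1))
    println(result)
  }
}
```

The entry `2 -> 2` is dropped because the key is not strictly greater than the value.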
## How was this patch tested?
Added unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mgaido91/spark SPARK-23937
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21986.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21986
----
commit 3f88e2a927c22f4fc509b8ca96027ef381f7fe84
Author: Marco Gaido <ma...@...>
Date: 2018-08-03T15:16:11Z
[SPARK-23937][SQL] Add map_filter SQL function
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207965203
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -205,29 +230,82 @@ case class ArrayTransform(
(elementVar, indexVar)
}
- override def eval(input: InternalRow): Any = {
- val arr = this.input.eval(input).asInstanceOf[ArrayData]
- if (arr == null) {
- null
- } else {
- val f = functionForEval
- val result = new GenericArrayData(new Array[Any](arr.numElements))
- var i = 0
- while (i < arr.numElements) {
- elementVar.value.set(arr.get(i, elementVar.dataType))
- if (indexVar.isDefined) {
- indexVar.get.value.set(i)
- }
- result.update(i, f.eval(input))
- i += 1
+ override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
+ val arr = inputValue.asInstanceOf[ArrayData]
+ val f = functionForEval
+ val result = new GenericArrayData(new Array[Any](arr.numElements))
+ var i = 0
+ while (i < arr.numElements) {
+ elementVar.value.set(arr.get(i, elementVar.dataType))
+ if (indexVar.isDefined) {
+ indexVar.get.value.set(i)
}
- result
+ result.update(i, f.eval(inputRow))
+ i += 1
}
+ result
}
override def prettyName: String = "transform"
}
+/**
+ * Filters entries in a map using the provided function.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+ [1 -> 0, 3 -> -1]
+ """,
+since = "2.4.0")
+case class MapFilter(
+ input: Expression,
+ function: Expression)
+ extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
+
+ @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
+ case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
+ case _ =>
+ val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
+ (kType, vType, vContainsNull)
+ }
--- End diff --
Sorry, I meant something like:
```scala
object MapBasedUnaryHigherOrderFunction {
def keyValueArgumentType(dt: DataType): (DataType, DataType, Boolean) = {
dt match {
case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
case _ =>
val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
(kType, vType, vContainsNull)
}
}
}
...
case class MapFilter( ... ) {
...
@transient val (keyType, valueType, valueContainsNull) =
MapBasedUnaryHigherOrderFunction.keyValueArgumentType(input.dataType)
...
}
```
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207703208
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -210,3 +221,66 @@ case class ArrayTransform(
override def prettyName: String = "transform"
}
+
+/**
+ * Filters entries in a map using the provided function.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+ [1 -> 0, 3 -> -1]
+ """,
+since = "2.4.0")
+case class MapFilter(
+ input: Expression,
+ function: Expression)
+ extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
+
+ @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
+ case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
+ case _ =>
+ val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
+ (kType, vType, vContainsNull)
+ }
+
+ @transient lazy val (keyVar, valueVar) = {
+ val args = function.asInstanceOf[LambdaFunction].arguments
+ (args.head.asInstanceOf[NamedLambdaVariable], args.tail.head.asInstanceOf[NamedLambdaVariable])
+ }
+
+ override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapFilter = {
+ function match {
+ case LambdaFunction(_, _, _) =>
+ copy(function = f(function, (keyType, false) :: (valueType, valueContainsNull) :: Nil))
+ }
+ }
+
+ override def nullable: Boolean = input.nullable
+
+ override def eval(input: InternalRow): Any = {
+ val m = this.input.eval(input).asInstanceOf[MapData]
+ if (m == null) {
+ null
+ } else {
+ val retKeys = new mutable.ListBuffer[Any]
+ val retValues = new mutable.ListBuffer[Any]
--- End diff --
I'm just curious: is `ListBuffer` better than `ArrayBuffer` here? If so, should we rewrite `ArrayFilter` to use it as well?
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94272/testReport)** for PR 21986 at commit [`9bbaa3b`](https://github.com/apache/spark/commit/9bbaa3b18493fe5e77652b7f39bdc5a6732771bb).
* This patch passes all tests.
* This patch **does not merge cleanly**.
* This patch adds no public classes.
---
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207921448
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -205,29 +230,85 @@ case class ArrayTransform(
(elementVar, indexVar)
}
- override def eval(input: InternalRow): Any = {
- val arr = this.input.eval(input).asInstanceOf[ArrayData]
- if (arr == null) {
- null
- } else {
- val f = functionForEval
- val result = new GenericArrayData(new Array[Any](arr.numElements))
- var i = 0
- while (i < arr.numElements) {
- elementVar.value.set(arr.get(i, elementVar.dataType))
- if (indexVar.isDefined) {
- indexVar.get.value.set(i)
- }
- result.update(i, f.eval(input))
- i += 1
+ override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
+ val arr = inputValue.asInstanceOf[ArrayData]
+ val f = functionForEval
+ val result = new GenericArrayData(new Array[Any](arr.numElements))
+ var i = 0
+ while (i < arr.numElements) {
+ elementVar.value.set(arr.get(i, elementVar.dataType))
+ if (indexVar.isDefined) {
+ indexVar.get.value.set(i)
}
- result
+ result.update(i, f.eval(inputRow))
+ i += 1
}
+ result
}
override def prettyName: String = "transform"
}
+/**
+ * Filters entries in a map using the provided function.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+ [1 -> 0, 3 -> -1]
+ """,
+since = "2.4.0")
+case class MapFilter(
+ input: Expression,
+ function: Expression)
+ extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
+
+ @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
+ case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
+ case _ =>
+ val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
+ (kType, vType, vContainsNull)
+ }
+
+ @transient lazy val (keyVar, valueVar) = {
+ val args = function.asInstanceOf[LambdaFunction].arguments
+ (args.head.asInstanceOf[NamedLambdaVariable], args.tail.head.asInstanceOf[NamedLambdaVariable])
+ }
+
+ override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapFilter = {
+ function match {
+ case LambdaFunction(_, _, _) =>
--- End diff --
Right, I am removing it, thanks.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94272/testReport)** for PR 21986 at commit [`9bbaa3b`](https://github.com/apache/spark/commit/9bbaa3b18493fe5e77652b7f39bdc5a6732771bb).
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94280 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94280/testReport)** for PR 21986 at commit [`37e221c`](https://github.com/apache/spark/commit/37e221c2eb79ec43e61ed2b4a61f206100eaeb42).
---
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207816072
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -210,3 +221,66 @@ case class ArrayTransform(
override def prettyName: String = "transform"
}
+
+/**
+ * Filters entries in a map using the provided function.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+ [1 -> 0, 3 -> -1]
+ """,
+since = "2.4.0")
+case class MapFilter(
+ input: Expression,
+ function: Expression)
+ extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
+
+ @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
+ case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
+ case _ =>
+ val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
+ (kType, vType, vContainsNull)
+ }
+
+ @transient lazy val (keyVar, valueVar) = {
+ val args = function.asInstanceOf[LambdaFunction].arguments
+ (args.head.asInstanceOf[NamedLambdaVariable], args.tail.head.asInstanceOf[NamedLambdaVariable])
+ }
+
+ override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapFilter = {
+ function match {
+ case LambdaFunction(_, _, _) =>
+ copy(function = f(function, (keyType, false) :: (valueType, valueContainsNull) :: Nil))
+ }
+ }
+
+ override def nullable: Boolean = input.nullable
+
+ override def eval(input: InternalRow): Any = {
+ val m = this.input.eval(input).asInstanceOf[MapData]
+ if (m == null) {
+ null
+ } else {
+ val retKeys = new mutable.ListBuffer[Any]
+ val retValues = new mutable.ListBuffer[Any]
--- End diff --
But I just checked, and in `ArrayFilter` you initialize the buffer with the number of incoming elements. So I think there is no difference in performance: since that number is an upper bound on the number of output elements, we can be sure no copy is performed while appending.
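A minimal sketch of that pre-sizing idea, using plain `Int` arrays rather than Spark's `ArrayData`. The names `PresizedBufferSketch` and `keepPositive` are hypothetical, and this is not the actual `ArrayFilter` code.

```scala
import scala.collection.mutable

// Illustrative only: a filter that pre-sizes its ArrayBuffer.
object PresizedBufferSketch {
  def keepPositive(input: Array[Int]): Array[Int] = {
    // The initial capacity equals the input length, which is an upper bound
    // on the output size, so the buffer never has to grow while appending.
    val buffer = new mutable.ArrayBuffer[Int](input.length)
    var i = 0
    while (i < input.length) {
      if (input(i) > 0) buffer += input(i)
      i += 1
    }
    // Converting to an array still copies the kept elements once.
    buffer.toArray
  }
}
```

The trade-off is that the buffer may reserve more space than the output needs when few elements pass the filter.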
---
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207702742
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HigherOrderFunctionsSuite.scala ---
@@ -94,4 +94,53 @@ class HigherOrderFunctionsSuite extends SparkFunSuite with ExpressionEvalHelper
checkEvaluation(transform(aai, array => Cast(transform(array, plusIndex), StringType)),
Seq("[1, 3, 5]", null, "[4, 6]"))
}
+
+ test("MapFilter") {
+ def mapFilter(expr: Expression, f: (Expression, Expression) => Expression): Expression = {
+ val mt = expr.dataType.asInstanceOf[MapType]
+ MapFilter(expr, createLambda(mt.keyType, false, mt.valueType, mt.valueContainsNull, f))
+ }
+ val mii0 = Literal.create(Map(1 -> 0, 2 -> 10, 3 -> -1),
+ MapType(IntegerType, IntegerType, valueContainsNull = false))
+ val mii1 = Literal.create(Map(1 -> null, 2 -> 10, 3 -> null),
+ MapType(IntegerType, IntegerType, valueContainsNull = true))
+ val miin = Literal.create(null, MapType(IntegerType, IntegerType, valueContainsNull = false))
+
+ val kGreaterThanV: (Expression, Expression) => Expression = (k, v) => k > v
+
+ checkEvaluation(mapFilter(mii0, kGreaterThanV), Map(1 -> 0, 3 -> -1))
+ checkEvaluation(mapFilter(mii1, kGreaterThanV), Map())
+ checkEvaluation(mapFilter(miin, kGreaterThanV), null)
+
+ val valueNull: (Expression, Expression) => Expression = (_, v) => v.isNull
--- End diff --
nit: `valueIsNull`?
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Build finished. Test PASSed.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94289/
Test PASSed.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94141 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94141/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes `
* `trait ArrayBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
* `trait MapBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
* `case class MapFilter(`
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1911/
Test PASSed.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21986
retest this please
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1904/
Test PASSed.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94170 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94170/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).
---
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207923151
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -205,29 +230,82 @@ case class ArrayTransform(
(elementVar, indexVar)
}
- override def eval(input: InternalRow): Any = {
- val arr = this.input.eval(input).asInstanceOf[ArrayData]
- if (arr == null) {
- null
- } else {
- val f = functionForEval
- val result = new GenericArrayData(new Array[Any](arr.numElements))
- var i = 0
- while (i < arr.numElements) {
- elementVar.value.set(arr.get(i, elementVar.dataType))
- if (indexVar.isDefined) {
- indexVar.get.value.set(i)
- }
- result.update(i, f.eval(input))
- i += 1
+ override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
+ val arr = inputValue.asInstanceOf[ArrayData]
+ val f = functionForEval
+ val result = new GenericArrayData(new Array[Any](arr.numElements))
+ var i = 0
+ while (i < arr.numElements) {
+ elementVar.value.set(arr.get(i, elementVar.dataType))
+ if (indexVar.isDefined) {
+ indexVar.get.value.set(i)
}
- result
+ result.update(i, f.eval(inputRow))
+ i += 1
}
+ result
}
override def prettyName: String = "transform"
}
+/**
+ * Filters entries in a map using the provided function.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+ [1 -> 0, 3 -> -1]
+ """,
+since = "2.4.0")
+case class MapFilter(
+ input: Expression,
+ function: Expression)
+ extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
+
+ @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
+ case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
+ case _ =>
+ val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
+ (kType, vType, vContainsNull)
+ }
--- End diff --
How about extracting this to `object MapBasedUnaryHigherOrderFunction`, like the array-based one? We'll need it in the other map-based functions.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94141 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94141/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).
---
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207813180
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -210,3 +221,66 @@ case class ArrayTransform(
override def prettyName: String = "transform"
}
+
+/**
+ * Filters entries in a map using the provided function.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+ [1 -> 0, 3 -> -1]
+ """,
+since = "2.4.0")
+case class MapFilter(
+ input: Expression,
+ function: Expression)
+ extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
+
+ @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
+ case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
+ case _ =>
+ val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
+ (kType, vType, vContainsNull)
+ }
+
+ @transient lazy val (keyVar, valueVar) = {
+ val args = function.asInstanceOf[LambdaFunction].arguments
+ (args.head.asInstanceOf[NamedLambdaVariable], args.tail.head.asInstanceOf[NamedLambdaVariable])
+ }
+
+ override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapFilter = {
+ function match {
+ case LambdaFunction(_, _, _) =>
+ copy(function = f(function, (keyType, false) :: (valueType, valueContainsNull) :: Nil))
+ }
+ }
+
+ override def nullable: Boolean = input.nullable
+
+ override def eval(input: InternalRow): Any = {
+ val m = this.input.eval(input).asInstanceOf[MapData]
+ if (m == null) {
+ null
+ } else {
+ val retKeys = new mutable.ListBuffer[Any]
+ val retValues = new mutable.ListBuffer[Any]
--- End diff --
I think it is better, since here we are always appending (and then creating an array from the result). Appending a value is always O(1) for `ListBuffer`, whereas for `ArrayBuffer` it is O(1) only if the underlying array has room for one more element, and O(n) otherwise (since it has to allocate a new array and copy the old one). As the initial length of the underlying array in `ArrayBuffer` is 16, `ListBuffer` saves at least one copy whenever the output has more than 16 elements.
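The append-then-convert pattern discussed here can be sketched with plain arrays instead of `MapData`. The names `BufferAppendSketch` and `filterEntries` are hypothetical, the `k > v` predicate is borrowed from the example in the expression description, and the sketch assumes both input arrays have the same length.

```scala
import scala.collection.mutable

// Illustrative only: collect the surviving entries with ListBuffer.
object BufferAppendSketch {
  def filterEntries(keys: Array[Int], values: Array[Int]): (Array[Int], Array[Int]) = {
    // ListBuffer appends add a node to a linked list, so no backing array
    // is ever reallocated and copied during the loop.
    val retKeys = new mutable.ListBuffer[Int]
    val retValues = new mutable.ListBuffer[Int]
    var i = 0
    while (i < keys.length) {
      if (keys(i) > values(i)) {
        retKeys += keys(i)
        retValues += values(i)
      }
      i += 1
    }
    // A single copy into arrays happens once, at the end.
    (retKeys.toArray, retValues.toArray)
  }
}
```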
---
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/21986
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/21986
LGTM.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94291 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94291/testReport)** for PR 21986 at commit [`9c25ae6`](https://github.com/apache/spark/commit/9c25ae66b7fd0e3d5f11e3e097af32ef72a55e76).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1839/
Test PASSed.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94199/
Test FAILed.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94170/
Test FAILed.
---
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207921840
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -123,7 +125,10 @@ trait HigherOrderFunction extends Expression {
}
}
-trait ArrayBasedHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
+/**
+ * Trait for functions having as input one argument and one function.
+ */
+trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
--- End diff --
cc @hvanhovell for the naming?
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94199 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94199/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1849/
Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94272/
Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94273 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94273/testReport)** for PR 21986 at commit [`37e221c`](https://github.com/apache/spark/commit/37e221c2eb79ec43e61ed2b4a61f206100eaeb42).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Build finished. Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1842/
Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/21986
Thanks! merging to master.
---
---------------------------------------------------------------------
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r208181612
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -205,29 +230,82 @@ case class ArrayTransform(
(elementVar, indexVar)
}
- override def eval(input: InternalRow): Any = {
- val arr = this.input.eval(input).asInstanceOf[ArrayData]
- if (arr == null) {
- null
- } else {
- val f = functionForEval
- val result = new GenericArrayData(new Array[Any](arr.numElements))
- var i = 0
- while (i < arr.numElements) {
- elementVar.value.set(arr.get(i, elementVar.dataType))
- if (indexVar.isDefined) {
- indexVar.get.value.set(i)
- }
- result.update(i, f.eval(input))
- i += 1
+ override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
+ val arr = inputValue.asInstanceOf[ArrayData]
+ val f = functionForEval
+ val result = new GenericArrayData(new Array[Any](arr.numElements))
+ var i = 0
+ while (i < arr.numElements) {
+ elementVar.value.set(arr.get(i, elementVar.dataType))
+ if (indexVar.isDefined) {
+ indexVar.get.value.set(i)
}
- result
+ result.update(i, f.eval(inputRow))
+ i += 1
}
+ result
}
override def prettyName: String = "transform"
}
+/**
+ * Filters entries in a map using the provided function.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+ [1 -> 0, 3 -> -1]
+ """,
+since = "2.4.0")
+case class MapFilter(
+ input: Expression,
+ function: Expression)
+ extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
+
+ @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
+ case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
+ case _ =>
+ val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
+ (kType, vType, vContainsNull)
+ }
--- End diff --
Oh, sorry, I hadn't read your comment carefully; now I see what you meant. Yes, I agree with unifying them in a helper object. I am updating accordingly. Thanks.
---
---------------------------------------------------------------------
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207954320
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -123,7 +125,10 @@ trait HigherOrderFunction extends Expression {
}
}
-trait ArrayBasedHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
+/**
+ * Trait for functions having as input one argument and one function.
+ */
+trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
--- End diff --
We use the term `Unary` a lot, and this usage differs from the others. The name should convey a HigherOrderFunction that uses only a single (lambda) function, right? The only thing I can come up with is `SingleHigherOrderFunction`. `Simple` would probably also be fine.
---
---------------------------------------------------------------------
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207808648
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -123,7 +125,10 @@ trait HigherOrderFunction extends Expression {
}
}
-trait ArrayBasedHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
+/**
+ * Trait for functions having as input one argument and one function.
+ */
+trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
--- End diff --
I called it `Unary` because it takes one input and one function. Honestly, I can't think of a better name without becoming very verbose. If you have a better suggestion, I am happy to follow it. I will add the `nullSafeEval`, thanks!
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94280/
Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94363/
Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207924294
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -205,29 +230,82 @@ case class ArrayTransform(
(elementVar, indexVar)
}
- override def eval(input: InternalRow): Any = {
- val arr = this.input.eval(input).asInstanceOf[ArrayData]
- if (arr == null) {
- null
- } else {
- val f = functionForEval
- val result = new GenericArrayData(new Array[Any](arr.numElements))
- var i = 0
- while (i < arr.numElements) {
- elementVar.value.set(arr.get(i, elementVar.dataType))
- if (indexVar.isDefined) {
- indexVar.get.value.set(i)
- }
- result.update(i, f.eval(input))
- i += 1
+ override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
+ val arr = inputValue.asInstanceOf[ArrayData]
+ val f = functionForEval
+ val result = new GenericArrayData(new Array[Any](arr.numElements))
+ var i = 0
+ while (i < arr.numElements) {
+ elementVar.value.set(arr.get(i, elementVar.dataType))
+ if (indexVar.isDefined) {
+ indexVar.get.value.set(i)
}
- result
+ result.update(i, f.eval(inputRow))
+ i += 1
}
+ result
}
override def prettyName: String = "transform"
}
+/**
+ * Filters entries in a map using the provided function.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+ [1 -> 0, 3 -> -1]
+ """,
+since = "2.4.0")
+case class MapFilter(
+ input: Expression,
+ function: Expression)
+ extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
+
+ @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
--- End diff --
Maybe this should be a function in the `MapBasedUnaryHigherOrderFunction` object, so that other map-based higher-order functions can use it, just as `ArrayBasedHigherOrderFunction.elementArgumentType` is used for arrays.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/21986
retest this please
---
---------------------------------------------------------------------
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207702649
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -123,7 +125,10 @@ trait HigherOrderFunction extends Expression {
}
}
-trait ArrayBasedHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
+/**
+ * Trait for functions having as input one argument and one function.
+ */
+trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
--- End diff --
Btw, how about defining `nullSafeEval` for `input` in this trait like `UnaryExpression`? (`nullInputSafeEval`?)
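The idea can be sketched roughly as below. This is a simplified, self-contained illustration and not Spark's code: `Row`, `Expr`, `Literal`, `SimpleHigherOrderFunction` and `Transform` are hypothetical stand-ins for `InternalRow`, `Expression` and the traits under discussion. The trait evaluates the input argument once, returns null when the input is null, and otherwise delegates to `nullSafeEval`, in the spirit of `UnaryExpression`.

```scala
// Hypothetical stand-ins; none of these are Spark classes.
object NullSafeSketch {
  type Row = Map[String, Any]

  trait Expr {
    def eval(row: Row): Any
  }

  // A constant expression, used only to drive the example.
  final case class Literal(value: Any) extends Expr {
    def eval(row: Row): Any = value
  }

  // Trait for functions taking one input argument and one lambda.
  trait SimpleHigherOrderFunction extends Expr {
    def input: Expr

    // Called only when the input evaluated to a non-null value.
    protected def nullSafeEval(row: Row, inputValue: Any): Any

    // The null check lives here once, so subclasses need not repeat it.
    final def eval(row: Row): Any = {
      val value = input.eval(row)
      if (value == null) null else nullSafeEval(row, value)
    }
  }

  // Example subclass: applies a plain Scala function to each element.
  final case class Transform(input: Expr, f: Any => Any)
      extends SimpleHigherOrderFunction {
    protected def nullSafeEval(row: Row, inputValue: Any): Any =
      inputValue.asInstanceOf[Seq[Any]].map(f)
  }
}
```

Under these assumptions, a subclass such as `Transform` only implements the non-null path; a null input short-circuits in the trait's `eval` before `nullSafeEval` is ever called.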
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94273/
Test FAILed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94289 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94289/testReport)** for PR 21986 at commit [`b58a1de`](https://github.com/apache/spark/commit/b58a1dec715a26aa8bd53efa102342afff44a896).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by mn-mikke <gi...@git.apache.org>.
Github user mn-mikke commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207908454
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -205,29 +230,85 @@ case class ArrayTransform(
(elementVar, indexVar)
}
- override def eval(input: InternalRow): Any = {
- val arr = this.input.eval(input).asInstanceOf[ArrayData]
- if (arr == null) {
- null
- } else {
- val f = functionForEval
- val result = new GenericArrayData(new Array[Any](arr.numElements))
- var i = 0
- while (i < arr.numElements) {
- elementVar.value.set(arr.get(i, elementVar.dataType))
- if (indexVar.isDefined) {
- indexVar.get.value.set(i)
- }
- result.update(i, f.eval(input))
- i += 1
+ override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
+ val arr = inputValue.asInstanceOf[ArrayData]
+ val f = functionForEval
+ val result = new GenericArrayData(new Array[Any](arr.numElements))
+ var i = 0
+ while (i < arr.numElements) {
+ elementVar.value.set(arr.get(i, elementVar.dataType))
+ if (indexVar.isDefined) {
+ indexVar.get.value.set(i)
}
- result
+ result.update(i, f.eval(inputRow))
+ i += 1
}
+ result
}
override def prettyName: String = "transform"
}
+/**
+ * Filters entries in a map using the provided function.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+ [1 -> 0, 3 -> -1]
+ """,
+since = "2.4.0")
+case class MapFilter(
+ input: Expression,
+ function: Expression)
+ extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
+
+ @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
+ case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
+ case _ =>
+ val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
+ (kType, vType, vContainsNull)
+ }
+
+ @transient lazy val (keyVar, valueVar) = {
+ val args = function.asInstanceOf[LambdaFunction].arguments
+ (args.head.asInstanceOf[NamedLambdaVariable], args.tail.head.asInstanceOf[NamedLambdaVariable])
+ }
+
+ override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): MapFilter = {
+ function match {
+ case LambdaFunction(_, _, _) =>
--- End diff --
Is this pattern matching necessary? If so, shouldn't `ArrayFilter` use it as well?
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94273 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94273/testReport)** for PR 21986 at commit [`37e221c`](https://github.com/apache/spark/commit/37e221c2eb79ec43e61ed2b4a61f206100eaeb42).
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94291 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94291/testReport)** for PR 21986 at commit [`9c25ae6`](https://github.com/apache/spark/commit/9c25ae66b7fd0e3d5f11e3e097af32ef72a55e76).
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94371/
Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/21986
cc @ueshin
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1851/
Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94371/testReport)** for PR 21986 at commit [`af79644`](https://github.com/apache/spark/commit/af79644cb4687b6acb9a10548f05aef980f1882a).
---
---------------------------------------------------------------------
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r208170769
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -205,29 +230,82 @@ case class ArrayTransform(
(elementVar, indexVar)
}
- override def eval(input: InternalRow): Any = {
- val arr = this.input.eval(input).asInstanceOf[ArrayData]
- if (arr == null) {
- null
- } else {
- val f = functionForEval
- val result = new GenericArrayData(new Array[Any](arr.numElements))
- var i = 0
- while (i < arr.numElements) {
- elementVar.value.set(arr.get(i, elementVar.dataType))
- if (indexVar.isDefined) {
- indexVar.get.value.set(i)
- }
- result.update(i, f.eval(input))
- i += 1
+ override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
+ val arr = inputValue.asInstanceOf[ArrayData]
+ val f = functionForEval
+ val result = new GenericArrayData(new Array[Any](arr.numElements))
+ var i = 0
+ while (i < arr.numElements) {
+ elementVar.value.set(arr.get(i, elementVar.dataType))
+ if (indexVar.isDefined) {
+ indexVar.get.value.set(i)
}
- result
+ result.update(i, f.eval(inputRow))
+ i += 1
}
+ result
}
override def prettyName: String = "transform"
}
+/**
+ * Filters entries in a map using the provided function.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+ [1 -> 0, 3 -> -1]
+ """,
+since = "2.4.0")
+case class MapFilter(
+ input: Expression,
+ function: Expression)
+ extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
+
+ @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
+ case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
+ case _ =>
+ val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
+ (kType, vType, vContainsNull)
+ }
--- End diff --
How about:
1. rename `ArrayBasedHigherOrderFunction` object to `HigherOrderFunction`
2. rename `elementArgumentType` method to `arrayElementArgumentType`
3. move `keyValueArgumentType` to `HigherOrderFunction` object and rename to `mapKeyValueArgumentType`
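A rough sketch of how the result of these three steps might look. The type definitions are simplified stand-ins for Spark's `DataType` hierarchy so that the block is self-contained; the fallback values and the return shapes are assumptions for illustration, not the merged code.

```scala
object HelperSketch {
  // Simplified stand-ins for Spark's DataType, ArrayType and MapType.
  sealed trait DataType
  case object IntegerType extends DataType
  case object StringType extends DataType
  case object NullType extends DataType
  final case class ArrayType(elementType: DataType, containsNull: Boolean)
      extends DataType
  final case class MapType(
      keyType: DataType,
      valueType: DataType,
      valueContainsNull: Boolean) extends DataType

  // One shared helper object instead of separate array and map helpers.
  object HigherOrderFunction {
    // Element type and nullability of an array argument.
    def arrayElementArgumentType(dt: DataType): (DataType, Boolean) = dt match {
      case ArrayType(elementType, containsNull) => (elementType, containsNull)
      // Assumed fallback for unresolved or non-array input.
      case _ => (NullType, true)
    }

    // Key type, value type and value nullability of a map argument.
    def mapKeyValueArgumentType(dt: DataType): (DataType, DataType, Boolean) =
      dt match {
        case MapType(keyType, valueType, valueContainsNull) =>
          (keyType, valueType, valueContainsNull)
        // Assumed fallback for unresolved or non-map input.
        case _ => (NullType, NullType, true)
      }
  }
}
```

With a helper of this shape, `MapFilter` would no longer need its own pattern match on `input.dataType`; it could destructure the tuple returned by `mapKeyValueArgumentType` instead.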
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94141/
Test FAILed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94199 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94199/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes `
* `trait ArrayBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
* `trait MapBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
* `case class MapFilter(`
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1772/
Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1908/
Test PASSed.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94280 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94280/testReport)** for PR 21986 at commit [`37e221c`](https://github.com/apache/spark/commit/37e221c2eb79ec43e61ed2b4a61f206100eaeb42).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/21986
retest this please
---
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r208169432
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -205,29 +230,82 @@ case class ArrayTransform(
     (elementVar, indexVar)
   }
-  override def eval(input: InternalRow): Any = {
-    val arr = this.input.eval(input).asInstanceOf[ArrayData]
-    if (arr == null) {
-      null
-    } else {
-      val f = functionForEval
-      val result = new GenericArrayData(new Array[Any](arr.numElements))
-      var i = 0
-      while (i < arr.numElements) {
-        elementVar.value.set(arr.get(i, elementVar.dataType))
-        if (indexVar.isDefined) {
-          indexVar.get.value.set(i)
-        }
-        result.update(i, f.eval(input))
-        i += 1
+  override def nullSafeEval(inputRow: InternalRow, inputValue: Any): Any = {
+    val arr = inputValue.asInstanceOf[ArrayData]
+    val f = functionForEval
+    val result = new GenericArrayData(new Array[Any](arr.numElements))
+    var i = 0
+    while (i < arr.numElements) {
+      elementVar.value.set(arr.get(i, elementVar.dataType))
+      if (indexVar.isDefined) {
+        indexVar.get.value.set(i)
       }
-      result
+      result.update(i, f.eval(inputRow))
+      i += 1
     }
+    result
   }
   override def prettyName: String = "transform"
 }
+/**
+ * Filters entries in a map using the provided function.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Filters entries in a map using the function.",
+examples = """
+    Examples:
+      > SELECT _FUNC_(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
+       [1 -> 0, 3 -> -1]
+  """,
+since = "2.4.0")
+case class MapFilter(
+    input: Expression,
+    function: Expression)
+  extends MapBasedUnaryHigherOrderFunction with CodegenFallback {
+
+  @transient val (keyType, valueType, valueContainsNull) = input.dataType match {
+    case MapType(kType, vType, vContainsNull) => (kType, vType, vContainsNull)
+    case _ =>
+      val MapType(kType, vType, vContainsNull) = MapType.defaultConcreteType
+      (kType, vType, vContainsNull)
+  }
--- End diff --
Hmm, is there something wrong with introducing an object to hold the util methods?
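For readers following the diff, the `map_filter` example in the `@ExpressionDescription` (keep entries where `k > v`) can be mirrored with ordinary Scala collections. This is only a sketch of the semantics, not the Catalyst implementation; `MapFilterSketch` and `mapFilter` are hypothetical names introduced for illustration.

```scala
// Plain-Scala sketch of the map_filter semantics described in the diff.
// It does not use Spark; MapFilterSketch and mapFilter are illustrative names.
object MapFilterSketch {
  // Returns a new map holding only the entries for which the predicate is true.
  def mapFilter[K, V](m: Map[K, V])(p: (K, V) => Boolean): Map[K, V] =
    m.filter { case (k, v) => p(k, v) }

  def main(args: Array[String]): Unit = {
    // Same data as the SQL example: map(1, 0, 2, 2, 3, -1) with (k, v) -> k > v.
    val input = Map(1 -> 0, 2 -> 2, 3 -> -1)
    val kept = mapFilter(input)((k, v) => k > v)
    // The entry 2 -> 2 is dropped because 2 > 2 is false.
    assert(kept == Map(1 -> 0, 3 -> -1))
    println(kept)
  }
}
```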
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94367 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94367/testReport)** for PR 21986 at commit [`af79644`](https://github.com/apache/spark/commit/af79644cb4687b6acb9a10548f05aef980f1882a).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94371/testReport)** for PR 21986 at commit [`af79644`](https://github.com/apache/spark/commit/af79644cb4687b6acb9a10548f05aef980f1882a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/21986
retest this please
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test FAILed.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94363 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94363/testReport)** for PR 21986 at commit [`1823fb2`](https://github.com/apache/spark/commit/1823fb279b1e5ed7b55d6e27ede27982ce94d922).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `trait SimpleHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes `
* `trait ArrayBasedSimpleHigherOrderFunction extends SimpleHigherOrderFunction `
* `trait MapBasedSimpleHigherOrderFunction extends SimpleHigherOrderFunction `
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94367 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94367/testReport)** for PR 21986 at commit [`af79644`](https://github.com/apache/spark/commit/af79644cb4687b6acb9a10548f05aef980f1882a).
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94367/
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94291/
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21986#discussion_r207702606
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -123,7 +125,10 @@ trait HigherOrderFunction extends Expression {
   }
 }
-trait ArrayBasedHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
+/**
+ * Trait for functions having as input one argument and one function.
+ */
+trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes {
--- End diff --
I like this trait but I'm not sure whether we can say `"Unary"HigherOrderFunction` for this.
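To make the shape under discussion concrete: the trait groups functions that take one argument plus one function, and the related diff in this thread moves the null check out of `eval` into a `nullSafeEval` hook. A minimal plain-Scala sketch of that template pattern follows. The names `UnaryHofSketch` and `TransformSketch` are hypothetical stand-ins, not Spark APIs, and the real traits operate on Catalyst expressions and `InternalRow` rather than on plain values.

```scala
// Hypothetical, simplified stand-ins for the Catalyst traits; not Spark APIs.
trait UnaryHofSketch[In <: AnyRef, Out] {
  // Evaluates the single non-function argument; may return null.
  def evalInput(): In

  // Called only when the argument is non-null.
  protected def nullSafeEval(value: In): Out

  // Shared null handling, so concrete functions do not repeat the check.
  final def eval(): Any = {
    val v = evalInput()
    if (v == null) null else nullSafeEval(v)
  }
}

// An array-style "transform": applies f to every element of the argument.
final class TransformSketch(arg: Seq[Int], f: Int => Int)
    extends UnaryHofSketch[Seq[Int], Seq[Int]] {
  def evalInput(): Seq[Int] = arg
  protected def nullSafeEval(value: Seq[Int]): Seq[Int] = value.map(f)
}
```

With this shape, `new TransformSketch(Seq(1, 2, 3), _ + 1).eval()` yields `Seq(2, 3, 4)`, while a null argument short-circuits to null without calling `nullSafeEval`.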
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1762/
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94289 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94289/testReport)** for PR 21986 at commit [`b58a1de`](https://github.com/apache/spark/commit/b58a1dec715a26aa8bd53efa102342afff44a896).
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94170 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94170/testReport)** for PR 21986 at commit [`3f88e2a`](https://github.com/apache/spark/commit/3f88e2a927c22f4fc509b8ca96027ef381f7fe84).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `trait UnaryHigherOrderFunction extends HigherOrderFunction with ExpectsInputTypes `
* `trait ArrayBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
* `trait MapBasedUnaryHigherOrderFunction extends UnaryHigherOrderFunction `
* `case class MapFilter(`
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1791/
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1838/
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21986
**[Test build #94363 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94363/testReport)** for PR 21986 at commit [`1823fb2`](https://github.com/apache/spark/commit/1823fb279b1e5ed7b55d6e27ede27982ce94d922).
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test FAILed.
---
[GitHub] spark issue #21986: [SPARK-23937][SQL] Add map_filter SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21986
Merged build finished. Test PASSed.
---