Posted to reviews@spark.apache.org by codeatri <gi...@git.apache.org> on 2018/08/08 19:35:58 UTC
[GitHub] spark pull request #22045: [SPARK-23939][SQL] Add transform_values SQL funct...
GitHub user codeatri opened a pull request:
https://github.com/apache/spark/pull/22045
[SPARK-23939][SQL] Add transform_values SQL function
## What changes were proposed in this pull request?
This PR adds the `transform_values` function, which applies a given function to each entry of a map and transforms the values.
```sql
> SELECT transform_values(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> v + 1);
map(1->2, 2->3, 3->4)
> SELECT transform_values(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
map(1->2, 2->4, 3->6)
```
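The behaviour described above can be modelled on an ordinary Scala `Map`. This is only a sketch of the semantics, not the Catalyst implementation; the names `TransformValuesSketch` and `transformValues` are made up for illustration.

```scala
object TransformValuesSketch {
  // Hypothetical helper: keeps every key and replaces each value with f(key, value).
  def transformValues[K, V, R](m: Map[K, V])(f: (K, V) => R): Map[K, R] =
    m.map { case (k, v) => k -> f(k, v) }

  def main(args: Array[String]): Unit = {
    val input = Map(1 -> 1, 2 -> 2, 3 -> 3)
    // Mirrors (k, v) -> v + 1 from the SQL example.
    println(transformValues(input)((_, v) => v + 1))
    // Mirrors (k, v) -> k + v from the SQL example.
    println(transformValues(input)((k, v) => k + v))
  }
}
```

The lambda receives both the key and the value, but only the value is replaced, so the key set of the result always equals that of the input.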
## How was this patch tested?
New tests were added to:
`DataFrameFunctionsSuite`
`HigherOrderFunctionsSuite`
`SQLQueryTestSuite`
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/codeatri/spark SPARK-23940
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22045.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22045
----
commit 68392e31d86f26663fbb8e5badac82b356081f47
Author: codeatri <ne...@...>
Date: 2018-08-08T18:42:36Z
Added transform_values function
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/22045
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22045
**[Test build #94783 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94783/testReport)** for PR 22045 at commit [`b73106d`](https://github.com/apache/spark/commit/b73106d43000972ab9adae3d3b463a0dada2b9cc).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22045: [SPARK-23939][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Can one of the admins verify this patch?
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Merged build finished. Test FAILed.
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210469472
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,53 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+/**
+ * Returns a map that applies the function to each value of the map.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
--- End diff --
nit: indent
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94783/
Test FAILed.
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22045
**[Test build #94843 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94843/testReport)** for PR 22045 at commit [`56d08ef`](https://github.com/apache/spark/commit/56d08ef37531f8e25ae2c7fe3996cf7657384a80).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22045
**[Test build #94864 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94864/testReport)** for PR 22045 at commit [`3382e1a`](https://github.com/apache/spark/commit/3382e1a5396c8e5a94802d92a7106eacf627617c).
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/22045
Thanks! merging to master.
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210165373
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HigherOrderFunctionsSuite.scala ---
@@ -283,6 +289,61 @@ class HigherOrderFunctionsSuite extends SparkFunSuite with ExpressionEvalHelper
15)
}
+ test("TransformValues") {
+ val ai0 = Literal.create(
+ Map(1 -> 1, 2 -> 2, 3 -> 3),
+ MapType(IntegerType, IntegerType))
+ val ai1 = Literal.create(
+ Map(1 -> 1, 2 -> null, 3 -> 3),
+ MapType(IntegerType, IntegerType))
+ val ain = Literal.create(
+ Map.empty[Int, Int],
+ MapType(IntegerType, IntegerType))
--- End diff --
Can you add tests for `Literal.create(null, MapType(IntegerType, IntegerType))`?
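A null map and an empty map are different inputs, which is why each deserves its own test. The sketch below models a nullable map column as `Option[Map[K, V]]`, with `None` standing in for SQL NULL. That a null input yields a null result is an assumption here, inferred from `nullable = argument.nullable` in the patch; `NullMapSketch` is a hypothetical name.

```scala
object NullMapSketch {
  // None models a NULL map; Some(Map.empty) models an empty, non-null map.
  def transformValues[K, V, R](m: Option[Map[K, V]])(f: (K, V) => R): Option[Map[K, R]] =
    m.map(_.map { case (k, v) => k -> f(k, v) })

  def main(args: Array[String]): Unit = {
    // A NULL map stays NULL; the lambda is never invoked.
    println(transformValues(Option.empty[Map[Int, Int]])((k, v) => k + v))
    // An empty map stays an empty, non-null map.
    println(transformValues(Option(Map.empty[Int, Int]))((k, v) => k + v))
  }
}
```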
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22045
**[Test build #94827 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94827/testReport)** for PR 22045 at commit [`daf7935`](https://github.com/apache/spark/commit/daf793599a6da5c11dbc4a6bd6e5dea3e0d47afd).
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/22045
@codeatri Could you fix the conflicts please? Thanks!
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by mn-mikke <gi...@git.apache.org>.
Github user mn-mikke commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210561102
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,53 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+/**
+ * Returns a map that applies the function to each value of the map.
+ */
+@ExpressionDescription(
+ usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+ examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> v + 1);
+ map(array(1, 2, 3), array(2, 3, 4))
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
+ map(array(1, 2, 3), array(2, 4, 6))
+ """,
+ since = "2.4.0")
+case class TransformValues(
+ argument: Expression,
+ function: Expression)
+ extends MapBasedSimpleHigherOrderFunction with CodegenFallback {
+
+ override def nullable: Boolean = argument.nullable
+
+ @transient lazy val MapType(keyType, valueType, valueContainsNull) = argument.dataType
+
+ override def dataType: DataType = MapType(keyType, function.dataType, valueContainsNull)
--- End diff --
Shouldn't the ```dataType``` be defined as ```MapType(keyType, function.dataType, function.nullable)```?
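To illustrate the question: whether the result map can contain null values depends on what the lambda returns, not on whether the input map contained nulls. The sketch below uses plain Scala collections with `None` standing in for a SQL NULL value; `ValueNullabilitySketch` and its helpers are hypothetical names, not part of Spark.

```scala
object ValueNullabilitySketch {
  // None stands in for a SQL NULL value inside the map.
  def transformValues[K, V, R](m: Map[K, Option[V]])(
      f: (K, Option[V]) => Option[R]): Map[K, Option[R]] =
    m.map { case (k, v) => k -> f(k, v) }

  // True when at least one value is "null" (None in this model).
  def containsNull[K, R](m: Map[K, Option[R]]): Boolean =
    m.values.exists(_.isEmpty)

  def main(args: Array[String]): Unit = {
    val noNulls = Map[Int, Option[Int]](1 -> Some(1), 2 -> Some(2))
    val withNull = Map[Int, Option[Int]](1 -> Some(1), 2 -> None)

    // Input has no nulls, but the lambda always returns "null".
    println(containsNull(transformValues(noNulls)((_, _) => Option.empty[Int])))
    // Input has a null, but the lambda never returns one.
    println(containsNull(transformValues(withNull)((k, _) => Option(k))))
  }
}
```

In both cases the input's `valueContainsNull` flag would describe the result incorrectly, which is the point of deriving it from the function's nullability.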
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210164879
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,60 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+/**
+ * Returns a map that applies the function to each value of the map.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k, v) -> v + 1);
--- End diff --
nit: we need one more right parenthesis after the second `array(1, 2, 3)`?
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210470513
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,53 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+/**
+ * Returns a map that applies the function to each value of the map.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> v + 1);
+ map(array(1, 2, 3), array(2, 3, 4))
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
+ map(array(1, 2, 3), array(2, 4, 6))
+ """,
+since = "2.4.0")
+case class TransformValues(
+ argument: Expression,
+ function: Expression)
+ extends MapBasedSimpleHigherOrderFunction with CodegenFallback {
+
+ override def nullable: Boolean = argument.nullable
+
+ @transient lazy val MapType(keyType, valueType, valueContainsNull) = argument.dataType
+
+ override def dataType: DataType = MapType(keyType, function.dataType, valueContainsNull)
+
+ override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction)
+ : TransformValues = {
+ copy(function = f(function, (keyType, false) :: (valueType, valueContainsNull) :: Nil))
+ }
+
+ @transient lazy val LambdaFunction(
+ _, (keyVar: NamedLambdaVariable) :: (valueVar: NamedLambdaVariable) :: Nil, _) = function
--- End diff --
nit: indent
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210469510
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,53 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+/**
+ * Returns a map that applies the function to each value of the map.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> v + 1);
+ map(array(1, 2, 3), array(2, 3, 4))
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
+ map(array(1, 2, 3), array(2, 4, 6))
+ """,
+since = "2.4.0")
--- End diff --
ditto.
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210469494
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,53 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+/**
+ * Returns a map that applies the function to each value of the map.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
--- End diff --
ditto.
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by codeatri <gi...@git.apache.org>.
Github user codeatri commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210402976
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,60 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+/**
+ * Returns a map that applies the function to each value of the map.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k, v) -> v + 1);
--- End diff --
@ueshin Thanks for the review! Yes, I agree; I made the same mistakes in both PRs. I was waiting for the transform_keys PR to converge so that I could make the same changes here.
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210165448
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ---
@@ -2302,6 +2302,210 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSQLContext {
assert(ex5.getMessage.contains("function map_zip_with does not support ordering on type map"))
}
+ test("transform values function - test various primitive data types combinations") {
--- End diff --
We don't need so many cases here. We only need to verify that the API works end to end.
Evaluation checks of the function should be in `HigherOrderFunctionsSuite`.
---
[GitHub] spark issue #22045: [SPARK-23939][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Can one of the admins verify this patch?
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210471011
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ---
@@ -2302,6 +2302,177 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSQLContext {
assert(ex5.getMessage.contains("function map_zip_with does not support ordering on type map"))
}
+ test("transform values function - test primitive data types") {
+ val dfExample1 = Seq(
+ Map[Int, Int](1 -> 1, 9 -> 9, 8 -> 8, 7 -> 7)
+ ).toDF("i")
+
+ val dfExample2 = Seq(
+ Map[Boolean, String](false -> "abc", true -> "def")
+ ).toDF("x")
+
+ val dfExample3 = Seq(
+ Map[String, Int]("a" -> 1, "b" -> 2, "c" -> 3)
+ ).toDF("y")
+
+ val dfExample4 = Seq(
+ Map[Int, Double](1 -> 1.0, 2 -> 1.40, 3 -> 1.70)
+ ).toDF("z")
+
+ val dfExample5 = Seq(
+ Map[Int, Array[Int]](1 -> Array(1, 2))
+ ).toDF("c")
+
+ def testMapOfPrimitiveTypesCombination(): Unit = {
+ checkAnswer(dfExample1.selectExpr("transform_values(i, (k, v) -> k + v)"),
+ Seq(Row(Map(1 -> 2, 9 -> 18, 8 -> 16, 7 -> 14))))
+
+ checkAnswer(dfExample2.selectExpr(
+ "transform_values(x, (k, v) -> if(k, v, CAST(k AS String)))"),
+ Seq(Row(Map(false -> "false", true -> "def"))))
+
+ checkAnswer(dfExample2.selectExpr("transform_values(x, (k, v) -> NOT k AND v = 'abc')"),
+ Seq(Row(Map(false -> true, true -> false))))
+
+ checkAnswer(dfExample3.selectExpr("transform_values(y, (k, v) -> v * v)"),
+ Seq(Row(Map("a" -> 1, "b" -> 4, "c" -> 9))))
+
+ checkAnswer(dfExample3.selectExpr(
+ "transform_values(y, (k, v) -> k || ':' || CAST(v as String))"),
+ Seq(Row(Map("a" -> "a:1", "b" -> "b:2", "c" -> "c:3"))))
+
+ checkAnswer(
+ dfExample3.selectExpr("transform_values(y, (k, v) -> concat(k, cast(v as String)))"),
+ Seq(Row(Map("a" -> "a1", "b" -> "b2", "c" -> "c3"))))
+
+ checkAnswer(
+ dfExample4.selectExpr(
+ "transform_values(" +
+ "z,(k, v) -> map_from_arrays(ARRAY(1, 2, 3), " +
+ "ARRAY('one', 'two', 'three'))[k] || '_' || CAST(v AS String))"),
+ Seq(Row(Map(1 -> "one_1.0", 2 -> "two_1.4", 3 ->"three_1.7"))))
+
+ checkAnswer(
+ dfExample4.selectExpr("transform_values(z, (k, v) -> k-v)"),
+ Seq(Row(Map(1 -> 0.0, 2 -> 0.6000000000000001, 3 -> 1.3))))
+
+ checkAnswer(
+ dfExample5.selectExpr("transform_values(c, (k, v) -> k + cardinality(v))"),
+ Seq(Row(Map(1 -> 3))))
+ }
+
+ // Test with local relation, the Project will be evaluated without codegen
+ testMapOfPrimitiveTypesCombination()
+ dfExample1.cache()
+ dfExample2.cache()
+ dfExample3.cache()
+ dfExample4.cache()
+ dfExample5.cache()
+ // Test with cached relation, the Project will be evaluated with codegen
+ testMapOfPrimitiveTypesCombination()
+ }
+
+ test("transform values function - test empty") {
+ val dfExample1 = Seq(
+ Map.empty[Integer, Integer]
+ ).toDF("i")
+
+ val dfExample2 = Seq(
+ Map.empty[BigInt, String]
+ ).toDF("j")
+
+ def testEmpty(): Unit = {
+ checkAnswer(dfExample1.selectExpr("transform_values(i, (k, v) -> NULL)"),
+ Seq(Row(Map.empty[Integer, Integer])))
+
+ checkAnswer(dfExample1.selectExpr("transform_values(i, (k, v) -> k)"),
+ Seq(Row(Map.empty[Integer, Integer])))
+
+ checkAnswer(dfExample1.selectExpr("transform_values(i, (k, v) -> v)"),
+ Seq(Row(Map.empty[Integer, Integer])))
+
+ checkAnswer(dfExample1.selectExpr("transform_values(i, (k, v) -> 0)"),
+ Seq(Row(Map.empty[Integer, Integer])))
+
+ checkAnswer(dfExample1.selectExpr("transform_values(i, (k, v) -> 'value')"),
+ Seq(Row(Map.empty[Integer, String])))
+
+ checkAnswer(dfExample1.selectExpr("transform_values(i, (k, v) -> true)"),
+ Seq(Row(Map.empty[Integer, Boolean])))
+
+ checkAnswer(dfExample2.selectExpr("transform_values(j, (k, v) -> k + cast(v as BIGINT))"),
+ Seq(Row(Map.empty[BigInt, BigInt])))
+ }
+
+ testEmpty()
+ dfExample1.cache()
+ dfExample2.cache()
+ testEmpty()
+ }
+
+ test("transform values function - test null values") {
+ val dfExample1 = Seq(
+ Map[Int, Integer](1 -> 1, 2 -> 2, 3 -> 3, 4 -> 4)
+ ).toDF("a")
+
+ val dfExample2 = Seq(
+ Map[Int, String](1 -> "a", 2 -> "b", 3 -> null)
+ ).toDF("b")
+
+ def testNullValue(): Unit = {
+ checkAnswer(dfExample1.selectExpr("transform_values(a, (k, v) -> null)"),
+ Seq(Row(Map[Int, Integer](1 -> null, 2 -> null, 3 -> null, 4 -> null))))
+
+ checkAnswer(dfExample2.selectExpr(
+ "transform_values(b, (k, v) -> IF(v IS NULL, k + 1, k + 2))"),
+ Seq(Row(Map(1 -> 3, 2 -> 4, 3 -> 4))))
+ }
+
+ testNullValue()
+ dfExample1.cache()
+ dfExample2.cache()
+ testNullValue()
+ }
+
+ test("transform values function - test invalid functions") {
+ val dfExample1 = Seq(
+ Map[Int, Int](1 -> 1, 9 -> 9, 8 -> 8, 7 -> 7)
+ ).toDF("i")
+
+ val dfExample2 = Seq(
+ Map[String, String]("a" -> "b")
+ ).toDF("j")
+
+ val dfExample3 = Seq(
+ Seq(1, 2, 3, 4)
+ ).toDF("x")
+
+ def testInvalidLambdaFunctions(): Unit = {
+
+ val ex1 = intercept[AnalysisException] {
+ dfExample1.selectExpr("transform_values(i, k -> k )")
--- End diff --
nit: remove an extra space after `k -> k`.
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by mn-mikke <gi...@git.apache.org>.
Github user mn-mikke commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r208746631
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -442,3 +442,61 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+
+/**
+ * Transform Values for every entry of the map by applying transform_values function.
+ * Returns map wth transformed values
--- End diff --
typos: Transforms values; with
Could you perhaps think of a better comment?
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210164955
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,60 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+/**
+ * Returns a map that applies the function to each value of the map.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k, v) -> v + 1);
+ map(array(1, 2, 3), array(2, 3, 4))
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k, v) -> k + v);
+ map(array(1, 2, 3), array(2, 4, 6))
+ """,
+since = "2.4.0")
+case class TransformValues(
+ argument: Expression,
+ function: Expression)
+ extends MapBasedSimpleHigherOrderFunction with CodegenFallback {
+
+ override def nullable: Boolean = argument.nullable
+
+ override def dataType: DataType = {
+ val map = argument.dataType.asInstanceOf[MapType]
+ MapType(map.keyType, function.dataType, function.nullable)
+ }
+
+ @transient val MapType(keyType, valueType, valueContainsNull) = argument.dataType
--- End diff --
`lazy val`?
Could you add a test for when the argument is not a map to the invalid cases in `DataFrameFunctionsSuite`?
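One plausible motivation for the `lazy val` suggestion (an assumption; the comment does not spell it out) is that a pattern-bound `val` is matched while the constructor runs, so a non-map argument would throw a `MatchError` as soon as the expression is built. The sketch below shows the difference with simplified stand-in types; none of these names exist in Spark.

```scala
object LazyValSketch {
  // Simplified stand-ins for Catalyst data types.
  sealed trait DataTypeModel
  final case class MapTypeModel(keyType: String, valueType: String) extends DataTypeModel
  case object IntTypeModel extends DataTypeModel

  // Eager: the pattern is matched during construction.
  class EagerExpr(dataType: DataTypeModel) {
    val MapTypeModel(keyType, valueType) = dataType
  }

  // Lazy: the pattern is matched only on first access to keyType or valueType.
  class LazyExpr(dataType: DataTypeModel) {
    lazy val MapTypeModel(keyType, valueType) = dataType
  }

  def main(args: Array[String]): Unit = {
    // Construction with a non-map type fails for the eager variant only.
    println(scala.util.Try(new EagerExpr(IntTypeModel)).isFailure)
    println(scala.util.Try(new LazyExpr(IntTypeModel)).isSuccess)
  }
}
```

With the lazy variant, the object can be constructed and inspected first, leaving room for a type check to report a proper error for the non-map case before the fields are ever read.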
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/22045
ok to test.
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210165194
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HigherOrderFunctionsSuite.scala ---
@@ -95,6 +95,12 @@ class HigherOrderFunctionsSuite extends SparkFunSuite with ExpressionEvalHelper
aggregate(expr, zero, merge, identity)
}
+ def transformValues(expr: Expression, f: (Expression, Expression) => Expression): Expression = {
+ val valueType = expr.dataType.asInstanceOf[MapType].valueType
+ val keyType = expr.dataType.asInstanceOf[MapType].keyType
+ TransformValues(expr, createLambda(keyType, false, valueType, true, f))
--- End diff --
We should use `valueContainsNull` instead of `true`?
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210165225
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HigherOrderFunctionsSuite.scala ---
@@ -283,6 +289,61 @@ class HigherOrderFunctionsSuite extends SparkFunSuite with ExpressionEvalHelper
15)
}
+ test("TransformValues") {
+ val ai0 = Literal.create(
+ Map(1 -> 1, 2 -> 2, 3 -> 3),
+ MapType(IntegerType, IntegerType))
--- End diff --
Can you add `valueContainsNull` explicitly?
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by mn-mikke <gi...@git.apache.org>.
Github user mn-mikke commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r208751629
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -442,3 +442,61 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+
+/**
+ * Transform Values for every entry of the map by applying transform_values function.
+ * Returns map wth transformed values
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k,v) -> k + 1);
+ map(array(1, 2, 3), array(2, 3, 4))
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k, v) -> k + v);
+ map(array(1, 2, 3), array(2, 4, 6))
+ """,
+since = "2.4.0")
+case class TransformValues(
+ input: Expression,
+ function: Expression)
+ extends MapBasedSimpleHigherOrderFunction with CodegenFallback {
+
+ override def nullable: Boolean = input.nullable
+
+ override def dataType: DataType = {
+ val map = input.dataType.asInstanceOf[MapType]
+ MapType(map.keyType, function.dataType, map.valueContainsNull)
+ }
+
+ override def inputTypes: Seq[AbstractDataType] = Seq(MapType, expectingFunctionType)
+
+ @transient val (keyType, valueType, valueContainsNull) =
+ HigherOrderFunction.mapKeyValueArgumentType(input.dataType)
+
+ override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction):
--- End diff --
nit: formatting
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by mn-mikke <gi...@git.apache.org>.
Github user mn-mikke commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r208747953
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -442,3 +442,61 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+
+/**
+ * Transform Values for every entry of the map by applying transform_values function.
+ * Returns map wth transformed values
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k,v) -> k + 1);
--- End diff --
nit: ```(k, v)```, and maybe I would use ```v + 1``` instead of ```k + 1```.
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22045
**[Test build #94864 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94864/testReport)** for PR 22045 at commit [`3382e1a`](https://github.com/apache/spark/commit/3382e1a5396c8e5a94802d92a7106eacf627617c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22045
**[Test build #94827 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94827/testReport)** for PR 22045 at commit [`daf7935`](https://github.com/apache/spark/commit/daf793599a6da5c11dbc4a6bd6e5dea3e0d47afd).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210165543
--- Diff: sql/core/src/test/resources/sql-tests/inputs/higher-order-functions.sql ---
@@ -51,3 +51,17 @@ select exists(ys, y -> y > 30) as v from nested;
-- Check for element existence in a null array
select exists(cast(null as array<int>), y -> y > 30) as v;
+
+create or replace temporary view nested as values
+ (1, map(1,1,2,2,3,3)),
+ (2, map(4,4,5,5,6,6))
--- End diff --
nit:
```
(1, map(1, 1, 2, 2, 3, 3)),
(2, map(4, 4, 5, 5, 6, 6))
```
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22045
**[Test build #94843 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94843/testReport)** for PR 22045 at commit [`56d08ef`](https://github.com/apache/spark/commit/56d08ef37531f8e25ae2c7fe3996cf7657384a80).
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94843/
Test FAILed.
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Can one of the admins verify this patch?
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by mn-mikke <gi...@git.apache.org>.
Github user mn-mikke commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r208750446
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -442,3 +442,61 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+
+/**
+ * Transform Values for every entry of the map by applying transform_values function.
+ * Returns map wth transformed values
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k,v) -> k + 1);
+ map(array(1, 2, 3), array(2, 3, 4))
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k, v) -> k + v);
+ map(array(1, 2, 3), array(2, 4, 6))
+ """,
+since = "2.4.0")
+case class TransformValues(
+ input: Expression,
+ function: Expression)
+ extends MapBasedSimpleHigherOrderFunction with CodegenFallback {
+
+ override def nullable: Boolean = input.nullable
+
+ override def dataType: DataType = {
+ val map = input.dataType.asInstanceOf[MapType]
+ MapType(map.keyType, function.dataType, map.valueContainsNull)
+ }
+
+ override def inputTypes: Seq[AbstractDataType] = Seq(MapType, expectingFunctionType)
--- End diff --
This is already specified by ```MapBasedSimpleHigherOrderFunction```.
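A simplified sketch of why the override is redundant, using hypothetical stand-in types rather than the real Catalyst classes: when the parent trait already defines `inputTypes`, the concrete expression inherits it unchanged.

```scala
// Hypothetical, simplified stand-ins; the real Catalyst traits carry more members.
sealed trait AbstractType
case object MapKind extends AbstractType
case object FunctionKind extends AbstractType

trait MapBasedHigherOrder {
  // Declared once in the parent trait...
  def inputTypes: Seq[AbstractType] = Seq(MapKind, FunctionKind)
}

// ...so the concrete expression does not need to repeat it.
case class TransformValuesSketch(name: String) extends MapBasedHigherOrder

val inherited = TransformValuesSketch("transform_values").inputTypes
```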
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22045
**[Test build #94783 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94783/testReport)** for PR 22045 at commit [`b73106d`](https://github.com/apache/spark/commit/b73106d43000972ab9adae3d3b463a0dada2b9cc).
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Merged build finished. Test FAILed.
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210164976
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,60 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+/**
+ * Returns a map that applies the function to each value of the map.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k, v) -> v + 1);
+ map(array(1, 2, 3), array(2, 3, 4))
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k, v) -> k + v);
+ map(array(1, 2, 3), array(2, 4, 6))
+ """,
+since = "2.4.0")
+case class TransformValues(
+ argument: Expression,
+ function: Expression)
+ extends MapBasedSimpleHigherOrderFunction with CodegenFallback {
+
+ override def nullable: Boolean = argument.nullable
+
+ override def dataType: DataType = {
+ val map = argument.dataType.asInstanceOf[MapType]
+ MapType(map.keyType, function.dataType, function.nullable)
--- End diff --
Can we use `keyType` from the val defined below?
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Merged build finished. Test FAILed.
---
[GitHub] spark issue #22045: [SPARK-23939][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Can one of the admins verify this patch?
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94827/
Test FAILed.
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22045
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94864/
Test PASSed.
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by mn-mikke <gi...@git.apache.org>.
Github user mn-mikke commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r208749197
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -442,3 +442,61 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+
+/**
+ * Transform Values for every entry of the map by applying transform_values function.
+ * Returns map wth transformed values
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k,v) -> k + 1);
+ map(array(1, 2, 3), array(2, 3, 4))
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k, v) -> k + v);
+ map(array(1, 2, 3), array(2, 4, 6))
+ """,
+since = "2.4.0")
+case class TransformValues(
+ input: Expression,
+ function: Expression)
+ extends MapBasedSimpleHigherOrderFunction with CodegenFallback {
+
+ override def nullable: Boolean = input.nullable
+
+ override def dataType: DataType = {
+ val map = input.dataType.asInstanceOf[MapType]
+ MapType(map.keyType, function.dataType, map.valueContainsNull)
--- End diff --
```map.valueContainsNull``` -> ```function.nullable```?
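The reasoning can be sketched without Spark, using Scala's `Option` as a stand-in for SQL nullability (an assumption made purely for illustration): whether the output values can be null depends on what the lambda returns, not on whether the input values could be null.

```scala
// Input map with no null values (valueContainsNull would be false).
val input: Map[Int, Int] = Map(1 -> 1, 2 -> 2, 3 -> 3)

// A lambda that may yield a null-like result (None) even though no input value is null.
val nullableFn: (Int, Int) => Option[Int] =
  (k, v) => if (v % 2 == 0) None else Some(k + v)

val output: Map[Int, Option[Int]] =
  input.map { case (k, v) => k -> nullableFn(k, v) }

// True here: the result holds a null-like value although the input held none.
val outputContainsNull = output.values.exists(_.isEmpty)
```

Deriving the result's value nullability from the input map would claim "no nulls" here, which the output contradicts.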
---
[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22045#discussion_r210165102
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,60 @@ case class ArrayAggregate(
override def prettyName: String = "aggregate"
}
+/**
+ * Returns a map that applies the function to each value of the map.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(expr, func) - Transforms values in the map using the function.",
+examples = """
+ Examples:
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k, v) -> v + 1);
+ map(array(1, 2, 3), array(2, 3, 4))
+ > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3), (k, v) -> k + v);
+ map(array(1, 2, 3), array(2, 4, 6))
+ """,
+since = "2.4.0")
+case class TransformValues(
+ argument: Expression,
+ function: Expression)
+ extends MapBasedSimpleHigherOrderFunction with CodegenFallback {
+
+ override def nullable: Boolean = argument.nullable
+
+ override def dataType: DataType = {
+ val map = argument.dataType.asInstanceOf[MapType]
+ MapType(map.keyType, function.dataType, function.nullable)
+ }
+
+ @transient val MapType(keyType, valueType, valueContainsNull) = argument.dataType
+
+ override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction)
+ : TransformValues = {
+ copy(function = f(function, (keyType, false) :: (valueType, valueContainsNull) :: Nil))
+ }
+
+ @transient lazy val (keyVar, valueVar) = {
+ val LambdaFunction(
+ _, (keyVar: NamedLambdaVariable) :: (valueVar: NamedLambdaVariable) :: Nil, _) = function
+ (keyVar, valueVar)
+ }
--- End diff --
nit: how about:
```scala
@transient lazy val LambdaFunction(_,
(keyVar: NamedLambdaVariable) :: (valueVar: NamedLambdaVariable) :: Nil, _) = function
```
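For reference, a self-contained sketch of the same pattern-definition idiom, written against Scala 2 with hypothetical stand-in classes (`Expr`, `NamedVar`, `Lambda`) instead of the Catalyst ones. It shows that a single `lazy val` pattern can bind both variables without the intermediate tuple.

```scala
// Hypothetical stand-ins for the Catalyst expression classes.
sealed trait Expr
case class NamedVar(name: String) extends Expr
case class Lambda(body: Expr, arguments: List[Expr], hidden: Boolean = false) extends Expr

val function: Expr = Lambda(NamedVar("v"), List(NamedVar("k"), NamedVar("v")))

// Pattern definition: binds both variables at once, evaluated lazily on first access.
// Throws scala.MatchError if `function` is not a Lambda with exactly two NamedVar arguments.
lazy val Lambda(_, (keyVar: NamedVar) :: (valueVar: NamedVar) :: Nil, _) = function
```

The trade-off is the usual one for refutable patterns: the code is shorter, but a shape mismatch surfaces as a runtime `MatchError` at first access rather than at compile time.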
---