Posted to reviews@spark.apache.org by hvanhovell <gi...@git.apache.org> on 2016/05/03 19:02:59 UTC

[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

GitHub user hvanhovell opened a pull request:

    https://github.com/apache/spark/pull/12874

    [SPARK-10605][SQL] Create native collect_list/collect_set aggregates

    ## What changes were proposed in this pull request?
    We currently use the Hive implementations for the collect_list/collect_set aggregate functions. This has a few major drawbacks: it relies on HiveUDAF (which has quite a bit of overhead), and it lacks support for struct datatypes. This PR adds native implementations of these functions to Spark.
    
    The size of the collected list/set varies, which means we cannot use the fast Tungsten aggregation path and have to fall back to the slower sort-based path. Another big issue with these operators is that when the collected list/set grows too large, we can start experiencing long GC pauses and OOMEs.
    
    The `collect*` aggregates implemented in this PR rely on the sort-based aggregate path for correctness. They maintain their own internal buffer, which holds the rows for one group at a time. The sort-based aggregation path is triggered by disabling `partialAggregation` for the aggregates (which is kinda funny); this technique is also employed in `org.apache.spark.sql.hive.HiveUDAFFunction`.
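
    To make the per-group buffering concrete, here is a standalone sketch in plain Scala (not the PR's code, and ignoring Spark's aggregate interfaces) of why a single mutable buffer suffices once the input is sorted by the grouping key:
    ```scala
    import scala.collection.mutable

    // With input sorted by the group key, all rows of a group arrive consecutively,
    // so one buffer can be filled, emitted at the group boundary, and cleared.
    def collectListPerGroup[K, V](sortedRows: Seq[(K, V)]): Seq[(K, Seq[V])] = {
      val out = mutable.ArrayBuffer.empty[(K, Seq[V])]
      val buffer = mutable.ArrayBuffer.empty[V]         // the aggregate's internal buffer
      var current: Option[K] = None
      for ((k, v) <- sortedRows) {
        if (current.exists(_ != k)) {                   // group boundary: emit and reset
          out += current.get -> buffer.toList
          buffer.clear()
        }
        current = Some(k)
        buffer += v                                     // corresponds to update()
      }
      current.foreach(k => out += k -> buffer.toList)   // corresponds to eval() for the last group
      out.toList
    }

    // collectListPerGroup(Seq(1 -> "a", 1 -> "b", 2 -> "c"))
    // == List((1, List("a", "b")), (2, List("c")))
    ```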
    
    I have done some performance testing:
    ```scala
    import org.apache.spark.sql.{Dataset, Row}
    
    sql("create function collect_list2 as 'org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCollectList'")
    
    val df = range(0, 10000000).select($"id", (rand(213123L) * 100000).cast("int").as("grp"))
    df.select(countDistinct($"grp")).show
    
    def benchmark(name: String, plan: Dataset[Row], maxItr: Int = 5): Unit = {
       // Do not measure planning.
       plan.queryExecution.executedPlan
    
       // Execute the plan a number of times and average the result.
       val start = System.nanoTime
       var i = 0
       while (i < maxItr) {
         plan.rdd.foreach(_ => ())
         i += 1
       }
       val time = (System.nanoTime - start) / (maxItr * 1000000L)
       println(s"[$name] $maxItr iterations completed in an average time of $time ms.")
    }
    
    val plan1 = df.groupBy($"grp").agg(collect_list($"id"))
    val plan2 = df.groupBy($"grp").agg(callUDF("collect_list2", $"id"))
    
    benchmark("Spark collect_list", plan1)
    ...
    > [Spark collect_list] 5 iterations completed in an average time of 3371 ms.
    
    benchmark("Hive collect_list", plan2)
    ...
    > [Hive collect_list] 5 iterations completed in an average time of 9109 ms.
    ```
    Performance is improved by a factor of 2-3.
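
    For end users nothing changes in how the functions are called; a quick usage sketch (same `df` and shell session as the benchmark above, and assuming the native implementations are resolved through the regular function registry as this PR intends):
    ```scala
    import org.apache.spark.sql.functions.{collect_list, collect_set}

    // No Hive UDAF registration is needed anymore; these resolve to the native aggregates.
    val lists = df.groupBy($"grp").agg(collect_list($"id").as("ids"))
    val sets  = df.groupBy($"grp").agg(collect_set($"id").as("distinct_ids"))

    lists.show(5)
    sets.show(5)
    ```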
    
    ## How was this patch tested?
    Added tests to `DataFrameAggregateSuite`.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hvanhovell/spark implode

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/12874.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12874
    
----
commit 9d8aeed7b18d58258097526f85f07c37a10c28d8
Author: Herman van Hovell <hv...@questtec.nl>
Date:   2016-02-01T15:15:16Z

    Add native collect_set/collect_list.

commit 8247d8eb6416b7b5d4eb61fa585c595d87930d63
Author: Herman van Hovell <hv...@questtec.nl>
Date:   2016-02-01T15:43:24Z

    Add test for struct types.

commit 326a213dc014403aef1033e9d39206f5a873b7a6
Author: Herman van Hovell <hv...@questtec.nl>
Date:   2016-02-01T18:14:10Z

    Add pretty names for SQL generation.

commit 9d3205d211405a598a4f24b2b5fef62cf0538f6e
Author: Herman van Hovell <hv...@questtec.nl>
Date:   2016-05-03T16:32:37Z

    Merge remote-tracking branch 'apache-github/master' into implode
    
    # Conflicts:
    #	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
    #	sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala

commit da6a13151e01d1789a18090a734ad512c56aecc5
Author: Herman van Hovell <hv...@questtec.nl>
Date:   2016-05-03T16:34:01Z

    Merge remote-tracking branch 'apache-github/master' into implode

commit 8a4e7827d6b1f4c150ec29c35185e63c974762dd
Author: Herman van Hovell <hv...@questtec.nl>
Date:   2016-05-03T18:32:22Z

    Merge remote-tracking branch 'apache-github/master' into implode
    
    # Conflicts:
    #	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
    #	sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala

commit 597f76bb350126bb3360b692720426a6d41c3c18
Author: Herman van Hovell <hv...@questtec.nl>
Date:   2016-05-03T18:48:39Z

    Remove hardcoded Hive references.

----




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-216653429
  
    **[Test build #57657 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57657/consoleFull)** for PR 12874 at commit [`597f76b`](https://github.com/apache/spark/commit/597f76bb350126bb3360b692720426a6d41c3c18).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-216658335
  
    **[Test build #57665 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57665/consoleFull)** for PR 12874 at commit [`d9dedff`](https://github.com/apache/spark/commit/d9dedffc6d2a90a140264e89d971486c5a850dda).




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-216653630
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57657/
    Test FAILed.




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by maver1ck <gi...@git.apache.org>.
Github user maver1ck commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-218849616
  
    There is one more thing. We observed that collect_list doesn't work in Spark 2.0:
    https://issues.apache.org/jira/browse/SPARK-15293




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12874#discussion_r63096674
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala ---
    @@ -0,0 +1,119 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions.aggregate
    +
    +import scala.collection.generic.Growable
    +import scala.collection.mutable
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.util.GenericArrayData
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.sql.types._
    +
    +/**
    + * The Collect aggregate function collects all seen expression values into a list of values.
    + *
    + * The operator is bound to the slower sort based aggregation path because the number of
    + * elements (and their memory usage) can not be determined in advance. This also means that the
    + * collected elements are stored on heap, and that too many elements can cause GC pauses and
    + * eventually Out of Memory Errors.
    + */
    +abstract class Collect extends ImperativeAggregate {
    +
    +  val child: Expression
    +
    +  override def children: Seq[Expression] = child :: Nil
    +
    +  override def nullable: Boolean = true
    +
    +  override def dataType: DataType = ArrayType(child.dataType)
    +
    +  override def inputTypes: Seq[AbstractDataType] = Seq(AnyDataType)
    +
    +  override def supportsPartial: Boolean = false
    +
    +  override def aggBufferAttributes: Seq[AttributeReference] = Nil
    +
    +  override def aggBufferSchema: StructType = StructType.fromAttributes(aggBufferAttributes)
    +
    +  override def inputAggBufferAttributes: Seq[AttributeReference] = Nil
    +
    +  protected[this] val buffer: Growable[Any] with Iterable[Any]
    +
    +  override def initialize(b: MutableRow): Unit = {
    +    buffer.clear()
    +  }
    +
    +  override def update(b: MutableRow, input: InternalRow): Unit = {
    +    buffer += child.eval(input)
    +  }
    +
    +  override def merge(buffer: MutableRow, input: InternalRow): Unit = {
    +    sys.error("Collect cannot be used in partial aggregations.")
    +  }
    +
    +  override def eval(input: InternalRow): Any = {
    +    new GenericArrayData(buffer.toArray)
    --- End diff --
    
    oh, I see. We have an internal buffer and only generate this array data when we call eval.




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12874#discussion_r63096013
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
    @@ -219,20 +219,4 @@ private[sql] class HiveSessionCatalog(
             }
         }
       }
    -
    -  // Pre-load a few commonly used Hive built-in functions.
    -  HiveSessionCatalog.preloadedHiveBuiltinFunctions.foreach {
    -    case (functionName, clazz) =>
    -      val builder = makeFunctionBuilder(functionName, clazz)
    -      val info = new ExpressionInfo(clazz.getCanonicalName, functionName)
    -      createTempFunction(functionName, info, builder, ignoreIfExists = false)
    -  }
    -}
    -
    -private[sql] object HiveSessionCatalog {
    -  // This is the list of Hive's built-in functions that are commonly used and we want to
    -  // pre-load when we create the FunctionRegistry.
    -  val preloadedHiveBuiltinFunctions =
    -    ("collect_set", classOf[org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCollectSet]) ::
    -    ("collect_list", classOf[org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCollectList]) :: Nil
    --- End diff --
    
    nice




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/12874




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-218883020
  
    I'm going to merge this in master/2.0. Thanks!





[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-218884957
  
    LGTM




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by maver1ck <gi...@git.apache.org>.
Github user maver1ck commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-218721890
  
    Hi,
    What about this patch?




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-216653615
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-216681334
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57665/
    Test PASSed.




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-216681332
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-216632661
  
    cc @mccheah I rebased and simplified my PR; could you take a look? This is an alternative to https://github.com/apache/spark/pull/11688.




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-218838929
  
    It's probably a good idea to have a native implementation so we don't need to fall back to Hive, and then improve on that in the future to have a sort-based approach.





[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-218726402
  
    I am not sure if there is enough support to add this to Spark. The thing is that this is a potential source of OOMEs and that it banks on the specific behavior of the `SortBasedAggregate` code path; this is, however, the same for Hive's `collect*` functions, and this PR **is** an improvement over its Hive counterparts. Do you have a very pressing use case for this?
    
    A more fruitful approach would be to implement a dedicated operator for this. This would eliminate the reliance on `SortBasedAggregate`, but it would still be capable of causing OOMEs (we could spill the elements to disk, but the resulting row still has to fit into main memory).
    
    @rxin what is your take on this?




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-216633371
  
    **[Test build #57657 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57657/consoleFull)** for PR 12874 at commit [`597f76b`](https://github.com/apache/spark/commit/597f76bb350126bb3360b692720426a6d41c3c18).




[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12874#issuecomment-216681116
  
    **[Test build #57665 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57665/consoleFull)** for PR 12874 at commit [`d9dedff`](https://github.com/apache/spark/commit/d9dedffc6d2a90a140264e89d971486c5a850dda).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.

