Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2015/11/10 16:33:55 UTC

[GitHub] spark pull request: [SPARK-11578][SQL][follow-up][WIP] complete th...

GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/9599

    [SPARK-11578][SQL][follow-up][WIP] complete the user facing api for typed aggregation

    Currently the user-facing API for typed aggregation has some limitations:
    
    * the customized typed aggregation must be the first in the aggregation list
    * the customized typed aggregation can only use long as its buffer type
    * the customized typed aggregation can only use a flat type as its result type
    
    This PR tries to remove these limitations.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark agg

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9599.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9599
    
----
commit f247f243efc766dc6a70ed8ee323e4eb2716ec87
Author: Wenchen Fan <we...@databricks.com>
Date:   2015-11-10T15:02:44Z

    complete the typed aggregate

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11578][SQL][follow-up] complete the use...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9599




[GitHub] spark pull request: [SPARK-11578][SQL][follow-up] complete the use...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/9599#issuecomment-155455267
  
    cc @marmbrus 




[GitHub] spark pull request: [SPARK-11578][SQL][follow-up] complete the use...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/9599#issuecomment-155535483
  
    Thanks, I'm going to merge this and keep iterating.




[GitHub] spark pull request: [SPARK-11578][SQL][follow-up] complete the use...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9599#issuecomment-155457144
  
    **[Test build #45530 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45530/consoleFull)** for PR 9599 at commit [`f247f24`](https://github.com/apache/spark/commit/f247f243efc766dc6a70ed8ee323e4eb2716ec87).




[GitHub] spark pull request: [SPARK-11578][SQL][follow-up] complete the use...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9599#issuecomment-155502239
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-11578][SQL][follow-up] complete the use...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9599#issuecomment-155501552
  
    **[Test build #45530 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45530/consoleFull)** for PR 9599 at commit [`f247f24`](https://github.com/apache/spark/commit/f247f243efc766dc6a70ed8ee323e4eb2716ec87).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11578][SQL][follow-up] complete the use...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9599#discussion_r44444404
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/Aggregator.scala ---
    @@ -58,6 +58,11 @@ abstract class Aggregator[-A, B, C] {
       def reduce(b: B, a: A): B
     
       /**
    +   * Merge two intermediate values
    +   */
    +  def merge(b1: B, b2: B): B
    --- End diff --
    
    @mateiz JFYI, I missed adding this in the first iteration, but I think we need it if we are going to do multilevel aggregation.
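
A minimal sketch of why `merge` matters for multilevel aggregation, assuming a simplified, Spark-free stand-in for the class in the diff. `SimpleAggregator`, `Average`, `TwoLevel`, `zero`, and `finish` are hypothetical names chosen for illustration; only `reduce` and `merge` come from the diff above. Each partition is reduced on its own, and the partial buffers can then only be combined through `merge`:

```scala
// Hypothetical stand-in for the Aggregator shape shown in the diff.
// `zero` and `finish` are assumed names, not taken from the patch.
trait SimpleAggregator[-A, B, C] {
  def zero: B                 // empty intermediate value
  def reduce(b: B, a: A): B   // fold one input into a buffer
  def merge(b1: B, b2: B): B  // combine two intermediate values
  def finish(b: B): C         // turn the final buffer into a result
}

// Average of Ints. The buffer is a (sum, count) pair rather than a single Long.
object Average extends SimpleAggregator[Int, (Long, Long), Double] {
  def zero: (Long, Long) = (0L, 0L)
  def reduce(b: (Long, Long), a: Int): (Long, Long) = (b._1 + a, b._2 + 1L)
  def merge(b1: (Long, Long), b2: (Long, Long)): (Long, Long) =
    (b1._1 + b2._1, b1._2 + b2._2)
  def finish(b: (Long, Long)): Double =
    if (b._2 == 0L) 0.0 else b._1.toDouble / b._2
}

object TwoLevel {
  // Level 1: reduce every partition independently.
  // Level 2: merge the partial buffers, then produce the result.
  def run[A, B, C](agg: SimpleAggregator[A, B, C], partitions: Seq[Seq[A]]): C = {
    val partials = partitions.map(_.foldLeft(agg.zero)(agg.reduce))
    agg.finish(partials.foldLeft(agg.zero)(agg.merge))
  }
}
```

Without `merge`, the second level would have nothing to work with except `reduce`, which accepts an input of type `A` and not another buffer of type `B`.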




[GitHub] spark pull request: [SPARK-11578][SQL][follow-up] complete the use...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9599#issuecomment-155502248
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45530/
    Test PASSed.




[GitHub] spark pull request: [SPARK-11578][SQL][follow-up] complete the use...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9599#issuecomment-155456116
  
    Merged build started.




[GitHub] spark pull request: [SPARK-11578][SQL][follow-up] complete the use...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9599#discussion_r44444756
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TypedAggregateExpression.scala ---
    @@ -93,32 +96,51 @@ case class TypedAggregateExpression(
           case a: AttributeReference => inputMapping(a)
         })
     
    -  // TODO: this probably only works when we are in the first column.
       val bAttributes = bEncoder.schema.toAttributes
       lazy val boundB = bEncoder.resolve(bAttributes).bind(bAttributes)
     
    +  private def updateBuffer(buffer: MutableRow, value: InternalRow): Unit = {
    +    // todo: need a more neat way to assign the value.
    +    var i = 0
    +    while (i < aggBufferAttributes.length) {
    +      aggBufferSchema(i).dataType match {
    +        case IntegerType => buffer.setInt(mutableAggBufferOffset + i, value.getInt(i))
    +        case LongType => buffer.setLong(mutableAggBufferOffset + i, value.getLong(i))
    --- End diff --
    
    Maybe something like `BufferSetterGetterUtils`?
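
One way to read this suggestion, sketched under assumptions: resolve a copier per column once from the schema, so the per-row loop no longer matches on the data type. Everything below (`ColType`, `Buffer`, `BufferCopy`) is a hypothetical, Spark-free stand-in and not the actual `BufferSetterGetterUtils` API:

```scala
// Hypothetical column types standing in for IntegerType / LongType.
sealed trait ColType
case object IntCol extends ColType
case object LongCol extends ColType

// Hypothetical row/buffer with typed accessors, backed by Long slots.
final class Buffer(size: Int) {
  private val slots = new Array[Long](size)
  def setInt(i: Int, v: Int): Unit = { slots(i) = v.toLong }
  def setLong(i: Int, v: Long): Unit = { slots(i) = v }
  def getInt(i: Int): Int = slots(i).toInt
  def getLong(i: Int): Long = slots(i)
}

object BufferCopy {
  // Build one copier per column up front; the type match runs once per column.
  def createCopiers(schema: Seq[ColType], offset: Int): Array[(Buffer, Buffer) => Unit] =
    schema.zipWithIndex.map { case (tpe, i) =>
      val copier: (Buffer, Buffer) => Unit = tpe match {
        case IntCol  => (buf, row) => buf.setInt(offset + i, row.getInt(i))
        case LongCol => (buf, row) => buf.setLong(offset + i, row.getLong(i))
      }
      copier
    }.toArray

  // Per-row update: apply the precomputed copiers, with no type dispatch here.
  def update(copiers: Array[(Buffer, Buffer) => Unit], buf: Buffer, row: Buffer): Unit = {
    var i = 0
    while (i < copiers.length) {
      copiers(i)(buf, row)
      i += 1
    }
  }
}
```

Here `offset` plays the role of `mutableAggBufferOffset` in the diff: the value row is indexed from zero, while the shared buffer is written starting at the offset.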




[GitHub] spark pull request: [SPARK-11578][SQL][follow-up] complete the use...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9599#issuecomment-155456060
  
    Merged build triggered.

