You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@spark.apache.org by viirya <gi...@git.apache.org> on 2016/02/26 11:29:25 UTC

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/11391

    [SPARK-13511][SQL] Add wholestage codegen for limit

    JIRA: https://issues.apache.org/jira/browse/SPARK-13511
    
    ## What changes were proposed in this pull request?
    
    Current limit operator doesn't support wholestage codegen. This is open to add support for it.
    
    ## How was this patch tested?
    
    A test is added into BenchmarkWholeStageCodegen.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 wholestage-limit

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11391.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11391
    
----
commit a213ed141fe313ef43de0722d52966b5eda1a8ce
Author: Liang-Chi Hsieh <vi...@gmail.com>
Date:   2016-02-26T10:27:01Z

    Add wholestage codegen for limit.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189223865
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190613374
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189630683
  
    **[Test build #52118 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52118/consoleFull)** for PR 11391 at commit [`8e69d8d`](https://github.com/apache/spark/commit/8e69d8d3d8e29cddb56b05efab6c4b90f8c8ffd6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190803769
  
    LGTM, merging this into master, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54526775
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/BufferedRowIterator.java ---
    @@ -35,6 +35,8 @@
       // used when there is no column in output
       protected UnsafeRow unsafeRow = new UnsafeRow(0);
     
    +  protected boolean stopEarly = false;
    --- End diff --
    
    yeah. I am updating it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190608007
  
    **[Test build #52219 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52219/consoleFull)** for PR 11391 at commit [`8d254d2`](https://github.com/apache/spark/commit/8d254d206686dd9c6edd053d4abcd184799fcc2a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189223869
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52046/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54529927
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala ---
    @@ -78,4 +78,21 @@ class WholeStageCodegenSuite extends SparkPlanTest with SharedSQLContext {
             p.asInstanceOf[WholeStageCodegen].plan.isInstanceOf[Sort]).isDefined)
         assert(df.collect() === Array(Row(1), Row(2), Row(3)))
       }
    +
    +  test("Limit should be included in WholeStageCodegen") {
    +    val df = sqlContext.range(10000).limit(100).sort(col("id"))
    +    val plan = df.queryExecution.executedPlan
    +
    +    assert(plan.find(p =>
    +      p.isInstanceOf[WholeStageCodegen] &&
    +        p.asInstanceOf[WholeStageCodegen].plan.isInstanceOf[Sort] &&
    --- End diff --
    
    Agreed. Let me remove this later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54450174
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala ---
    @@ -56,6 +57,28 @@ trait BaseLimit extends UnaryNode {
       protected override def doExecute(): RDD[InternalRow] = child.execute().mapPartitions { iter =>
         iter.take(limit)
       }
    +
    +  override def upstreams(): Seq[RDD[InternalRow]] = {
    +    child.asInstanceOf[CodegenSupport].upstreams()
    +  }
    +
    +  protected override def doProduce(ctx: CodegenContext): String = {
    +    child.asInstanceOf[CodegenSupport].produce(ctx, this)
    +  }
    +
    +  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): String = {
    +    val countTerm = ctx.freshName("count")
    +    ctx.addMutableState("int", countTerm, s"$countTerm = 0;")
    +    ctx.currentVars = input
    --- End diff --
    
    We don't need to update `currentVars` here


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189231072
  
    Just realized this implementation is wrong. Will fix it later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54235113
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala ---
    @@ -43,12 +44,33 @@ case class CollectLimit(limit: Int, child: SparkPlan) extends UnaryNode {
             child.execute(), child.output, SinglePartition, serializer))
         shuffled.mapPartitionsInternal(_.take(limit))
       }
    +
    +  override def upstreams(): Seq[RDD[InternalRow]] = {
    +    child.asInstanceOf[CodegenSupport].upstreams()
    +  }
    +
    +  protected override def doProduce(ctx: CodegenContext): String = {
    +    child.asInstanceOf[CodegenSupport].produce(ctx, this)
    +  }
    +
    +  override def doConsume(ctx: CodegenContext, input: Seq[ExprCode]): String = {
    +    val countTerm = ctx.freshName("count")
    +    ctx.addMutableState("int", countTerm, s"$countTerm = 0;")
    +    ctx.currentVars = input
    +    s"""
    +       | while ($countTerm < $limit) {
    +       |   $countTerm += 1;
    +       |   ${consume(ctx, ctx.currentVars)}
    +       | }
    +       | if (true) return;
    --- End diff --
    
    This is special?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189631136
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189492152
  
    Can you update the pr description to describe how you are supporting it?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190681436
  
    **[Test build #52237 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52237/consoleFull)** for PR 11391 at commit [`3d1e397`](https://github.com/apache/spark/commit/3d1e3972d65ef17e02c21affab97346ebafc156b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190721018
  
    **[Test build #52237 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52237/consoleFull)** for PR 11391 at commit [`3d1e397`](https://github.com/apache/spark/commit/3d1e3972d65ef17e02c21affab97346ebafc156b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54525659
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/BufferedRowIterator.java ---
    @@ -35,6 +35,8 @@
       // used when there is no column in output
       protected UnsafeRow unsafeRow = new UnsafeRow(0);
     
    +  protected boolean stopEarly = false;
    --- End diff --
    
    Since `stopEarly` is only accessed generated functions, we don't need this anymore.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54525690
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/BufferedRowIterator.java ---
    @@ -35,6 +35,8 @@
       // used when there is no column in output
       protected UnsafeRow unsafeRow = new UnsafeRow(0);
     
    +  protected boolean stopEarly = false;
    --- End diff --
    
    We could use `addMutableState`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189608530
  
    **[Test build #52110 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52110/consoleFull)** for PR 11391 at commit [`9c072aa`](https://github.com/apache/spark/commit/9c072aa3d02460ffb55691a6b0aa82fff8dc0e4c).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class CollectLimit(limit: Int, child: SparkPlan) extends UnaryNode `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190575061
  
    **[Test build #52222 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52222/consoleFull)** for PR 11391 at commit [`b64e52d`](https://github.com/apache/spark/commit/b64e52d189e5041cc1af2ebb0d656c5f5c12c82d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189608563
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/11391


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190526392
  
    **[Test build #52213 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52213/consoleFull)** for PR 11391 at commit [`c887cf4`](https://github.com/apache/spark/commit/c887cf47a36da8d34e33afeca273f415df629fbb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189618165
  
    The failed test `ParquetHadoopFsRelationSuite` is due to the lack of short type support in `UnsafeRowParquetRecordReader`. I submitted another PR #11412 to fix it. The change is also included here to prove the tests can be passed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189209869
  
    **[Test build #52046 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52046/consoleFull)** for PR 11391 at commit [`a213ed1`](https://github.com/apache/spark/commit/a213ed141fe313ef43de0722d52966b5eda1a8ce).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54529803
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala ---
    @@ -78,4 +78,21 @@ class WholeStageCodegenSuite extends SparkPlanTest with SharedSQLContext {
             p.asInstanceOf[WholeStageCodegen].plan.isInstanceOf[Sort]).isDefined)
         assert(df.collect() === Array(Row(1), Row(2), Row(3)))
       }
    +
    +  test("Limit should be included in WholeStageCodegen") {
    +    val df = sqlContext.range(10000).limit(100).sort(col("id"))
    +    val plan = df.queryExecution.executedPlan
    +
    +    assert(plan.find(p =>
    +      p.isInstanceOf[WholeStageCodegen] &&
    +        p.asInstanceOf[WholeStageCodegen].plan.isInstanceOf[Sort] &&
    --- End diff --
    
    These kind of tests are easy to break, we may don't need this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189323056
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189604296
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189593576
  
    **[Test build #52107 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52107/consoleFull)** for PR 11391 at commit [`8bac699`](https://github.com/apache/spark/commit/8bac699cd526345df8bab3bb89ed9d6d40e5cdf2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189618287
  
    **[Test build #52118 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52118/consoleFull)** for PR 11391 at commit [`8e69d8d`](https://github.com/apache/spark/commit/8e69d8d3d8e29cddb56b05efab6c4b90f8c8ffd6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54451555
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/BufferedRowIterator.java ---
    @@ -64,7 +66,16 @@ protected void append(InternalRow row) {
        * If it returns true, the caller should exit the loop (return from processNext()).
        */
       protected boolean shouldStop() {
    -    return !currentRows.isEmpty();
    +    return !currentRows.isEmpty() || stopEarly;
    --- End diff --
    
    I think we could only override `shouldStop()` for Limit, so we won't add a branch when we don't have Limit.
    
    ```
    ctx.addFunction("shouldStop", """
      protected boolean shouldStop() {
        return !currentRows.isEmpty() || stopEarly;
     }""")
    ```
    
    Please be care full that we can't have two `shouldStop()` in generated class.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190612590
  
    **[Test build #52222 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52222/consoleFull)** for PR 11391 at commit [`b64e52d`](https://github.com/apache/spark/commit/b64e52d189e5041cc1af2ebb0d656c5f5c12c82d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189604255
  
    **[Test build #52107 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52107/consoleFull)** for PR 11391 at commit [`8bac699`](https://github.com/apache/spark/commit/8bac699cd526345df8bab3bb89ed9d6d40e5cdf2).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189223746
  
    **[Test build #52046 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52046/consoleFull)** for PR 11391 at commit [`a213ed1`](https://github.com/apache/spark/commit/a213ed141fe313ef43de0722d52966b5eda1a8ce).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class CollectLimit(limit: Int, child: SparkPlan) extends UnaryNode with CodegenSupport `
      * `trait BaseLimit extends UnaryNode with CodegenSupport `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190551827
  
    **[Test build #52213 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52213/consoleFull)** for PR 11391 at commit [`c887cf4`](https://github.com/apache/spark/commit/c887cf47a36da8d34e33afeca273f415df629fbb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189631144
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52118/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190567560
  
    **[Test build #52219 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52219/consoleFull)** for PR 11391 at commit [`8d254d2`](https://github.com/apache/spark/commit/8d254d206686dd9c6edd053d4abcd184799fcc2a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189323060
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52053/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190608354
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52219/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189604297
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52107/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189311425
  
    **[Test build #52053 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52053/consoleFull)** for PR 11391 at commit [`fa38e7c`](https://github.com/apache/spark/commit/fa38e7c5138ad3424fd5b30e3ee6f7ef4d2088e4).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190721673
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190721677
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52237/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54517293
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/UnsafeRowParquetRecordReader.java ---
    @@ -765,6 +765,9 @@ private void readIntBatch(int rowId, int num, ColumnVector column) throws IOExce
           } else if (DecimalType.is64BitDecimalType(column.dataType())) {
             defColumn.readIntsAsLongs(
                 num, column, rowId, maxDefLevel, (VectorizedValuesReader) dataColumn);
    +      } else if (column.dataType() == DataTypes.ShortType) {
    --- End diff --
    
    Ah, it has been merged, let me sync with it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54529510
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala ---
    @@ -78,4 +78,21 @@ class WholeStageCodegenSuite extends SparkPlanTest with SharedSQLContext {
             p.asInstanceOf[WholeStageCodegen].plan.isInstanceOf[Sort]).isDefined)
         assert(df.collect() === Array(Row(1), Row(2), Row(3)))
       }
    +
    +  test("Limit should be included in WholeStageCodegen") {
    +    val df = sqlContext.range(10000).limit(100).sort(col("id"))
    +    val plan = df.queryExecution.executedPlan
    +
    +    assert(plan.find(p =>
    +      p.isInstanceOf[WholeStageCodegen] &&
    +        p.asInstanceOf[WholeStageCodegen].plan.isInstanceOf[Sort] &&
    --- End diff --
    
    Yeah, because we can't leave limit as last operator otherwise it will transform to collect limit, so I add a sort here. I will remove it once I am back to laptop (few hours later).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190608352
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190551990
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189608565
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52110/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190551991
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52213/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54449722
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/UnsafeRowParquetRecordReader.java ---
    @@ -765,6 +765,9 @@ private void readIntBatch(int rowId, int num, ColumnVector column) throws IOExce
           } else if (DecimalType.is64BitDecimalType(column.dataType())) {
             defColumn.readIntsAsLongs(
                 num, column, rowId, maxDefLevel, (VectorizedValuesReader) dataColumn);
    +      } else if (column.dataType() == DataTypes.ShortType) {
    --- End diff --
    
    Could you pull this out as an separate PR? so we can merge it quickly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54528196
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala ---
    @@ -78,4 +78,21 @@ class WholeStageCodegenSuite extends SparkPlanTest with SharedSQLContext {
             p.asInstanceOf[WholeStageCodegen].plan.isInstanceOf[Sort]).isDefined)
         assert(df.collect() === Array(Row(1), Row(2), Row(3)))
       }
    +
    +  test("Limit should be included in WholeStageCodegen") {
    +    val df = sqlContext.range(10000).limit(100).sort(col("id"))
    +    val plan = df.queryExecution.executedPlan
    +
    +    assert(plan.find(p =>
    +      p.isInstanceOf[WholeStageCodegen] &&
    +        p.asInstanceOf[WholeStageCodegen].plan.isInstanceOf[Sort] &&
    --- End diff --
    
    The sort is not related to limit, could you remove it from this PR? (we may revert the commit for sort)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189322966
  
    **[Test build #52053 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52053/consoleFull)** for PR 11391 at commit [`fa38e7c`](https://github.com/apache/spark/commit/fa38e7c5138ad3424fd5b30e3ee6f7ef4d2088e4).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190200340
  
    cc @davies 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-190613392
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52222/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL][WIP] Add wholestage codegen...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189600713
  
    **[Test build #52110 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52110/consoleFull)** for PR 11391 at commit [`9c072aa`](https://github.com/apache/spark/commit/9c072aa3d02460ffb55691a6b0aa82fff8dc0e4c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11391#discussion_r54450500
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/BenchmarkWholeStageCodegen.scala ---
    @@ -70,6 +70,20 @@ class BenchmarkWholeStageCodegen extends SparkFunSuite {
         */
       }
     
    +  ignore("range/limit/sum") {
    +    val N = 500 << 20
    +    runBenchmark("range/limit/sum", N) {
    +      sqlContext.range(N).limit(100).groupBy().sum().collect()
    --- End diff --
    
    For 100 rows, we may can't really see the performance difference, could you increase it to 1 million?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-13511][SQL] Add wholestage codegen for ...

Posted by hvanhovell <gi...@git.apache.org>.

Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/11391#issuecomment-189244923
  
    Mark it as WIP?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org