Posted to reviews@spark.apache.org by ooq <gi...@git.apache.org> on 2016/07/19 16:20:03 UTC

[GitHub] spark pull request #14266: [SPARK-16526] [SQL] Benchmarking Performance for ...

GitHub user ooq opened a pull request:

    https://github.com/apache/spark/pull/14266

    [SPARK-16526] [SQL] Benchmarking Performance for Fast HashMap Implementations and Set Knobs

    ## What changes were proposed in this pull request?
    
    This is the third PR in the series to resolve SPARK-16523.
    
    This patch adds benchmark tests comparing the vectorized hashmap with the row-based hashmap (results are included in the comments). The tests are ignored by default because they take a long time to run.
    We would also like to use the results to set the knob that switches between the vectorized and row-based hashmaps.
    
    ## How was this patch tested?
    
    This patch consists mostly of tests itself.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ooq/spark rowbasedfastaggmap-pr3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14266.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14266
    
----
commit c87f26b318b5d673ac95454df5c1cb9a56c677eb
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-13T07:35:06Z

    add RowBatch and RowBasedHashMapGenerator

commit a3360e0ab1223dd43f891e755e648680a402b7df
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-13T08:08:35Z

    enable row based hashmap

commit 45641e5a7df341522518b19bf4a4662d14d64b48
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-13T08:52:31Z

    fix scale codestyle

commit b94fc6383f0727ce4249653550833fd3f0019a65
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-13T08:53:11Z

    merge fix

commit 9b0b294013239f4db744d7f5f5c1bdf838dd0559
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-13T08:55:53Z

    fix indent

commit 24248b190745bef13c567bd2681164d990d31cf3
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-14T18:18:33Z

    add SimpleRowBatch for performance

commit 9008725af8159ac186e0c7f81b08b85ddd7a0ec7
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-14T18:19:36Z

    a number of minor fixs

commit 4bdaeada70a20f89f6c593a4fc0298597e9a43cd
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-14T18:58:08Z

    Merge branch 'master' of github.com:apache/spark into rowbasedfastaggmap-pr2

commit 225b6619cd070ac9da3846a3bd02fa730e4ec835
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-14T20:53:28Z

    fix bug

commit bb4678856ebc1d729e530b9a1949ca9211c6a92e
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-15T17:43:40Z

    return row

commit a158125956627e502a8045fb077760063a3ca397
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-15T17:45:16Z

    simply fash hash map condition check

commit 22d8afd7dbd187b85e6f0c0d51544f0234d4beac
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-15T17:52:36Z

    update data structures to be consistent with what is used

commit ecff4ff3f30aefbaea89a12d2d5b3fda062b0f38
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-17T23:39:19Z

    Update simple row batch to improve performance & use SimpleRowBatch by default

commit 33b2910fa412669b2460b99ba0b6232f462e7879
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-17T23:57:41Z

    add simplerowbatch

commit 2c1973a872e5b8d99a55234724ec24acbc5f70ff
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T07:58:14Z

    Add tests for SimpleRowBatch

commit ce72d900004bfa720460126a3573642a8a97bc53
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T08:00:11Z

    keep in sync with pr1

commit 43cf549c27451209fc3fe4c8bb726fcfb2d7501c
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T09:59:03Z

    Add benchmarks for comparing hashmaps

commit 6515c3dc8b6f4084f66259f18af362fccb436157
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T18:21:01Z

    simply free page in iterator

commit 8f538b177e36ccc5fb690a3b29eb03ca72d1a4b2
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T19:00:00Z

    Clean logic in SimpleRowBatch that was supposedly to deal with multiple pages

commit 461028e62c9d9821cf11abdb9d85e9a8edb58ba4
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T19:01:36Z

    update with pr1

commit 774e088dc719cbd4d4ef97995656ec912b11878a
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T19:02:39Z

    update with pr1

commit 251d3919ed1b7dccacccc9bee6e121954a698cdd
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T22:09:47Z

    shrink findOrInsert() code size

commit 708f7bb3790556a596f6de51f127e99cd6f11662
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T22:12:29Z

    update some benchmarking results

commit d9394888977c97fe95f1642ad9f613dcbee1e4fa
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T22:25:56Z

    remove Rowbatch; renaming SimpleRowBatch to RowBasedKeyValueBatch

commit 02e4ab1c76cc777ef84cacf894f063505a19fffa
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T22:26:28Z

    Merge branch 'rowbasedfastaggmap-pr1' of github.com:ooq/spark into rowbasedfastaggmap-pr3

commit 60e78bd477a90892b8568c1da08d7b0e5fe3672a
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T22:42:10Z

    update benchmark

commit 20baf3e24699589342e14b1e8f2c90fec85d183b
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-18T23:13:29Z

    update benchmark

commit 3b3c9ea6dfc17ba4ebd562c70608be02f22693f9
Author: Qifan Pu <qi...@gmail.com>
Date:   2016-07-19T16:15:02Z

    Add benchmark results for vectorized vs. rowbased hashmap

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #62945 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62945/consoleFull)** for PR 14266 at commit [`c2b276f`](https://github.com/apache/spark/commit/c2b276f015746f069b55410bc28a47537aceeb3c).




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62541/
    Test PASSed.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by ooq <gi...@git.apache.org>.
Github user ooq commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    @davies Added some test results with a larger number of distinct keys.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    (gentle ping @ooq)




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #14266: [SPARK-16526][SQL] Benchmarking Performance for F...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/14266




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #62541 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62541/consoleFull)** for PR 14266 at commit [`c984914`](https://github.com/apache/spark/commit/c984914224072ea207645d2ce94a229fe5e5b73e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #63299 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63299/consoleFull)** for PR 14266 at commit [`e990794`](https://github.com/apache/spark/commit/e990794139a3d3d4c66689d4b979553dd04f449f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #14266: [SPARK-16526][SQL] Benchmarking Performance for F...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14266#discussion_r73040155
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/AggregateBenchmark.scala ---
    @@ -576,4 +576,605 @@ class AggregateBenchmark extends BenchmarkBase {
         benchmark.run()
       }
     
    -}
    +  // This test does not do any benchmark, instead it produces generated code for vectorized
    +  // and row-based hashmaps.
    +  ignore("generated code comparison for vectorized vs. rowbased") {
    +    val N = 20 << 23
    +
    +    sparkSession.conf.set("spark.sql.codegen.wholeStage", "true")
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.columns.max", "30")
    +
    +    sparkSession.range(N)
    +      .selectExpr(
    +        "id & 1023 as k1",
    +        "cast (id & 1023 as string) as k2")
    +      .createOrReplaceTempView("test")
    +
    +    // dataframe/query
    +    val query = sparkSession.sql("select count(k1), sum(k1) from test group by k1, k2")
    +
    +    // vectorized
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.enforce.impl", "vectorized")
    +    query.queryExecution.debug.codegen()
    +
    +    // row based
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.enforce.impl", "rowbased")
    +    query.queryExecution.debug.codegen()
    +  }
    +
    +  ignore("1 key field, 1 value field, distinct linear keys") {
    +    val N = 20 << 22;
    +
    +    var timeStart: Long = 0L
    +    var timeEnd: Long = 0L
    +    var nsPerRow: Long = 0L
    +    var i = 0
    +    sparkSession.conf.set("spark.sql.codegen.wholeStage", "true")
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.columns.max", "30")
    +
    +    // scalastyle:off
    +    println(Benchmark.getJVMOSInfo())
    --- End diff --
    
    nit: to minimize duplication, maybe create a small utility function that can then be reused in all test cases.
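
One possible shape for such a utility, as a minimal sketch rather than the code that was eventually merged: the names `BenchmarkUtil`, `minNsPerRow`, and `compareModes` are hypothetical and do not appear in the PR. The sketch keeps the pattern visible in the quoted diff (five timed iterations, the first two treated as warm-up, minimum ns/row reported), but it starts the minimum at `Long.MaxValue` instead of 1000 and takes the workload as a plain function so that it does not depend on Spark.

```scala
// Hypothetical helper, not part of the PR: factors out the timing loop
// that each benchmark test case repeats.
object BenchmarkUtil {

  /** Runs `body` `iters` times and returns the smallest per-row time in
    * nanoseconds, ignoring the first `warmup` iterations. */
  def minNsPerRow(numRows: Long, iters: Int = 5, warmup: Int = 2)(body: => Unit): Long = {
    require(numRows > 0 && warmup >= 0 && iters > warmup, "invalid benchmark parameters")
    var best = Long.MaxValue
    var j = 0
    while (j < iters) {
      System.gc() // same best-effort GC hint the quoted tests issue before each run
      val start = System.nanoTime
      body
      val nsPerRow = (System.nanoTime - start) / numRows
      if (j >= warmup && nsPerRow < best) best = nsPerRow
      j += 1
    }
    best
  }

  /** Times one workload under each mode and returns the results in mode order.
    * `run` receives the mode name; a real test would set the conf and run the query there. */
  def compareModes(modes: Seq[String], numRows: Long)(run: String => Unit): Seq[Long] =
    modes.map(mode => minNsPerRow(numRows)(run(mode)))
}
```

With a helper along these lines, each test case would only need to supply its query, for example `BenchmarkUtil.compareModes(List("skip", "vectorized", "rowbased"), N) { mode => ... }`, where the body sets `spark.sql.codegen.aggregate.map.enforce.impl` to `mode` and runs the aggregation.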




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #62947 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62947/consoleFull)** for PR 14266 at commit [`4944b29`](https://github.com/apache/spark/commit/4944b29558ba8f3563f1c0c3d1a485b29dcdc39b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62947/
    Test PASSed.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Just a minor nit. LGTM.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #63299 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63299/consoleFull)** for PR 14266 at commit [`e990794`](https://github.com/apache/spark/commit/e990794139a3d3d4c66689d4b979553dd04f449f).




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63299/
    Test PASSed.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #62945 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62945/consoleFull)** for PR 14266 at commit [`c2b276f`](https://github.com/apache/spark/commit/c2b276f015746f069b55410bc28a47537aceeb3c).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63296/
    Test FAILed.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #63296 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63296/consoleFull)** for PR 14266 at commit [`0875fbc`](https://github.com/apache/spark/commit/0875fbc6013060e67e2b531e2c3c5cccd4cc0e62).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62539/
    Test FAILed.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62945/
    Test FAILed.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #62539 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62539/consoleFull)** for PR 14266 at commit [`3b3c9ea`](https://github.com/apache/spark/commit/3b3c9ea6dfc17ba4ebd562c70608be02f22693f9).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #62947 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62947/consoleFull)** for PR 14266 at commit [`4944b29`](https://github.com/apache/spark/commit/4944b29558ba8f3563f1c0c3d1a485b29dcdc39b).




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request #14266: [SPARK-16526][SQL] Benchmarking Performance for F...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14266#discussion_r73928683
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/AggregateBenchmark.scala ---
    @@ -1078,6 +1078,146 @@ class AggregateBenchmark extends BenchmarkBase {
         */
       }
     
    +  ignore("4 key fields, 4 value field, varying linear distinct keys") {
    +    val N = 20 << 22;
    +
    +    var timeStart: Long = 0L
    +    var timeEnd: Long = 0L
    +    var nsPerRow: Long = 0L
    +    var i = 0
    +    sparkSession.conf.set("spark.sql.codegen.wholeStage", "true")
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.columns.max", "30")
    +
    +    // scalastyle:off
    +    println(Benchmark.getJVMOSInfo())
    +    println(Benchmark.getProcessorName())
    +    printf("%20s %20s %20s %20s\n", "Num. Distinct Keys", "No Fast Hashmap",
    +      "Vectorized", "Row-based")
    +    // scalastyle:on
    +
    +    val modes = List("skip", "vectorized", "rowbased")
    +
    +    while (i < 17) {
    +      val results = modes.map(mode => {
    +        sparkSession.conf.set("spark.sql.codegen.aggregate.map.enforce.impl", mode)
    +        var j = 0
    +        var minTime: Long = 1000
    +        while (j < 5) {
    +          System.gc()
    +          val s = "id & " + ((1<<i)-1) + " as k"
    +          sparkSession.range(N)
    +            .selectExpr(List.range(0, 4).map(x => s + x): _*)
    +            .createOrReplaceTempView("test")
    +          timeStart = System.nanoTime
    +          sparkSession.sql("select " + List.range(0, 4).map(x => "sum(k" + x + ")").mkString(",") +
    +            " from test group by " + List.range(0, 4).map(x => "k" + x).mkString(",")).collect()
    +          timeEnd = System.nanoTime
    +          nsPerRow = (timeEnd - timeStart) / N
    +          // printf("nsPerRow i=%d j=%d mode=%10s %20s\n", i, j, mode, nsPerRow)
    +          if (j > 1 && minTime > nsPerRow) minTime = nsPerRow
    +          j += 1
    +        }
    +        minTime
    +      })
    +      printf("%20s %20s %20s %20s\n", (1<<i), results(0), results(1), results(2))
    +      i += 1
    +    }
    +    printf("Unit: ns/row\n")
    +
    +    /*
    +    Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14 on Mac OS X 10.11.5
    +    Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
    +
    +      Num. Distinct Keys      No Fast Hashmap           Vectorized            Row-based
    +                       1                   33                   38                   24
    +                       2                   58                   43                   30
    +                       4                   58                   42                   28
    +                       8                   57                   46                   28
    +                      16                   56                   41                   28
    +                      32                   55                   44                   27
    +                      64                   56                   48                   27
    +                     128                   58                   43                   27
    +                     256                   60                   43                   30
    +                     512                   61                   45                   31
    +                    1024                   62                   44                   31
    +                    2048                   64                   42                   38
    +                    4096                   66                   47                   38
    +                    8192                   70                   48                   38
    +                   16384                   72                   48                   42
    +                   32768                   77                   54                   47
    +                   65536                   96                   75                   61
    +                  131072                  115                  119                  130
    +                  262144                  137                  162                  185
    +    Unit: ns/row
    +    */
    +  }
    +
    +  ignore("single key field, single value field, varying linear distinct keys") {
    --- End diff --
    
    Should we access them in a random order?
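    
    A minimal sketch of what switching the key pattern could look like, assuming the same `selectExpr` strings the benchmarks in this diff already use. `KeyExprs`, `linear`, and `random` are hypothetical names for illustration, not part of the PR:
    
    ```scala
    // Hypothetical helpers (not part of the PR): build selectExpr strings for
    // `numKeys` key columns named k0, k1, ..., each drawn from 2^bits values.
    object KeyExprs {
      // Linear pattern used by the "linear distinct keys" tests:
      // keys cycle through 0, 1, ..., 2^bits - 1 in order.
      def linear(bits: Int, numKeys: Int): List[String] =
        List.range(0, numKeys).map(x => s"id & ${(1 << bits) - 1} as k$x")
    
      // Random pattern, mirroring the "distinct random keys" test:
      // each row picks a key uniformly from [0, 2^bits).
      def random(bits: Int, numKeys: Int): List[String] =
        List.range(0, numKeys).map(x => s"cast(floor(rand() * ${1 << bits}) as long) as k$x")
    }
    ```
    
    Usage would be along the lines of `sparkSession.range(N).selectExpr(KeyExprs.random(i, 4): _*)`. One caveat under this sketch: with the linear pattern all key columns are identical, so there are 2^bits distinct key tuples, whereas independent `rand()` calls per column would likely produce many more distinct tuples for the same `bits`.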




[GitHub] spark pull request #14266: [SPARK-16526][SQL] Benchmarking Performance for F...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14266#discussion_r73616369
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/AggregateBenchmark.scala ---
    @@ -576,4 +576,605 @@ class AggregateBenchmark extends BenchmarkBase {
         benchmark.run()
       }
     
    -}
    +  // This test does not do any benchmark, instead it produces generated code for vectorized
    +  // and row-based hashmaps.
    +  ignore("generated code comparison for vectorized vs. rowbased") {
    +    val N = 20 << 23
    +
    +    sparkSession.conf.set("spark.sql.codegen.wholeStage", "true")
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.columns.max", "30")
    +
    +    sparkSession.range(N)
    +      .selectExpr(
    +        "id & 1023 as k1",
    +        "cast (id & 1023 as string) as k2")
    +      .createOrReplaceTempView("test")
    +
    +    // dataframe/query
    +    val query = sparkSession.sql("select count(k1), sum(k1) from test group by k1, k2")
    +
    +    // vectorized
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.enforce.impl", "vectorized")
    +    query.queryExecution.debug.codegen()
    +
    +    // row based
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.enforce.impl", "rowbased")
    +    query.queryExecution.debug.codegen()
    +  }
    +
    +  ignore("1 key field, 1 value field, distinct linear keys") {
    +    val N = 20 << 22;
    +
    +    var timeStart: Long = 0L
    +    var timeEnd: Long = 0L
    +    var nsPerRow: Long = 0L
    +    var i = 0
    +    sparkSession.conf.set("spark.sql.codegen.wholeStage", "true")
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.columns.max", "30")
    +
    +    // scalastyle:off
    +    println(Benchmark.getJVMOSInfo())
    +    println(Benchmark.getProcessorName())
    +    printf("%20s %20s %20s %20s\n", "Num. Distinct Keys", "No Fast Hashmap",
    +      "Vectorized", "Row-based")
    +    // scalastyle:on
    +
    +    val modes = List("skip", "vectorized", "rowbased")
    +
    +    while (i < 15) {
    +      val results = modes.map(mode => {
    +        sparkSession.conf.set("spark.sql.codegen.aggregate.map.enforce.impl", mode)
    +        var j = 0
    +        var minTime: Long = 1000
    +        while (j < 5) {
    +          System.gc()
    +          sparkSession.range(N)
    +            .selectExpr(
    +              "id & " + ((1 << i) - 1) + " as k0")
    +            .createOrReplaceTempView("test")
    +          timeStart = System.nanoTime
    +          sparkSession.sql("select sum(k0)" +
    +            " from test group by k0").collect()
    +          timeEnd = System.nanoTime
    +          nsPerRow = (timeEnd - timeStart) / N
    +          if (j > 1 && minTime > nsPerRow) minTime = nsPerRow
    +          j += 1
    +        }
    +        minTime
    +      })
    +      printf("%20s %20s %20s %20s\n", 1 << i, results(0), results(1), results(2))
    +      i += 1
    +    }
    +    printf("Unit: ns/row\n")
    +
    +    /*
    +    Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14 on Mac OS X 10.11.5
    +    Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
    +
    +      Num. Distinct Keys      No Fast Hashmap           Vectorized            Row-based
    +                       1                   21                   13                   11
    +                       2                   23                   14                   13
    +                       4                   23                   14                   14
    +                       8                   23                   14                   14
    +                      16                   23                   12                   13
    +                      32                   24                   12                   13
    +                      64                   24                   14                   16
    +                     128                   24                   14                   13
    +                     256                   25                   14                   14
    +                     512                   25                   16                   14
    +                    1024                   25                   16                   15
    +                    2048                   26                   12                   15
    +                    4096                   27                   15                   15
    +                    8192                   33                   16                   15
    +                   16384                   34                   15                   15
    +    Unit: ns/row
    +    */
    +  }
    +
    +  ignore("1 key field, 1 value field, distinct random keys") {
    +    val N = 20 << 22;
    +
    +    var timeStart: Long = 0L
    +    var timeEnd: Long = 0L
    +    var nsPerRow: Long = 0L
    +    var i = 0
    +    sparkSession.conf.set("spark.sql.codegen.wholeStage", "true")
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.columns.max", "30")
    +
    +    // scalastyle:off
    +    println(Benchmark.getJVMOSInfo())
    +    println(Benchmark.getProcessorName())
    +    printf("%20s %20s %20s %20s\n", "Num. Distinct Keys", "No Fast Hashmap",
    +      "Vectorized", "Row-based")
    +    // scalastyle:on
    +
    +    val modes = List("skip", "vectorized", "rowbased")
    +
    +    while (i < 15) {
    +      val results = modes.map(mode => {
    +        sparkSession.conf.set("spark.sql.codegen.aggregate.map.enforce.impl", mode)
    +        var j = 0
    +        var minTime: Long = 1000
    +        while (j < 5) {
    +          System.gc()
    +          sparkSession.range(N)
    +            .selectExpr(
    +              "cast(floor(rand() * " + (1 << i) + ") as long) as k0")
    +            .createOrReplaceTempView("test")
    +          timeStart = System.nanoTime
    +          sparkSession.sql("select sum(k0)" +
    +            " from test group by k0").collect()
    +          timeEnd = System.nanoTime
    +          nsPerRow = (timeEnd - timeStart) / N
    +          if (j > 1 && minTime > nsPerRow) minTime = nsPerRow
    +          j += 1
    +        }
    +        minTime
    +      })
    +      printf("%20s %20s %20s %20s\n", 1 << i, results(0), results(1), results(2))
    +      i += 1
    +    }
    +    printf("Unit: ns/row\n")
    +
    +    /*
    +    Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14 on Mac OS X 10.11.5
    +    Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
    +
    +      Num. Distinct Keys      No Fast Hashmap           Vectorized            Row-based
    +                       1                   32                    9                   13
    +                       2                   39                   16                   22
    +                       4                   39                   14                   23
    +                       8                   39                   13                   22
    +                      16                   38                   13                   20
    +                      32                   38                   13                   20
    +                      64                   38                   13                   20
    +                     128                   37                   16                   21
    +                     256                   36                   17                   22
    +                     512                   38                   17                   21
    +                    1024                   39                   18                   21
    +                    2048                   41                   18                   21
    +                    4096                   44                   18                   22
    +                    8192                   49                   20                   23
    +                   16384                   52                   23                   25
    +    Unit: ns/row
    +    */
    +  }
    +
    +  ignore("1 key field, varying value fields, 16 linear distinct keys") {
    +    val N = 20 << 22;
    +
    +    var timeStart: Long = 0L
    +    var timeEnd: Long = 0L
    +    var nsPerRow: Long = 0L
    +    var i = 1
    +    sparkSession.conf.set("spark.sql.codegen.wholeStage", "true")
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.columns.max", "30")
    +
    +    // scalastyle:off
    +    println(Benchmark.getJVMOSInfo())
    +    println(Benchmark.getProcessorName())
    +    printf("%20s %20s %20s %20s\n", "Num. Value Fields", "No Fast Hashmap",
    +      "Vectorized", "Row-based")
    +    // scalastyle:on
    +
    +    val modes = List("skip", "vectorized", "rowbased")
    +
    +    while (i < 11) {
    +      val results = modes.map(mode => {
    +        sparkSession.conf.set("spark.sql.codegen.aggregate.map.enforce.impl", mode)
    +        var j = 0
    +        var minTime: Long = 1000
    +        while (j < 5) {
    +          System.gc()
    +          sparkSession.range(N)
    +            .selectExpr("id & " + 15  + " as k0")
    +            .createOrReplaceTempView("test")
    +          timeStart = System.nanoTime
    +          sparkSession.sql("select " + List.range(0, i).map(x => "sum(k" + 0 + ")").mkString(",") +
    +            " from test group by k0").collect()
    +          timeEnd = System.nanoTime
    +          nsPerRow = (timeEnd - timeStart) / N
    +          if (j > 1 && minTime > nsPerRow) minTime = nsPerRow
    +          j += 1
    +        }
    +        minTime
    +      })
    +      printf("%20s %20s %20s %20s\n", i, results(0), results(1), results(2))
    +      i += 1
    +    }
    +    printf("Unit: ns/row\n")
    +
    +    /*
    +    Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14 on Mac OS X 10.11.5
    +    Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
    +
    +       Num. Value Fields      No Fast Hashmap           Vectorized            Row-based
    +                       1                   24                   15                   12
    +                       2                   25                   24                   14
    +                       3                   29                   25                   17
    +                       4                   31                   32                   22
    +                       5                   33                   40                   24
    +                       6                   36                   36                   27
    +                       7                   38                   44                   28
    +                       8                   47                   50                   32
    +                       9                   52                   55                   37
    +                      10                   59                   59                   45
    +    Unit: ns/row
    +    */
    +  }
    +
    +  ignore("varying key fields, 1 value field, 16 linear distinct keys") {
    +    val N = 20 << 22;
    +
    +    var timeStart: Long = 0L
    +    var timeEnd: Long = 0L
    +    var nsPerRow: Long = 0L
    +    var i = 1
    +    sparkSession.conf.set("spark.sql.codegen.wholeStage", "true")
    +    sparkSession.conf.set("spark.sql.codegen.aggregate.map.columns.max", "30")
    +
    +    // scalastyle:off
    +    println(Benchmark.getJVMOSInfo())
    +    println(Benchmark.getProcessorName())
    +    printf("%20s %20s %20s %20s\n", "Num. Key Fields", "No Fast Hashmap",
    +      "Vectorized", "Row-based")
    +    // scalastyle:on
    +
    +    val modes = List("skip", "vectorized", "rowbased")
    +
    +    while (i < 11) {
    +      val results = modes.map(mode => {
    +        sparkSession.conf.set("spark.sql.codegen.aggregate.map.enforce.impl", mode)
    +        var j = 0
    +        var minTime: Long = 1000
    +        while (j < 5) {
    +          System.gc()
    +          val s = "id & " + 15 + " as k"
    +          sparkSession.range(N)
    +            .selectExpr(List.range(0, i).map(x => s + x): _*)
    +            .createOrReplaceTempView("test")
    +          timeStart = System.nanoTime
    +          sparkSession.sql("select sum(k0)" +
    +            " from test group by " + List.range(0, i).map(x => "k" + x).mkString(",")).collect()
    +          timeEnd = System.nanoTime
    +          nsPerRow = (timeEnd - timeStart) / N
    +          if (j > 1 && minTime > nsPerRow) minTime = nsPerRow
    +          j += 1
    +        }
    +        minTime
    +      })
    +      printf("%20s %20s %20s %20s\n", i, results(0), results(1), results(2))
    +      i += 1
    +    }
    +    printf("Unit: ns/row\n")
    +
    +    /*
    +    Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14 on Mac OS X 10.11.5
    +    Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
    +
    +         Num. Key Fields      No Fast Hashmap           Vectorized            Row-based
    +                       1                   24                   15                   13
    +                       2                   31                   20                   14
    +                       3                   37                   22                   17
    +                       4                   46                   26                   18
    +                       5                   53                   27                   20
    +                       6                   61                   29                   23
    +                       7                   69                   36                   25
    +                       8                   78                   37                   27
    +                       9                   88                   43                   30
    +                      10                   92                   45                   33
    +    Unit: ns/row
    +    */
    +  }
    +
    +  ignore("varying key fields, varying value field, 16 linear distinct keys") {
    --- End diff --
    
    The performance difference between column-based and row-based comes down to cache locality. Could you increase the number of distinct keys so that not all the keys/values fit in the L1 cache? For example, 4k. We could also increase that to 64k in the first two cases (single key, single value).
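    
    A rough back-of-the-envelope sketch of why 16 distinct keys is too few to exercise the cache. It assumes every key and value field is an 8-byte long and ignores all hash map overhead (buckets, row headers, padding); the 32 KiB L1 data cache size is also an assumption about the test machine, and `WorkingSet` is a hypothetical helper, not part of the PR:
    
    ```scala
    // Hypothetical estimate (not part of the PR) of the aggregation map's
    // payload size, to compare against an assumed L1 data cache size.
    object WorkingSet {
      // Assumption: 32 KiB L1 data cache per core.
      val AssumedL1Bytes: Long = 32L * 1024
    
      // Payload only: 8 bytes per long field, no hash map overhead.
      def approxBytes(distinctKeys: Long, keyFields: Int, valueFields: Int): Long =
        distinctKeys * 8L * (keyFields + valueFields)
    
      def fitsInL1(distinctKeys: Long, keyFields: Int, valueFields: Int): Boolean =
        approxBytes(distinctKeys, keyFields, valueFields) <= AssumedL1Bytes
    }
    ```
    
    Under these assumptions, 16 distinct keys with one key and one value field come to 256 bytes, 4k keys come to 64 KiB, and 64k keys come to 1 MiB, so only the latter two would spill out of L1.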




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #62539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62539/consoleFull)** for PR 14266 at commit [`3b3c9ea`](https://github.com/apache/spark/commit/3b3c9ea6dfc17ba4ebd562c70608be02f22693f9).




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #63296 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63296/consoleFull)** for PR 14266 at commit [`0875fbc`](https://github.com/apache/spark/commit/0875fbc6013060e67e2b531e2c3c5cccd4cc0e62).




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    **[Test build #62541 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62541/consoleFull)** for PR 14266 at commit [`c984914`](https://github.com/apache/spark/commit/c984914224072ea207645d2ce94a229fe5e5b73e).




[GitHub] spark issue #14266: [SPARK-16526][SQL] Benchmarking Performance for Fast Has...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14266
  
    Merged build finished. Test FAILed.

