Posted to reviews@spark.apache.org by davies <gi...@git.apache.org> on 2015/10/08 02:02:15 UTC
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
GitHub user davies opened a pull request:
https://github.com/apache/spark/pull/9016
[SPARK-10990] [SQL] improve unrolling of complex types
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/davies/spark complex2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9016.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9016
----
commit 73eefa2643b70d68a07ce1473d190d9ba996e18a
Author: Davies Liu <da...@databricks.com>
Date: 2015-10-08T00:00:02Z
improve unrolling of complex types
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146371653
Merged build triggered.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146779983
[Test build #43459 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43459/console) for PR 9016 at commit [`6e050a7`](https://github.com/apache/spark/commit/6e050a7a0f9519e014dfd87342306b49a3fcc384).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `public final class UnsafeRow extends MutableRow implements Externalizable, KryoSerializable `
* `sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, Int, Long) JvmType] `
* ` /** Run a function within Hive state (SessionState, HiveConf, Hive client and class loader) */`
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146737690
[Test build #43441 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43441/console) for PR 9016 at commit [`96661a8`](https://github.com/apache/spark/commit/96661a893a01c195c7eb372aae4660a6ef01c637).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, Int, Long) JvmType] `
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41584197
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -34,7 +34,8 @@ import org.apache.spark.unsafe.types.UTF8String
*
* @tparam JvmType Underlying Java type to represent the elements.
*/
-private[sql] sealed abstract class ColumnType[JvmType] {
+private[sql]
+sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, Int, Long) JvmType] {
--- End diff --
Add `Float` and `Double` to the `@specialized` list too?
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146726584
Build started.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41560296
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/rowFormatConverters.scala ---
@@ -23,6 +23,7 @@ import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.catalyst.plans.physical.Partitioning
import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.columnar.{InMemoryRelation, InMemoryColumnarTableScan}
--- End diff --
unnecessary import?
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146713020
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43426/
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146682930
Merged build started.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146716559
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43429/
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146780052
Merged build finished. Test FAILed.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146686989
[Test build #43429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43429/consoleFull) for PR 9016 at commit [`23e127c`](https://github.com/apache/spark/commit/23e127c2a34ab75a3d6c662d907b1e8f4a0fbde8).
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147579796
[Test build #43597 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43597/console) for PR 9016 at commit [`615d9a3`](https://github.com/apache/spark/commit/615d9a320c04d4ece116da8e652bea82c8af65a2).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146683971
Merged build triggered.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147549291
Merged build triggered.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146718024
Merged build finished. Test FAILed.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/9016
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146666160
Merged build triggered.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147561041
LGTM pending test.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146716905
Merged build started.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146376857
[Test build #43362 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43362/consoleFull) for PR 9016 at commit [`73eefa2`](https://github.com/apache/spark/commit/73eefa2643b70d68a07ce1473d190d9ba996e18a).
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146669507
Merged build finished. Test FAILed.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146756468
Merged build triggered.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146716512
[Test build #43429 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43429/console) for PR 9016 at commit [`23e127c`](https://github.com/apache/spark/commit/23e127c2a34ab75a3d6c662d907b1e8f4a0fbde8).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146743462
[Test build #43448 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43448/console) for PR 9016 at commit [`b29314b`](https://github.com/apache/spark/commit/b29314b180d475b53adbc4fcd696bfe061b6ae12).
* This patch **fails Spark unit tests**.
* This patch **does not merge cleanly**.
* This patch adds the following public classes _(experimental)_:
* `sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, Int, Long) JvmType] `
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146683991
Merged build started.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147597344
Thanks, merging to master.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146951360
[Test build #43476 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43476/console) for PR 9016 at commit [`1716bcd`](https://github.com/apache/spark/commit/1716bcd8cddc28b5b2ab34c0dc8bb45bbbc31410).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, Int, Long) JvmType] `
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147545154
Merged build started.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146383989
Merged build finished. Test FAILed.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146718346
[Test build #43441 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43441/consoleFull) for PR 9016 at commit [`96661a8`](https://github.com/apache/spark/commit/96661a893a01c195c7eb372aae4660a6ef01c637).
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147545132
Merged build triggered.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146743576
Build finished. Test FAILed.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41474659
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeMapData.java ---
@@ -65,6 +65,22 @@ public UnsafeArrayData valueArray() {
}
@Override
+ public int hashCode() {
+ int h = numElements;
+ return (h * 31 + keys.hashCode()) * 31 + values.hashCode();
+ }
+
+ @Override
+ public boolean equals(Object obj) {
+ if (obj instanceof UnsafeMapData) {
+ UnsafeMapData map = (UnsafeMapData) obj;
+ return numElements == map.numElements && keys.equals(map.keyArray())
+ && values.equals(map.valueArray());
--- End diff --
What about element order? `Map(1 -> "a", 2 -> "b")` should be equal to `Map(2 -> "b", 1 -> "a")`.
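To illustrate the concern, here is a minimal sketch, not Spark code. `ArrayBackedMap` is a hypothetical stand-in that stores keys and values in two parallel arrays, loosely mirroring the `keys`/`values` layout in the diff. It assumes keys are unique within each map. Comparing the arrays position by position treats the two maps above as different, while comparing them as sets of entries treats them as equal.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class MapEqualitySketch {

    /** Hypothetical map stored as two parallel arrays (not Spark's UnsafeMapData). */
    static final class ArrayBackedMap {
        private final int[] keys;
        private final String[] values;

        ArrayBackedMap(int[] keys, String[] values) {
            if (keys.length != values.length) {
                throw new IllegalArgumentException("keys and values must have the same length");
            }
            this.keys = keys.clone();
            this.values = values.clone();
        }

        /** Positional comparison: entry i must match entry i, so order matters. */
        boolean positionalEquals(ArrayBackedMap other) {
            return Arrays.equals(keys, other.keys) && Arrays.equals(values, other.values);
        }

        /** Entry-set comparison: order is ignored. Assumes unique keys in each map. */
        boolean entryEquals(ArrayBackedMap other) {
            return keys.length == other.keys.length && toMap().equals(other.toMap());
        }

        /** Copies the parallel arrays into a HashMap so entries can be compared as a set. */
        private Map<Integer, String> toMap() {
            Map<Integer, String> m = new HashMap<>();
            for (int i = 0; i < keys.length; i++) {
                m.put(keys[i], values[i]);
            }
            return m;
        }
    }

    public static void main(String[] args) {
        ArrayBackedMap a = new ArrayBackedMap(new int[] {1, 2}, new String[] {"a", "b"});
        ArrayBackedMap b = new ArrayBackedMap(new int[] {2, 1}, new String[] {"b", "a"});
        System.out.println(a.positionalEquals(b)); // false
        System.out.println(a.entryEquals(b));      // true
    }
}
```

If equality ignores order, the hash code has to ignore it as well. `java.util.AbstractMap` does this by summing the hash codes of the entries. Building a `HashMap` per comparison is only for clarity here and would likely be too costly on a hot path.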
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146727257
[Test build #43448 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43448/consoleFull) for PR 9016 at commit [`b29314b`](https://github.com/apache/spark/commit/b29314b180d475b53adbc4fcd696bfe061b6ae12).
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146732511
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43439/
Test FAILed.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146714538
[Test build #43439 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43439/consoleFull) for PR 9016 at commit [`297b06e`](https://github.com/apache/spark/commit/297b06ef1468a658ae68dc4e6884d65c915fb628).
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41788873
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java ---
@@ -145,6 +146,8 @@ public Object get(int ordinal, DataType dataType) {
return getArray(ordinal);
} else if (dataType instanceof MapType) {
return getMap(ordinal);
+ } else if (dataType instanceof UserDefinedType) {
+ return get(ordinal, ((UserDefinedType)dataType).sqlType());
--- End diff --
Nit: no `()` after `sqlType`
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146951525
Merged build finished. Test PASSed.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41802694
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -34,7 +34,8 @@ import org.apache.spark.unsafe.types.UTF8String
*
* @tparam JvmType Underlying Java type to represent the elements.
*/
-private[sql] sealed abstract class ColumnType[JvmType] {
+private[sql]
+sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, Int, Long) JvmType] {
--- End diff --
I haven't checked the JIT-ed code, so I'm not sure what the generated code will look like.
@rxin Will this work as expected (producing a specialized class for each of these primitive types)?
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146713018
Merged build finished. Test FAILed.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147549299
Merged build started.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41806391
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java ---
@@ -145,6 +146,8 @@ public Object get(int ordinal, DataType dataType) {
return getArray(ordinal);
} else if (dataType instanceof MapType) {
return getMap(ordinal);
+ } else if (dataType instanceof UserDefinedType) {
+ return get(ordinal, ((UserDefinedType)dataType).sqlType());
--- End diff --
Oh yeah, sorry for that.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41806519
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -34,7 +34,8 @@ import org.apache.spark.unsafe.types.UTF8String
*
* @tparam JvmType Underlying Java type to represent the elements.
*/
-private[sql] sealed abstract class ColumnType[JvmType] {
+private[sql]
+sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, Int, Long) JvmType] {
--- End diff --
We only need to check the .class files produced by the Scala compiler. There's no need to check the JIT-ed code.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41802164
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java ---
@@ -145,6 +146,8 @@ public Object get(int ordinal, DataType dataType) {
return getArray(ordinal);
} else if (dataType instanceof MapType) {
return getMap(ordinal);
+ } else if (dataType instanceof UserDefinedType) {
+ return get(ordinal, ((UserDefinedType)dataType).sqlType());
--- End diff --
This is Java.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146732468
[Test build #43439 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43439/console) for PR 9016 at commit [`297b06e`](https://github.com/apache/spark/commit/297b06ef1468a658ae68dc4e6884d65c915fb628).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `case class BinaryHashJoinNode(`
* `case class BroadcastHashJoinNode(`
* `trait HashJoinNode `
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147575381
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43596/
Test PASSed.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41802119
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -449,124 +416,126 @@ private[sql] object LARGE_DECIMAL {
}
}
-private[sql] case class STRUCT(dataType: StructType)
- extends ByteArrayColumnType[InternalRow](20) {
+private[sql] case class STRUCT(dataType: StructType) extends ColumnType[UnsafeRow] {
- private val projection: UnsafeProjection =
- UnsafeProjection.create(dataType)
private val numOfFields: Int = dataType.fields.size
- override def setField(row: MutableRow, ordinal: Int, value: InternalRow): Unit = {
+ override def defaultSize: Int = 20
+
+ override def setField(row: MutableRow, ordinal: Int, value: UnsafeRow): Unit = {
row.update(ordinal, value)
}
- override def getField(row: InternalRow, ordinal: Int): InternalRow = {
- row.getStruct(ordinal, numOfFields)
+ override def getField(row: InternalRow, ordinal: Int): UnsafeRow = {
+ row.getStruct(ordinal, numOfFields).asInstanceOf[UnsafeRow]
}
- override def serialize(value: InternalRow): Array[Byte] = {
- val unsafeRow = if (value.isInstanceOf[UnsafeRow]) {
- value.asInstanceOf[UnsafeRow]
- } else {
- projection(value)
- }
- unsafeRow.getBytes
+ override def actualSize(row: InternalRow, ordinal: Int): Int = {
+ 4 + getField(row, ordinal).getSizeInBytes
+ }
+
+ override def append(value: UnsafeRow, buffer: ByteBuffer): Unit = {
+ buffer.putInt(value.getSizeInBytes)
+ value.writeTo(buffer)
}
- override def deserialize(bytes: Array[Byte]): InternalRow = {
+ override def extract(buffer: ByteBuffer): UnsafeRow = {
+ val sizeInBytes = buffer.getInt()
+ assert(buffer.hasArray)
+ val base = buffer.array()
+ val offset = buffer.arrayOffset()
+ val cursor = buffer.position()
+ buffer.position(cursor + sizeInBytes)
val unsafeRow = new UnsafeRow
- unsafeRow.pointTo(bytes, numOfFields, bytes.length)
+ unsafeRow.pointTo(base, Platform.BYTE_ARRAY_OFFSET + offset + cursor, numOfFields, sizeInBytes)
unsafeRow
}
- override def clone(v: InternalRow): InternalRow = v.copy()
+ override def clone(v: UnsafeRow): UnsafeRow = v.copy()
}
-private[sql] case class ARRAY(dataType: ArrayType)
- extends ByteArrayColumnType[ArrayData](16) {
+private[sql] case class ARRAY(dataType: ArrayType) extends ColumnType[UnsafeArrayData] {
- private lazy val projection = UnsafeProjection.create(Array[DataType](dataType))
- private val mutableRow = new GenericMutableRow(new Array[Any](1))
+ override def defaultSize: Int = 16
- override def setField(row: MutableRow, ordinal: Int, value: ArrayData): Unit = {
+ override def setField(row: MutableRow, ordinal: Int, value: UnsafeArrayData): Unit = {
row.update(ordinal, value)
}
- override def getField(row: InternalRow, ordinal: Int): ArrayData = {
- row.getArray(ordinal)
+ override def getField(row: InternalRow, ordinal: Int): UnsafeArrayData = {
+ row.getArray(ordinal).asInstanceOf[UnsafeArrayData]
}
- override def serialize(value: ArrayData): Array[Byte] = {
- val unsafeArray = if (value.isInstanceOf[UnsafeArrayData]) {
- value.asInstanceOf[UnsafeArrayData]
- } else {
- mutableRow(0) = value
- projection(mutableRow).getArray(0)
- }
- val outputBuffer =
- ByteBuffer.allocate(4 + unsafeArray.getSizeInBytes).order(ByteOrder.nativeOrder())
- outputBuffer.putInt(unsafeArray.numElements())
- val underlying = outputBuffer.array()
- unsafeArray.writeToMemory(underlying, Platform.BYTE_ARRAY_OFFSET + 4)
- underlying
+ override def actualSize(row: InternalRow, ordinal: Int): Int = {
+ val unsafeArray = getField(row, ordinal)
+ 4 + 4 + unsafeArray.getSizeInBytes
}
- override def deserialize(bytes: Array[Byte]): ArrayData = {
- val buffer = ByteBuffer.wrap(bytes).order(ByteOrder.nativeOrder())
+ override def append(value: UnsafeArrayData, buffer: ByteBuffer): Unit = {
+ buffer.putInt(value.numElements())
+ buffer.putInt(value.getSizeInBytes)
+ value.writeTo(buffer)
--- End diff --
We could consolidate the format for array/map later.
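For reference, the layout the new `append`/`extract` pair in the diff relies on is a size prefix followed by the raw bytes, with `extract` pointing at the buffer's backing array instead of copying. A minimal Java sketch of that idea follows. `LengthPrefixedRecords` and `Slice` are hypothetical names, plain byte arrays stand in for the unsafe row/array data, and the extra `numElements` prefix that the array case writes is omitted.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating a length-prefixed record layout in a heap
// ByteBuffer. It mirrors the shape of append/extract in the diff, not Spark's code.
final class LengthPrefixedRecords {

    // A view onto a region of a backing array; no bytes are copied.
    static final class Slice {
        final byte[] base;
        final int offset;
        final int length;

        Slice(byte[] base, int offset, int length) {
            this.base = base;
            this.offset = offset;
            this.length = length;
        }

        String asUtf8() {
            return new String(base, offset, length, StandardCharsets.UTF_8);
        }
    }

    // Writes a 4-byte size followed by the payload.
    static void append(byte[] payload, ByteBuffer buffer) {
        buffer.putInt(payload.length);
        buffer.put(payload);
    }

    // Reads the size, records where the payload starts in the backing array,
    // and advances the buffer past the payload without copying it.
    static Slice extract(ByteBuffer buffer) {
        int sizeInBytes = buffer.getInt();
        if (!buffer.hasArray()) {
            throw new IllegalStateException("expected a heap buffer with a backing array");
        }
        byte[] base = buffer.array();
        int start = buffer.arrayOffset() + buffer.position();
        buffer.position(buffer.position() + sizeInBytes);
        return new Slice(base, start, sizeInBytes);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        append("struct".getBytes(StandardCharsets.UTF_8), buffer);
        append("array".getBytes(StandardCharsets.UTF_8), buffer);
        buffer.flip(); // switch from writing to reading

        System.out.println(extract(buffer).asUtf8()); // struct
        System.out.println(extract(buffer).asUtf8()); // array
    }
}
```

Because the returned view shares the backing array, it stays valid only as long as the buffer contents are not overwritten. A consolidated format for array and map would presumably keep this size-prefix-plus-payload shape and differ only in which header fields precede the payload.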
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147546562
[Test build #43596 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43596/consoleFull) for PR 9016 at commit [`55a92ba`](https://github.com/apache/spark/commit/55a92ba9be5afd3a20a563fd819b2d99e0512114).
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147549002
@cloud-fan @liancheng I've updated this to use UnsafeReader. Do you have any more comments?
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146718027
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43440/
Test FAILed.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146756479
Merged build started.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147549448
LGTM pending Jenkins
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146905428
[Test build #43476 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43476/consoleFull) for PR 9016 at commit [`1716bcd`](https://github.com/apache/spark/commit/1716bcd8cddc28b5b2ab34c0dc8bb45bbbc31410).
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146715628
Merged build triggered.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41788877
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -34,7 +34,8 @@ import org.apache.spark.unsafe.types.UTF8String
*
* @tparam JvmType Underlying Java type to represent the elements.
*/
-private[sql] sealed abstract class ColumnType[JvmType] {
+private[sql]
+sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, Int, Long) JvmType] {
--- End diff --
Have we checked the compiled bytecode to confirm that this does help eliminate boxing?
@cloud-fan Regarding `Float` and `Double`: I had an offline discussion with @davies. They are left out because this specialization is meant to simplify `ColumnType.copyField`, which is only used in the RLE encoder, and the RLE encoder doesn't support `Float` or `Double` columns.
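As a rough analogy for what the specialization is meant to avoid, here is a hand-written Java sketch. With an erased type parameter, a generic `copyField` has to pass an `int` through a boxed `Integer`, while a primitive-typed variant does not. All class names below are hypothetical, rows are modeled as plain `int[]`, and this is not the code the Scala compiler actually generates.

```java
// Hand-written Java analogy; not Spark code and not scalac output.
final class SpecializationSketch {

    // Generic version: T is erased, so int values travel as java.lang.Integer.
    abstract static class GenericColumn<T> {
        abstract T getField(int[] row, int ordinal);

        abstract void setField(int[] row, int ordinal, T value);

        void copyField(int[] from, int fromOrdinal, int[] to, int toOrdinal) {
            // For T = Integer this boxes in getField and unboxes in setField.
            setField(to, toOrdinal, getField(from, fromOrdinal));
        }
    }

    static final class BoxedIntColumn extends GenericColumn<Integer> {
        @Override
        Integer getField(int[] row, int ordinal) {
            return row[ordinal]; // autoboxing: int -> Integer
        }

        @Override
        void setField(int[] row, int ordinal, Integer value) {
            row[ordinal] = value; // unboxing: Integer -> int
        }
    }

    // Primitive-typed version: roughly what a specialized variant buys you.
    static final class IntColumn {
        int getField(int[] row, int ordinal) {
            return row[ordinal];
        }

        void setField(int[] row, int ordinal, int value) {
            row[ordinal] = value;
        }

        void copyField(int[] from, int fromOrdinal, int[] to, int toOrdinal) {
            setField(to, toOrdinal, getField(from, fromOrdinal)); // no boxing
        }
    }

    public static void main(String[] args) {
        int[] source = {7, 8, 9};
        int[] viaBoxed = new int[3];
        int[] viaPrimitive = new int[3];

        new BoxedIntColumn().copyField(source, 1, viaBoxed, 0);
        new IntColumn().copyField(source, 1, viaPrimitive, 0);

        System.out.println(viaBoxed[0]);     // 8
        System.out.println(viaPrimitive[0]); // 8
    }
}
```

Both variants produce the same result; the difference is only in the intermediate allocation and conversion, which is why inspecting the compiled bytecode is the direct way to confirm whether the annotation has the intended effect.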
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146712971
[Test build #43426 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43426/console) for PR 9016 at commit [`06161e3`](https://github.com/apache/spark/commit/06161e374ce7f97055767aea24b8b3ec4edfb5cb).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146666193
Merged build started.
---
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41804973
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -449,124 +416,126 @@ private[sql] object LARGE_DECIMAL {
}
}
-private[sql] case class STRUCT(dataType: StructType)
- extends ByteArrayColumnType[InternalRow](20) {
+private[sql] case class STRUCT(dataType: StructType) extends ColumnType[UnsafeRow] {
- private val projection: UnsafeProjection =
- UnsafeProjection.create(dataType)
private val numOfFields: Int = dataType.fields.size
- override def setField(row: MutableRow, ordinal: Int, value: InternalRow): Unit = {
+ override def defaultSize: Int = 20
+
+ override def setField(row: MutableRow, ordinal: Int, value: UnsafeRow): Unit = {
row.update(ordinal, value)
}
- override def getField(row: InternalRow, ordinal: Int): InternalRow = {
- row.getStruct(ordinal, numOfFields)
+ override def getField(row: InternalRow, ordinal: Int): UnsafeRow = {
+ row.getStruct(ordinal, numOfFields).asInstanceOf[UnsafeRow]
}
- override def serialize(value: InternalRow): Array[Byte] = {
- val unsafeRow = if (value.isInstanceOf[UnsafeRow]) {
- value.asInstanceOf[UnsafeRow]
- } else {
- projection(value)
- }
- unsafeRow.getBytes
+ override def actualSize(row: InternalRow, ordinal: Int): Int = {
+ 4 + getField(row, ordinal).getSizeInBytes
+ }
+
+ override def append(value: UnsafeRow, buffer: ByteBuffer): Unit = {
+ buffer.putInt(value.getSizeInBytes)
+ value.writeTo(buffer)
}
- override def deserialize(bytes: Array[Byte]): InternalRow = {
+ override def extract(buffer: ByteBuffer): UnsafeRow = {
+ val sizeInBytes = buffer.getInt()
+ assert(buffer.hasArray)
+ val base = buffer.array()
+ val offset = buffer.arrayOffset()
+ val cursor = buffer.position()
+ buffer.position(cursor + sizeInBytes)
val unsafeRow = new UnsafeRow
- unsafeRow.pointTo(bytes, numOfFields, bytes.length)
+ unsafeRow.pointTo(base, Platform.BYTE_ARRAY_OFFSET + offset + cursor, numOfFields, sizeInBytes)
unsafeRow
}
- override def clone(v: InternalRow): InternalRow = v.copy()
+ override def clone(v: UnsafeRow): UnsafeRow = v.copy()
}
-private[sql] case class ARRAY(dataType: ArrayType)
- extends ByteArrayColumnType[ArrayData](16) {
+private[sql] case class ARRAY(dataType: ArrayType) extends ColumnType[UnsafeArrayData] {
- private lazy val projection = UnsafeProjection.create(Array[DataType](dataType))
- private val mutableRow = new GenericMutableRow(new Array[Any](1))
+ override def defaultSize: Int = 16
- override def setField(row: MutableRow, ordinal: Int, value: ArrayData): Unit = {
+ override def setField(row: MutableRow, ordinal: Int, value: UnsafeArrayData): Unit = {
row.update(ordinal, value)
}
- override def getField(row: InternalRow, ordinal: Int): ArrayData = {
- row.getArray(ordinal)
+ override def getField(row: InternalRow, ordinal: Int): UnsafeArrayData = {
+ row.getArray(ordinal).asInstanceOf[UnsafeArrayData]
}
- override def serialize(value: ArrayData): Array[Byte] = {
- val unsafeArray = if (value.isInstanceOf[UnsafeArrayData]) {
- value.asInstanceOf[UnsafeArrayData]
- } else {
- mutableRow(0) = value
- projection(mutableRow).getArray(0)
- }
- val outputBuffer =
- ByteBuffer.allocate(4 + unsafeArray.getSizeInBytes).order(ByteOrder.nativeOrder())
- outputBuffer.putInt(unsafeArray.numElements())
- val underlying = outputBuffer.array()
- unsafeArray.writeToMemory(underlying, Platform.BYTE_ARRAY_OFFSET + 4)
- underlying
+ override def actualSize(row: InternalRow, ordinal: Int): Int = {
+ val unsafeArray = getField(row, ordinal)
+ 4 + 4 + unsafeArray.getSizeInBytes
}
- override def deserialize(bytes: Array[Byte]): ArrayData = {
- val buffer = ByteBuffer.wrap(bytes).order(ByteOrder.nativeOrder())
+ override def append(value: UnsafeArrayData, buffer: ByteBuffer): Unit = {
+ buffer.putInt(value.numElements())
+ buffer.putInt(value.getSizeInBytes)
+ value.writeTo(buffer)
+ }
+
+ override def extract(buffer: ByteBuffer): UnsafeArrayData = {
val numElements = buffer.getInt
+ val sizeInBytes = buffer.getInt
+ assert(buffer.hasArray)
+ val base = buffer.array()
+ val offset = buffer.arrayOffset()
+ val cursor = buffer.position()
+ buffer.position(cursor + sizeInBytes)
val array = new UnsafeArrayData
- array.pointTo(bytes, Platform.BYTE_ARRAY_OFFSET + 4, numElements, bytes.length - 4)
+ array.pointTo(base, Platform.BYTE_ARRAY_OFFSET + offset + cursor, numElements, sizeInBytes)
array
}
- override def clone(v: ArrayData): ArrayData = v.copy()
+ override def clone(v: UnsafeArrayData): UnsafeArrayData = v.copy()
}
-private[sql] case class MAP(dataType: MapType) extends ByteArrayColumnType[MapData](32) {
+private[sql] case class MAP(dataType: MapType) extends ColumnType[UnsafeMapData] {
- private lazy val projection: UnsafeProjection = UnsafeProjection.create(Array[DataType](dataType))
- private val mutableRow = new GenericMutableRow(new Array[Any](1))
+ override def defaultSize: Int = 32
- override def setField(row: MutableRow, ordinal: Int, value: MapData): Unit = {
+ override def setField(row: MutableRow, ordinal: Int, value: UnsafeMapData): Unit = {
row.update(ordinal, value)
}
- override def getField(row: InternalRow, ordinal: Int): MapData = {
- row.getMap(ordinal)
+ override def getField(row: InternalRow, ordinal: Int): UnsafeMapData = {
+ row.getMap(ordinal).asInstanceOf[UnsafeMapData]
}
- override def serialize(value: MapData): Array[Byte] = {
- val unsafeMap = if (value.isInstanceOf[UnsafeMapData]) {
- value.asInstanceOf[UnsafeMapData]
- } else {
- mutableRow(0) = value
- projection(mutableRow).getMap(0)
- }
+ override def actualSize(row: InternalRow, ordinal: Int): Int = {
+ val unsafeMap = getField(row, ordinal)
+ 12 + unsafeMap.keyArray().getSizeInBytes + unsafeMap.valueArray().getSizeInBytes
+ }
- val outputBuffer =
- ByteBuffer.allocate(8 + unsafeMap.getSizeInBytes).order(ByteOrder.nativeOrder())
- outputBuffer.putInt(unsafeMap.numElements())
- val keyBytes = unsafeMap.keyArray().getSizeInBytes
- outputBuffer.putInt(keyBytes)
- val underlying = outputBuffer.array()
- unsafeMap.keyArray().writeToMemory(underlying, Platform.BYTE_ARRAY_OFFSET + 8)
- unsafeMap.valueArray().writeToMemory(underlying, Platform.BYTE_ARRAY_OFFSET + 8 + keyBytes)
- underlying
+ override def append(value: UnsafeMapData, buffer: ByteBuffer): Unit = {
+ buffer.putInt(value.numElements())
+ buffer.putInt(value.keyArray().getSizeInBytes)
+ buffer.putInt(value.valueArray().getSizeInBytes)
+ value.keyArray().writeTo(buffer)
+ value.valueArray().writeTo(buffer)
}
- override def deserialize(bytes: Array[Byte]): MapData = {
- val buffer = ByteBuffer.wrap(bytes).order(ByteOrder.nativeOrder())
+ override def extract(buffer: ByteBuffer): UnsafeMapData = {
val numElements = buffer.getInt
val keyArraySize = buffer.getInt
+ val valueArraySize = buffer.getInt
+ assert(buffer.hasArray)
+ val base = buffer.array()
+ val offset = buffer.arrayOffset()
+ val cursor = buffer.position()
val keyArray = new UnsafeArrayData
+ keyArray.pointTo(base, Platform.BYTE_ARRAY_OFFSET + offset + cursor, numElements, keyArraySize)
val valueArray = new UnsafeArrayData
- keyArray.pointTo(bytes, Platform.BYTE_ARRAY_OFFSET + 8, numElements, keyArraySize)
- valueArray.pointTo(bytes, Platform.BYTE_ARRAY_OFFSET + 8 + keyArraySize, numElements,
- bytes.length - 8 - keyArraySize)
+ valueArray.pointTo(base, Platform.BYTE_ARRAY_OFFSET + offset + cursor + keyArraySize,
+ numElements, valueArraySize)
+ buffer.position(cursor + keyArraySize + valueArraySize)
new UnsafeMapData(keyArray, valueArray)
--- End diff --
+1
These two parts are mostly duplicated.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41679715
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -449,124 +416,126 @@ private[sql] object LARGE_DECIMAL {
}
}
-private[sql] case class STRUCT(dataType: StructType)
- extends ByteArrayColumnType[InternalRow](20) {
+private[sql] case class STRUCT(dataType: StructType) extends ColumnType[UnsafeRow] {
- private val projection: UnsafeProjection =
- UnsafeProjection.create(dataType)
private val numOfFields: Int = dataType.fields.size
- override def setField(row: MutableRow, ordinal: Int, value: InternalRow): Unit = {
+ override def defaultSize: Int = 20
+
+ override def setField(row: MutableRow, ordinal: Int, value: UnsafeRow): Unit = {
row.update(ordinal, value)
}
- override def getField(row: InternalRow, ordinal: Int): InternalRow = {
- row.getStruct(ordinal, numOfFields)
+ override def getField(row: InternalRow, ordinal: Int): UnsafeRow = {
+ row.getStruct(ordinal, numOfFields).asInstanceOf[UnsafeRow]
}
- override def serialize(value: InternalRow): Array[Byte] = {
- val unsafeRow = if (value.isInstanceOf[UnsafeRow]) {
- value.asInstanceOf[UnsafeRow]
- } else {
- projection(value)
- }
- unsafeRow.getBytes
+ override def actualSize(row: InternalRow, ordinal: Int): Int = {
+ 4 + getField(row, ordinal).getSizeInBytes
+ }
+
+ override def append(value: UnsafeRow, buffer: ByteBuffer): Unit = {
+ buffer.putInt(value.getSizeInBytes)
+ value.writeTo(buffer)
}
- override def deserialize(bytes: Array[Byte]): InternalRow = {
+ override def extract(buffer: ByteBuffer): UnsafeRow = {
+ val sizeInBytes = buffer.getInt()
+ assert(buffer.hasArray)
+ val base = buffer.array()
+ val offset = buffer.arrayOffset()
+ val cursor = buffer.position()
+ buffer.position(cursor + sizeInBytes)
val unsafeRow = new UnsafeRow
- unsafeRow.pointTo(bytes, numOfFields, bytes.length)
+ unsafeRow.pointTo(base, Platform.BYTE_ARRAY_OFFSET + offset + cursor, numOfFields, sizeInBytes)
unsafeRow
}
- override def clone(v: InternalRow): InternalRow = v.copy()
+ override def clone(v: UnsafeRow): UnsafeRow = v.copy()
}
-private[sql] case class ARRAY(dataType: ArrayType)
- extends ByteArrayColumnType[ArrayData](16) {
+private[sql] case class ARRAY(dataType: ArrayType) extends ColumnType[UnsafeArrayData] {
- private lazy val projection = UnsafeProjection.create(Array[DataType](dataType))
- private val mutableRow = new GenericMutableRow(new Array[Any](1))
+ override def defaultSize: Int = 16
- override def setField(row: MutableRow, ordinal: Int, value: ArrayData): Unit = {
+ override def setField(row: MutableRow, ordinal: Int, value: UnsafeArrayData): Unit = {
row.update(ordinal, value)
}
- override def getField(row: InternalRow, ordinal: Int): ArrayData = {
- row.getArray(ordinal)
+ override def getField(row: InternalRow, ordinal: Int): UnsafeArrayData = {
+ row.getArray(ordinal).asInstanceOf[UnsafeArrayData]
}
- override def serialize(value: ArrayData): Array[Byte] = {
- val unsafeArray = if (value.isInstanceOf[UnsafeArrayData]) {
- value.asInstanceOf[UnsafeArrayData]
- } else {
- mutableRow(0) = value
- projection(mutableRow).getArray(0)
- }
- val outputBuffer =
- ByteBuffer.allocate(4 + unsafeArray.getSizeInBytes).order(ByteOrder.nativeOrder())
- outputBuffer.putInt(unsafeArray.numElements())
- val underlying = outputBuffer.array()
- unsafeArray.writeToMemory(underlying, Platform.BYTE_ARRAY_OFFSET + 4)
- underlying
+ override def actualSize(row: InternalRow, ordinal: Int): Int = {
+ val unsafeArray = getField(row, ordinal)
+ 4 + 4 + unsafeArray.getSizeInBytes
}
- override def deserialize(bytes: Array[Byte]): ArrayData = {
- val buffer = ByteBuffer.wrap(bytes).order(ByteOrder.nativeOrder())
+ override def append(value: UnsafeArrayData, buffer: ByteBuffer): Unit = {
+ buffer.putInt(value.numElements())
+ buffer.putInt(value.getSizeInBytes)
+ value.writeTo(buffer)
+ }
+
+ override def extract(buffer: ByteBuffer): UnsafeArrayData = {
val numElements = buffer.getInt
+ val sizeInBytes = buffer.getInt
+ assert(buffer.hasArray)
+ val base = buffer.array()
+ val offset = buffer.arrayOffset()
+ val cursor = buffer.position()
+ buffer.position(cursor + sizeInBytes)
val array = new UnsafeArrayData
- array.pointTo(bytes, Platform.BYTE_ARRAY_OFFSET + 4, numElements, bytes.length - 4)
+ array.pointTo(base, Platform.BYTE_ARRAY_OFFSET + offset + cursor, numElements, sizeInBytes)
array
}
- override def clone(v: ArrayData): ArrayData = v.copy()
+ override def clone(v: UnsafeArrayData): UnsafeArrayData = v.copy()
}
-private[sql] case class MAP(dataType: MapType) extends ByteArrayColumnType[MapData](32) {
+private[sql] case class MAP(dataType: MapType) extends ColumnType[UnsafeMapData] {
- private lazy val projection: UnsafeProjection = UnsafeProjection.create(Array[DataType](dataType))
- private val mutableRow = new GenericMutableRow(new Array[Any](1))
+ override def defaultSize: Int = 32
- override def setField(row: MutableRow, ordinal: Int, value: MapData): Unit = {
+ override def setField(row: MutableRow, ordinal: Int, value: UnsafeMapData): Unit = {
row.update(ordinal, value)
}
- override def getField(row: InternalRow, ordinal: Int): MapData = {
- row.getMap(ordinal)
+ override def getField(row: InternalRow, ordinal: Int): UnsafeMapData = {
+ row.getMap(ordinal).asInstanceOf[UnsafeMapData]
}
- override def serialize(value: MapData): Array[Byte] = {
- val unsafeMap = if (value.isInstanceOf[UnsafeMapData]) {
- value.asInstanceOf[UnsafeMapData]
- } else {
- mutableRow(0) = value
- projection(mutableRow).getMap(0)
- }
+ override def actualSize(row: InternalRow, ordinal: Int): Int = {
+ val unsafeMap = getField(row, ordinal)
+ 12 + unsafeMap.keyArray().getSizeInBytes + unsafeMap.valueArray().getSizeInBytes
+ }
- val outputBuffer =
- ByteBuffer.allocate(8 + unsafeMap.getSizeInBytes).order(ByteOrder.nativeOrder())
- outputBuffer.putInt(unsafeMap.numElements())
- val keyBytes = unsafeMap.keyArray().getSizeInBytes
- outputBuffer.putInt(keyBytes)
- val underlying = outputBuffer.array()
- unsafeMap.keyArray().writeToMemory(underlying, Platform.BYTE_ARRAY_OFFSET + 8)
- unsafeMap.valueArray().writeToMemory(underlying, Platform.BYTE_ARRAY_OFFSET + 8 + keyBytes)
- underlying
+ override def append(value: UnsafeMapData, buffer: ByteBuffer): Unit = {
+ buffer.putInt(value.numElements())
+ buffer.putInt(value.keyArray().getSizeInBytes)
+ buffer.putInt(value.valueArray().getSizeInBytes)
+ value.keyArray().writeTo(buffer)
+ value.valueArray().writeTo(buffer)
}
- override def deserialize(bytes: Array[Byte]): MapData = {
- val buffer = ByteBuffer.wrap(bytes).order(ByteOrder.nativeOrder())
+ override def extract(buffer: ByteBuffer): UnsafeMapData = {
val numElements = buffer.getInt
val keyArraySize = buffer.getInt
+ val valueArraySize = buffer.getInt
+ assert(buffer.hasArray)
+ val base = buffer.array()
+ val offset = buffer.arrayOffset()
+ val cursor = buffer.position()
val keyArray = new UnsafeArrayData
+ keyArray.pointTo(base, Platform.BYTE_ARRAY_OFFSET + offset + cursor, numElements, keyArraySize)
val valueArray = new UnsafeArrayData
- keyArray.pointTo(bytes, Platform.BYTE_ARRAY_OFFSET + 8, numElements, keyArraySize)
- valueArray.pointTo(bytes, Platform.BYTE_ARRAY_OFFSET + 8 + keyArraySize, numElements,
- bytes.length - 8 - keyArraySize)
+ valueArray.pointTo(base, Platform.BYTE_ARRAY_OFFSET + offset + cursor + keyArraySize,
+ numElements, valueArraySize)
+ buffer.position(cursor + keyArraySize + valueArraySize)
new UnsafeMapData(keyArray, valueArray)
--- End diff --
I think we can use `UnsafeReaders.readMap` here.
The same applies to the array part.
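A rough sketch of the kind of sharing suggested here, written in plain Scala over `java.nio.ByteBuffer`. It does not use `UnsafeReaders` or any Spark class: `MapLayoutSketch`, `Region`, `readRegion` and `extractMap` are hypothetical names, and the header layout (numElements, keyArraySize, valueArraySize) is simply the one written by `append` in the diff above.

```scala
import java.nio.ByteBuffer

object MapLayoutSketch {
  /** A region of the buffer's backing array; nothing is copied. */
  final case class Region(base: Array[Byte], start: Int, sizeInBytes: Int)

  /** Shared helper: describes the next `sizeInBytes` bytes and advances the buffer past them. */
  def readRegion(buffer: ByteBuffer, sizeInBytes: Int): Region = {
    require(buffer.hasArray, "needs a heap buffer with an accessible backing array")
    val cursor = buffer.position()
    buffer.position(cursor + sizeInBytes)
    Region(buffer.array(), buffer.arrayOffset() + cursor, sizeInBytes)
  }

  /** Reads the three-int header, then the key region and the value region that follow it. */
  def extractMap(buffer: ByteBuffer): (Int, Region, Region) = {
    val numElements = buffer.getInt()
    val keyArraySize = buffer.getInt()
    val valueArraySize = buffer.getInt()
    val keys = readRegion(buffer, keyArraySize)
    val values = readRegion(buffer, valueArraySize)
    (numElements, keys, values)
  }
}
```

With one helper doing the cursor arithmetic, the array case would reduce to a header read plus a single `readRegion` call, which is roughly the deduplication being asked for.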
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146683593
[Test build #43426 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43426/consoleFull) for PR 9016 at commit [`06161e3`](https://github.com/apache/spark/commit/06161e374ce7f97055767aea24b8b3ec4edfb5cb).
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146715641
Merged build started.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146951528
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43476/
Test PASSed.
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41570679
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -324,15 +296,18 @@ private[sql] object STRING extends NativeColumnType(StringType, 8) {
}
override def append(v: UTF8String, buffer: ByteBuffer): Unit = {
- val stringBytes = v.getBytes
- buffer.putInt(stringBytes.length).put(stringBytes, 0, stringBytes.length)
+ buffer.putInt(v.numBytes())
+ v.writeTo(buffer)
}
override def extract(buffer: ByteBuffer): UTF8String = {
val length = buffer.getInt()
- val stringBytes = new Array[Byte](length)
- buffer.get(stringBytes, 0, length)
- UTF8String.fromBytes(stringBytes)
+ assert(buffer.hasArray)
--- End diff --
ByteBuffer is already used for getInt and getLong, so we could do the same here without changing the interface.
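To illustrate the pattern in this hunk, here is a minimal sketch, assuming a heap-backed buffer: the length is still read through the `ByteBuffer`, while the payload is addressed directly in the backing array instead of being copied out. `LengthPrefixed` and `Slice` are hypothetical names, not Spark classes, and `Slice` only stands in for a value that points into shared memory.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object LengthPrefixed {
  /** A (base, start, length) view into a byte array; no bytes are copied. */
  final case class Slice(base: Array[Byte], start: Int, length: Int) {
    def utf8: String = new String(base, start, length, StandardCharsets.UTF_8)
  }

  /** Writes a 4-byte length followed by the payload. */
  def append(bytes: Array[Byte], buffer: ByteBuffer): Unit = {
    buffer.putInt(bytes.length)
    buffer.put(bytes)
  }

  /** Reads the length through the ByteBuffer, then points at the payload in the backing array. */
  def extract(buffer: ByteBuffer): Slice = {
    val length = buffer.getInt()
    require(buffer.hasArray, "zero-copy extraction needs a heap buffer")
    val start = buffer.arrayOffset() + buffer.position()
    // Skip the payload so the next read starts at the following value.
    buffer.position(buffer.position() + length)
    Slice(buffer.array(), start, length)
  }
}
```

The caller-facing signatures stay `append(value, buffer)` and `extract(buffer)`. The cost is that the returned value aliases the buffer's array, so it is only valid while that array is left unmodified.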
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41679710
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -449,124 +416,126 @@ private[sql] object LARGE_DECIMAL {
}
}
-private[sql] case class STRUCT(dataType: StructType)
- extends ByteArrayColumnType[InternalRow](20) {
+private[sql] case class STRUCT(dataType: StructType) extends ColumnType[UnsafeRow] {
- private val projection: UnsafeProjection =
- UnsafeProjection.create(dataType)
private val numOfFields: Int = dataType.fields.size
- override def setField(row: MutableRow, ordinal: Int, value: InternalRow): Unit = {
+ override def defaultSize: Int = 20
+
+ override def setField(row: MutableRow, ordinal: Int, value: UnsafeRow): Unit = {
row.update(ordinal, value)
}
- override def getField(row: InternalRow, ordinal: Int): InternalRow = {
- row.getStruct(ordinal, numOfFields)
+ override def getField(row: InternalRow, ordinal: Int): UnsafeRow = {
+ row.getStruct(ordinal, numOfFields).asInstanceOf[UnsafeRow]
}
- override def serialize(value: InternalRow): Array[Byte] = {
- val unsafeRow = if (value.isInstanceOf[UnsafeRow]) {
- value.asInstanceOf[UnsafeRow]
- } else {
- projection(value)
- }
- unsafeRow.getBytes
+ override def actualSize(row: InternalRow, ordinal: Int): Int = {
+ 4 + getField(row, ordinal).getSizeInBytes
+ }
+
+ override def append(value: UnsafeRow, buffer: ByteBuffer): Unit = {
+ buffer.putInt(value.getSizeInBytes)
+ value.writeTo(buffer)
}
- override def deserialize(bytes: Array[Byte]): InternalRow = {
+ override def extract(buffer: ByteBuffer): UnsafeRow = {
+ val sizeInBytes = buffer.getInt()
+ assert(buffer.hasArray)
+ val base = buffer.array()
+ val offset = buffer.arrayOffset()
+ val cursor = buffer.position()
+ buffer.position(cursor + sizeInBytes)
val unsafeRow = new UnsafeRow
- unsafeRow.pointTo(bytes, numOfFields, bytes.length)
+ unsafeRow.pointTo(base, Platform.BYTE_ARRAY_OFFSET + offset + cursor, numOfFields, sizeInBytes)
unsafeRow
}
- override def clone(v: InternalRow): InternalRow = v.copy()
+ override def clone(v: UnsafeRow): UnsafeRow = v.copy()
}
-private[sql] case class ARRAY(dataType: ArrayType)
- extends ByteArrayColumnType[ArrayData](16) {
+private[sql] case class ARRAY(dataType: ArrayType) extends ColumnType[UnsafeArrayData] {
- private lazy val projection = UnsafeProjection.create(Array[DataType](dataType))
- private val mutableRow = new GenericMutableRow(new Array[Any](1))
+ override def defaultSize: Int = 16
- override def setField(row: MutableRow, ordinal: Int, value: ArrayData): Unit = {
+ override def setField(row: MutableRow, ordinal: Int, value: UnsafeArrayData): Unit = {
row.update(ordinal, value)
}
- override def getField(row: InternalRow, ordinal: Int): ArrayData = {
- row.getArray(ordinal)
+ override def getField(row: InternalRow, ordinal: Int): UnsafeArrayData = {
+ row.getArray(ordinal).asInstanceOf[UnsafeArrayData]
}
- override def serialize(value: ArrayData): Array[Byte] = {
- val unsafeArray = if (value.isInstanceOf[UnsafeArrayData]) {
- value.asInstanceOf[UnsafeArrayData]
- } else {
- mutableRow(0) = value
- projection(mutableRow).getArray(0)
- }
- val outputBuffer =
- ByteBuffer.allocate(4 + unsafeArray.getSizeInBytes).order(ByteOrder.nativeOrder())
- outputBuffer.putInt(unsafeArray.numElements())
- val underlying = outputBuffer.array()
- unsafeArray.writeToMemory(underlying, Platform.BYTE_ARRAY_OFFSET + 4)
- underlying
+ override def actualSize(row: InternalRow, ordinal: Int): Int = {
+ val unsafeArray = getField(row, ordinal)
+ 4 + 4 + unsafeArray.getSizeInBytes
}
- override def deserialize(bytes: Array[Byte]): ArrayData = {
- val buffer = ByteBuffer.wrap(bytes).order(ByteOrder.nativeOrder())
+ override def append(value: UnsafeArrayData, buffer: ByteBuffer): Unit = {
+ buffer.putInt(value.numElements())
+ buffer.putInt(value.getSizeInBytes)
+ value.writeTo(buffer)
--- End diff --
I think we should write sizeInBytes first, then numElements, then the bytes.
numElements followed by the bytes is more or less the standard way of writing an unsafe array, so we should follow the description in the unsafe array doc.
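A small sketch of the ordering proposed above, using a plain `Array[Int]` payload rather than `UnsafeArrayData`. `SizeFirstLayout` and its methods are hypothetical, and it is an assumption of this sketch that sizeInBytes counts only the payload, not the numElements field.

```scala
import java.nio.ByteBuffer

object SizeFirstLayout {
  /** Writes [sizeInBytes][numElements][payload]; sizeInBytes covers the payload only. */
  def append(values: Array[Int], buffer: ByteBuffer): Unit = {
    buffer.putInt(values.length * 4)
    buffer.putInt(values.length)
    values.foreach(v => buffer.putInt(v))
  }

  /** Skips one value using only the leading size field. */
  def skip(buffer: ByteBuffer): Unit = {
    val sizeInBytes = buffer.getInt()
    // 4 more bytes for numElements, then the payload itself.
    buffer.position(buffer.position() + 4 + sizeInBytes)
  }

  /** Reads one value and checks that the payload length matches the size field. */
  def extract(buffer: ByteBuffer): Array[Int] = {
    val sizeInBytes = buffer.getInt()
    val numElements = buffer.getInt()
    val end = buffer.position() + sizeInBytes
    val out = Array.fill(numElements)(buffer.getInt())
    require(buffer.position() == end, "payload size does not match sizeInBytes")
    out
  }
}
```

With the size first, everything after it (numElements plus the bytes) matches the usual array encoding, and a reader can step over a value without interpreting it.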
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146714405
Merged build triggered.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146756987
[Test build #43459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43459/consoleFull) for PR 9016 at commit [`6e050a7`](https://github.com/apache/spark/commit/6e050a7a0f9519e014dfd87342306b49a3fcc384).
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146716558
Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146726579
Build triggered.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146780055
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43459/
Test FAILed.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147575379
Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147579896
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43597/
Test PASSed.
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146669206
LGTM
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147550833
[Test build #43597 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43597/consoleFull) for PR 9016 at commit [`615d9a3`](https://github.com/apache/spark/commit/615d9a320c04d4ece116da8e652bea82c8af65a2).
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146732510
Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41808695
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -34,7 +34,8 @@ import org.apache.spark.unsafe.types.UTF8String
*
* @tparam JvmType Underlying Java type to represent the elements.
*/
-private[sql] sealed abstract class ColumnType[JvmType] {
+private[sql]
+sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, Int, Long) JvmType] {
--- End diff --
Thanks, I will roll back these changes.
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146383852
[Test build #43362 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43362/console) for PR 9016 at commit [`73eefa2`](https://github.com/apache/spark/commit/73eefa2643b70d68a07ce1473d190d9ba996e18a).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146737729
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43441/
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146737728
Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146904577
Merged build triggered.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41808778
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -34,7 +34,8 @@ import org.apache.spark.unsafe.types.UTF8String
*
* @tparam JvmType Underlying Java type to represent the elements.
*/
-private[sql] sealed abstract class ColumnType[JvmType] {
+private[sql]
+sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, Int, Long) JvmType] {
--- End diff --
I checked the bytecode with `javap`, and it turns out that `ColumnType.copyField` always calls the boxed versions of `setField` and `getField`. I think we should revert this part of the change, since it's somewhat tricky to get optimal bytecode with `@specialized`.
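For readers unfamiliar with the boxing issue, here is a rough Java analogue (Java generics are always erased, so a method written against the type parameter always goes through the boxed accessors). `Column` and `IntColumn` are hypothetical, simplified stand-ins and not Spark's `ColumnType`; the hand-written primitive override is just one common workaround, not necessarily what this PR ends up doing.

```java
// Hypothetical, simplified stand-ins for illustration only; not Spark classes.
abstract class Column<T> {
    abstract T getField(int[] row, int ordinal);

    abstract void setField(int[] row, int ordinal, T value);

    // Written against T: after erasure the value travels as an Object,
    // so for Column<Integer> this path boxes on read and unboxes on write.
    void copyField(int[] from, int fromOrdinal, int[] to, int toOrdinal) {
        setField(to, toOrdinal, getField(from, fromOrdinal));
    }
}

final class IntColumn extends Column<Integer> {
    @Override
    Integer getField(int[] row, int ordinal) {
        return row[ordinal]; // autoboxes int -> Integer
    }

    @Override
    void setField(int[] row, int ordinal, Integer value) {
        row[ordinal] = value; // unboxes Integer -> int
    }

    // Hand-written primitive path: copies the int directly, no boxing.
    @Override
    void copyField(int[] from, int fromOrdinal, int[] to, int toOrdinal) {
        to[toOrdinal] = from[fromOrdinal];
    }
}
```

Disassembling the compiled classes with `javap -c` should show the `Integer.valueOf` / `intValue` calls in the generic accessors and none in the overridden `copyField`.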
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146743581
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43448/
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146371664
Merged build started.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-147575290
[Test build #43596 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43596/console) for PR 9016 at commit [`55a92ba`](https://github.com/apache/spark/commit/55a92ba9be5afd3a20a563fd819b2d99e0512114).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146669508
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43419/
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146383991
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43362/
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146714428
Merged build started.
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146716893
Merged build triggered.
[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146904608
Merged build started.
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9016#issuecomment-146682889
Merged build triggered.
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41569273
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -324,15 +296,18 @@ private[sql] object STRING extends NativeColumnType(StringType, 8) {
}
override def append(v: UTF8String, buffer: ByteBuffer): Unit = {
- val stringBytes = v.getBytes
- buffer.putInt(stringBytes.length).put(stringBytes, 0, stringBytes.length)
+ buffer.putInt(v.numBytes())
+ v.writeTo(buffer)
}
override def extract(buffer: ByteBuffer): UTF8String = {
val length = buffer.getInt()
- val stringBytes = new Array[Byte](length)
- buffer.get(stringBytes, 0, length)
- UTF8String.fromBytes(stringBytes)
+ assert(buffer.hasArray)
--- End diff --
How about we just use a byte array and an int cursor to keep data in the columnar cache, instead of a `ByteBuffer`?
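A minimal sketch of what that might look like, written in Java and assuming a 4-byte big-endian length prefix in front of each string. `ByteCursor` and all of its methods are hypothetical names for illustration, not existing Spark APIs, and it uses `String` rather than `UTF8String` to stay self-contained.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Spark: a plain byte array plus an int cursor.
final class ByteCursor {
    final byte[] data;
    int pos = 0;

    ByteCursor(int capacity) {
        data = new byte[capacity];
    }

    // Append a length-prefixed UTF-8 string at the cursor.
    void appendString(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        putInt(bytes.length);
        System.arraycopy(bytes, 0, data, pos, bytes.length);
        pos += bytes.length;
    }

    // Read a length-prefixed UTF-8 string at the cursor and advance past it.
    String extractString() {
        int length = getInt();
        String s = new String(data, pos, length, StandardCharsets.UTF_8);
        pos += length;
        return s;
    }

    // Write a 4-byte big-endian int.
    void putInt(int v) {
        data[pos++] = (byte) (v >>> 24);
        data[pos++] = (byte) (v >>> 16);
        data[pos++] = (byte) (v >>> 8);
        data[pos++] = (byte) v;
    }

    // Read a 4-byte big-endian int.
    int getInt() {
        int v = ((data[pos] & 0xFF) << 24)
              | ((data[pos + 1] & 0xFF) << 16)
              | ((data[pos + 2] & 0xFF) << 8)
              | (data[pos + 3] & 0xFF);
        pos += 4;
        return v;
    }

    // Move the cursor back to the start for reading.
    void rewind() {
        pos = 0;
    }
}
```

With the array always available, there would be no need for a `buffer.hasArray` check before reading the bytes in place.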
[GitHub] spark pull request: [SPARK-10990] [SQL] improve unrolling of compl...
Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/9016#discussion_r41554791
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeMapData.java ---
@@ -65,6 +65,22 @@ public UnsafeArrayData valueArray() {
}
@Override
+ public int hashCode() {
+ int h = numElements;
+ return (h * 31 + keys.hashCode()) * 31 + values.hashCode();
+ }
+
+ @Override
+ public boolean equals(Object obj) {
+ if (obj instanceof UnsafeMapData) {
+ UnsafeMapData map = (UnsafeMapData) obj;
+ return numElements == map.numElements && keys.equals(map.keyArray())
+ && values.equals(map.valueArray());
--- End diff --
`MapData` is not like `Map`: it is already ordered, so it cannot support that.
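To illustrate, assuming the request was for an order-insensitive comparison: when keys and values are stored as two parallel arrays, equality is positional, so the same entries in a different order compare as unequal. `OrderedMapData` below is a hypothetical stand-in using `int` arrays, not Spark's `UnsafeMapData`.

```java
import java.util.Arrays;

// Hypothetical stand-in for a map stored as two parallel, ordered arrays.
final class OrderedMapData {
    final int[] keys;
    final int[] values;

    OrderedMapData(int[] keys, int[] values) {
        if (keys.length != values.length) {
            throw new IllegalArgumentException("keys and values must have the same length");
        }
        this.keys = keys;
        this.values = values;
    }

    int numElements() {
        return keys.length;
    }

    // Same shape as the hashCode in the diff: element count, then keys, then values.
    @Override
    public int hashCode() {
        int h = numElements();
        return (h * 31 + Arrays.hashCode(keys)) * 31 + Arrays.hashCode(values);
    }

    // Positional comparison: entries must match index by index.
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof OrderedMapData)) {
            return false;
        }
        OrderedMapData other = (OrderedMapData) obj;
        return Arrays.equals(keys, other.keys) && Arrays.equals(values, other.values);
    }
}
```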