You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2017/12/13 16:41:06 UTC
[GitHub] spark pull request #19970: [SPARK-22775][SQL] move dictionary related APIs f...
GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/19970
[SPARK-22775][SQL] move dictionary related APIs from ColumnVector to WritableColumnVector
## What changes were proposed in this pull request?
These dictionary related APIs are special to `WritableColumnVector` and should not be in `ColumnVector`, which will be public soon.
## How was this patch tested?
existing tests
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloud-fan/spark final
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19970.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19970
----
commit 103dca351952c0017b013bd5ea005a2fc0e2cb50
Author: Wenchen Fan <we...@databricks.com>
Date: 2017-12-13T16:37:40Z
move dictionary related APIs from ColumnVector to WritableColumnVector
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19970
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19970
**[Test build #84891 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84891/testReport)** for PR 19970 at commit [`c38f58e`](https://github.com/apache/spark/commit/c38f58e80ef5e8c4c3b732695ea94681f22d85b4).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19970
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19970
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19970
**[Test build #84903 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84903/testReport)** for PR 19970 at commit [`1de8de9`](https://github.com/apache/spark/commit/1de8de9cf01a53457047cb9bdc09b2ba655fd816).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19970
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19970
**[Test build #84871 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84871/testReport)** for PR 19970 at commit [`103dca3`](https://github.com/apache/spark/commit/103dca351952c0017b013bd5ea005a2fc0e2cb50).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19970: [SPARK-22775][SQL] move dictionary related APIs f...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19970#discussion_r156843445
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java ---
@@ -145,39 +135,39 @@
public abstract double[] getDoubles(int rowId, int count);
/**
- * Returns the length of the array at rowid.
+ * Returns the length of the array for rowId.
*/
public abstract int getArrayLength(int rowId);
/**
- * Returns the offset of the array at rowid.
+ * Returns the offset of the array for rowId.
*/
public abstract int getArrayOffset(int rowId);
/**
- * Returns a utility object to get structs.
+ * Returns the struct for rowId.
*/
public final ColumnarRow getStruct(int rowId) {
return new ColumnarRow(this, rowId);
}
/**
- * Returns a utility object to get structs.
- * provided to keep API compatibility with InternalRow for code generation
+ * A special version of {@link #getShort(int)}, which is only used as an adapter for Spark codegen
--- End diff --
`getStruct(int)` instead of `getShort(int)`?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19970: [SPARK-22775][SQL] move dictionary related APIs f...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/19970
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19970
**[Test build #84881 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84881/testReport)** for PR 19970 at commit [`103dca3`](https://github.com/apache/spark/commit/103dca351952c0017b013bd5ea005a2fc0e2cb50).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19970
**[Test build #84903 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84903/testReport)** for PR 19970 at commit [`1de8de9`](https://github.com/apache/spark/commit/1de8de9cf01a53457047cb9bdc09b2ba655fd816).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19970
**[Test build #84871 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84871/testReport)** for PR 19970 at commit [`103dca3`](https://github.com/apache/spark/commit/103dca351952c0017b013bd5ea005a2fc0e2cb50).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19970: [SPARK-22775][SQL] move dictionary related APIs f...
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:
https://github.com/apache/spark/pull/19970#discussion_r156874906
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java ---
@@ -105,6 +107,57 @@ private void throwUnsupportedException(int requiredCapacity, Throwable cause) {
@Override
public boolean anyNullsSet() { return anyNullsSet; }
+ /**
+ * Returns the dictionary Id for rowId.
+ * This should only be called when the ColumnVector is dictionaryIds.
--- End diff --
`ColumnVector` -> `WritableColumnVector`
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19970
thanks, merging to master!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19970
retest this pease
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19970
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84891/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19970
cc @kiszk @ueshin
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19970: [SPARK-22775][SQL] move dictionary related APIs f...
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:
https://github.com/apache/spark/pull/19970#discussion_r156874968
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java ---
@@ -105,6 +107,57 @@ private void throwUnsupportedException(int requiredCapacity, Throwable cause) {
@Override
public boolean anyNullsSet() { return anyNullsSet; }
+ /**
+ * Returns the dictionary Id for rowId.
+ * This should only be called when the ColumnVector is dictionaryIds.
--- End diff --
`dictionaryIds has WritableColumnVector`?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19970
LGTM except a few comments for wording
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19970
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84903/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19970
**[Test build #84881 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84881/testReport)** for PR 19970 at commit [`103dca3`](https://github.com/apache/spark/commit/103dca351952c0017b013bd5ea005a2fc0e2cb50).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19970
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19970
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84871/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19970
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84881/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19970
**[Test build #84891 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84891/testReport)** for PR 19970 at commit [`c38f58e`](https://github.com/apache/spark/commit/c38f58e80ef5e8c4c3b732695ea94681f22d85b4).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org