You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2017/12/13 16:41:06 UTC

[GitHub] spark pull request #19970: [SPARK-22775][SQL] move dictionary related APIs f...

GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/19970

    [SPARK-22775][SQL] move dictionary related APIs from ColumnVector to WritableColumnVector

    ## What changes were proposed in this pull request?
    
    These dictionary related APIs are special to `WritableColumnVector` and should not be in `ColumnVector`, which will be public soon.
    
    ## How was this patch tested?
    
    existing tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark final

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19970.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19970
    
----
commit 103dca351952c0017b013bd5ea005a2fc0e2cb50
Author: Wenchen Fan <we...@databricks.com>
Date:   2017-12-13T16:37:40Z

    move dictionary related APIs from ColumnVector to WritableColumnVector

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    **[Test build #84891 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84891/testReport)** for PR 19970 at commit [`c38f58e`](https://github.com/apache/spark/commit/c38f58e80ef5e8c4c3b732695ea94681f22d85b4).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    **[Test build #84903 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84903/testReport)** for PR 19970 at commit [`1de8de9`](https://github.com/apache/spark/commit/1de8de9cf01a53457047cb9bdc09b2ba655fd816).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    **[Test build #84871 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84871/testReport)** for PR 19970 at commit [`103dca3`](https://github.com/apache/spark/commit/103dca351952c0017b013bd5ea005a2fc0e2cb50).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19970: [SPARK-22775][SQL] move dictionary related APIs f...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19970#discussion_r156843445
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java ---
    @@ -145,39 +135,39 @@
       public abstract double[] getDoubles(int rowId, int count);
     
       /**
    -   * Returns the length of the array at rowid.
    +   * Returns the length of the array for rowId.
        */
       public abstract int getArrayLength(int rowId);
     
       /**
    -   * Returns the offset of the array at rowid.
    +   * Returns the offset of the array for rowId.
        */
       public abstract int getArrayOffset(int rowId);
     
       /**
    -   * Returns a utility object to get structs.
    +   * Returns the struct for rowId.
        */
       public final ColumnarRow getStruct(int rowId) {
         return new ColumnarRow(this, rowId);
       }
     
       /**
    -   * Returns a utility object to get structs.
    -   * provided to keep API compatibility with InternalRow for code generation
    +   * A special version of {@link #getShort(int)}, which is only used as an adapter for Spark codegen
    --- End diff --
    
    `getStruct(int)` instead of `getShort(int)`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19970: [SPARK-22775][SQL] move dictionary related APIs f...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19970


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    **[Test build #84881 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84881/testReport)** for PR 19970 at commit [`103dca3`](https://github.com/apache/spark/commit/103dca351952c0017b013bd5ea005a2fc0e2cb50).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    **[Test build #84903 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84903/testReport)** for PR 19970 at commit [`1de8de9`](https://github.com/apache/spark/commit/1de8de9cf01a53457047cb9bdc09b2ba655fd816).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    **[Test build #84871 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84871/testReport)** for PR 19970 at commit [`103dca3`](https://github.com/apache/spark/commit/103dca351952c0017b013bd5ea005a2fc0e2cb50).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19970: [SPARK-22775][SQL] move dictionary related APIs f...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19970#discussion_r156874906
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java ---
    @@ -105,6 +107,57 @@ private void throwUnsupportedException(int requiredCapacity, Throwable cause) {
       @Override
       public boolean anyNullsSet() { return anyNullsSet; }
     
    +  /**
    +   * Returns the dictionary Id for rowId.
    +   * This should only be called when the ColumnVector is dictionaryIds.
    --- End diff --
    
    `ColumnVector` -> `WritableColumnVector`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    thanks, merging to master!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    retest this pease


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84891/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    cc @kiszk @ueshin


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19970: [SPARK-22775][SQL] move dictionary related APIs f...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19970#discussion_r156874968
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java ---
    @@ -105,6 +107,57 @@ private void throwUnsupportedException(int requiredCapacity, Throwable cause) {
       @Override
       public boolean anyNullsSet() { return anyNullsSet; }
     
    +  /**
    +   * Returns the dictionary Id for rowId.
    +   * This should only be called when the ColumnVector is dictionaryIds.
    --- End diff --
    
    `dictionaryIds has WritableColumnVector`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    LGTM except a few comments for wording


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84903/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    **[Test build #84881 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84881/testReport)** for PR 19970 at commit [`103dca3`](https://github.com/apache/spark/commit/103dca351952c0017b013bd5ea005a2fc0e2cb50).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84871/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84881/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19970: [SPARK-22775][SQL] move dictionary related APIs from Col...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19970
  
    **[Test build #84891 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84891/testReport)** for PR 19970 at commit [`c38f58e`](https://github.com/apache/spark/commit/c38f58e80ef5e8c4c3b732695ea94681f22d85b4).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org