You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by kiszk <gi...@git.apache.org> on 2018/08/08 18:49:24 UTC

[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...

GitHub user kiszk opened a pull request:

    https://github.com/apache/spark/pull/22044

    [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

    ## What changes were proposed in this pull request?
    
    This PR simplified code generation for `ArrayDistinct`. #21966 enabled code generation only if the type can be specialized by the hash set. This PR follows this strategy.
    
    Optimization of null handling will be implemented in #21912.
    
    ## How was this patch tested?
    
    Existing UTs

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kiszk/spark SPARK-23912-follow

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22044.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22044
    
----
commit 9d0cb100a85dc77569e1a62b83e0461ee2c33ddb
Author: Kazuaki Ishizaki <is...@...>
Date:   2018-08-08T18:43:51Z

    stop code generation when elementTypeSupportEquals is false

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94461/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94487/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94475/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22044#discussion_r208855525
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -3410,6 +3410,28 @@ case class ArrayDistinct(child: Expression)
         case _ => false
       }
     
    +  @transient protected lazy val canUseSpecializedHashSet = elementType match {
    --- End diff --
    
    Can we extract those common methods?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22044


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94475 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94475/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Thanks! merging to master.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1995/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1994/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94450 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94450/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94450 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94450/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94450/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    LGTM pending Jenkins.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1981/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94488/testReport)** for PR 22044 at commit [`ec262ab`](https://github.com/apache/spark/commit/ec262abf69fccdc8d922b6643a0fc8605b5d3294).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94475 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94475/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94488/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94471 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94471/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1972/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94461/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94487/testReport)** for PR 22044 at commit [`89d0664`](https://github.com/apache/spark/commit/89d0664684875deeb8da29c4d45a7076a5e07bb4).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94471/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1984/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94488 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94488/testReport)** for PR 22044 at commit [`ec262ab`](https://github.com/apache/spark/commit/ec262abf69fccdc8d922b6643a0fc8605b5d3294).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94471 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94471/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22044#discussion_r208854615
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -3517,56 +3510,24 @@ case class ArrayDistinct(child: Expression)
         """.stripMargin
       }
     
    -  private def setNotNullValue(isPrimitive: Boolean,
    +  private def setNotNullValue(
    --- End diff --
    
    We can inline this?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22044#discussion_r208856510
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -3517,56 +3510,24 @@ case class ArrayDistinct(child: Expression)
         """.stripMargin
       }
     
    -  private def setNotNullValue(isPrimitive: Boolean,
    +  private def setNotNullValue(
           distinctArray: String,
           pos: String,
           getValue1: String,
           primitiveValueTypeName: String): String = {
    -    if (!isPrimitive) {
    -      s"$distinctArray[$pos] = $getValue1";
    -    } else {
    -      s"$distinctArray.set$primitiveValueTypeName($pos, $getValue1)";
    -    }
    +    s"$distinctArray.set$primitiveValueTypeName($pos, $getValue1)"
       }
     
       private def setValueForFastEval(
    --- End diff --
    
    Maybe we should rename this.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22044#discussion_r208873100
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -3442,17 +3464,15 @@ case class ArrayDistinct(child: Expression)
       }
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    -    nullSafeCodeGen(ctx, ev, (array) => {
    -      val i = ctx.freshName("i")
    -      val j = ctx.freshName("j")
    -      val sizeOfDistinctArray = ctx.freshName("sizeOfDistinctArray")
    -      val getValue1 = CodeGenerator.getValue(array, elementType, i)
    -      val getValue2 = CodeGenerator.getValue(array, elementType, j)
    -      val foundNullElement = ctx.freshName("foundNullElement")
    -      val openHashSet = classOf[OpenHashSet[_]].getName
    -      val hs = ctx.freshName("hs")
    -      val classTag = s"scala.reflect.ClassTag$$.MODULE$$.Object()"
    -      if (elementTypeSupportEquals) {
    +    if (canUseSpecializedHashSet) {
    +      nullSafeCodeGen(ctx, ev, (array) => {
    +        val i = ctx.freshName("i")
    +        val sizeOfDistinctArray = ctx.freshName("sizeOfDistinctArray")
    +        val foundNullElement = ctx.freshName("foundNullElement")
    +        val openHashSet = classOf[OpenHashSet[_]].getName
    +        val hs = ctx.freshName("hs")
    +        val classTag = s"scala.reflect.ClassTag$$.MODULE$$.$hsTypeName()"
    +        val getValue1 = CodeGenerator.getValue(array, elementType, i)
    --- End diff --
    
    nit: `val getValue`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22044#discussion_r208862917
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -3410,6 +3410,28 @@ case class ArrayDistinct(child: Expression)
         case _ => false
       }
     
    +  @transient protected lazy val canUseSpecializedHashSet = elementType match {
    --- End diff --
    
    We can do so. To minimize the changes due to remaining time for cutting, I would like to do this in another PR #21912.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94487/testReport)** for PR 22044 at commit [`89d0664`](https://github.com/apache/spark/commit/89d0664684875deeb8da29c4d45a7076a5e07bb4).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22044#discussion_r208849675
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -3410,6 +3410,28 @@ case class ArrayDistinct(child: Expression)
         case _ => false
       }
     
    +  @transient protected lazy val canUseSpecializedHashSet = elementType match {
    +    case ByteType | ShortType | IntegerType | LongType | FloatType | DoubleType => true
    +    case _ => false
    +  }
    +
    +  @transient protected lazy val (hsPostFix, hsTypeName) = {
    +    val ptName = CodeGenerator.primitiveTypeName (elementType)
    --- End diff --
    
    nit: extra space after `primitiveTypeName`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1966/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22044
  
    **[Test build #94461 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94461/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org