You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by maropu <gi...@git.apache.org> on 2017/02/22 14:09:54 UTC

[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

GitHub user maropu opened a pull request:

    https://github.com/apache/spark/pull/17028

    [SPARK-19691][SQL] Fix ClassCastException when calculating percentile of decimal column

    ## What changes were proposed in this pull request?
    This pr fixed a class-cast exception below;
    ```
    scala> spark.range(10).selectExpr("cast (id as decimal) as x").selectExpr("percentile(x, 0.5)").collect()
     java.lang.ClassCastException: org.apache.spark.sql.types.Decimal cannot be cast to java.lang.Number
    	at org.apache.spark.sql.catalyst.expressions.aggregate.Percentile.update(Percentile.scala:141)
    	at org.apache.spark.sql.catalyst.expressions.aggregate.Percentile.update(Percentile.scala:58)
    	at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.update(interfaces.scala:514)
    	at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$1$$anonfun$applyOrElse$1.apply(AggregationIterator.scala:171)
    	at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$1$$anonfun$applyOrElse$1.apply(AggregationIterator.scala:171)
    	at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$generateProcessRow$1.apply(AggregationIterator.scala:187)
    	at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$generateProcessRow$1.apply(AggregationIterator.scala:181)
    	at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.processInputs(ObjectAggregationIterator.scala:151)
    	at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.<init>(ObjectAggregationIterator.scala:78)
    	at org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec$$anonfun$doExecute$1$$anonfun$2.apply(ObjectHashAggregateExec.scala:109)
    	at 
    ```
    This fix simply converts catalyst values (i.e., `Decimal`) into scala ones by using `CatalystTypeConverters`.
    
    ## How was this patch tested?
    Added a test in `DataFrameSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark SPARK-19691

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17028.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17028
    
----
commit b216fa104a17b076f246ef236a6deab94ed16246
Author: Takeshi Yamamuro <ya...@apache.org>
Date:   2017-02-22T14:03:20Z

    Fix ClassCastException

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102492855
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala ---
    @@ -138,7 +138,8 @@ case class Percentile(
       override def update(
           buffer: OpenHashMap[Number, Long],
           input: InternalRow): OpenHashMap[Number, Long] = {
    -    val key = child.eval(input).asInstanceOf[Number]
    +    val scalaValue = CatalystTypeConverters.convertToScala(child.eval(input), child.dataType)
    --- End diff --
    
    okay, I'll fix that way. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    **[Test build #73342 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73342/testReport)** for PR 17028 at commit [`88f4f47`](https://github.com/apache/spark/commit/88f4f47bae435f8cebe6d3e0ad31b3a77516014b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102679493
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/PercentileSuite.scala ---
    @@ -39,44 +38,44 @@ class PercentileSuite extends SparkFunSuite {
         val agg = new Percentile(BoundReference(0, IntegerType, true), Literal(0.5))
     
         // Check empty serialize and deserialize
    -    val buffer = new OpenHashMap[Number, Long]()
    +    val buffer = new OpenHashMap[AnyRef, Long]()
         assert(compareEquals(agg.deserialize(agg.serialize(buffer)), buffer))
     
         // Check non-empty buffer serializa and deserialize.
         data.foreach { key =>
    -      buffer.changeValue(key, 1L, _ + 1L)
    +      buffer.changeValue(new Integer(key), 1L, _ + 1L)
         }
         assert(compareEquals(agg.deserialize(agg.serialize(buffer)), buffer))
       }
     
       test("class Percentile, high level interface, update, merge, eval...") {
         val count = 10000
         val percentages = Seq(0, 0.25, 0.5, 0.75, 1)
    -    val expectedPercentiles = Seq(1, 2500.75, 5000.5, 7500.25, 10000)
    +    val expectedPercentiles = Seq[Double](1, 2500.75, 5000.5, 7500.25, 10000)
    --- End diff --
    
    Since my Intellij makes an alert on this part, I added this. But, you're right and this is not necessary. I reverted this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102490587
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala ---
    @@ -138,7 +138,8 @@ case class Percentile(
       override def update(
           buffer: OpenHashMap[Number, Long],
           input: InternalRow): OpenHashMap[Number, Long] = {
    -    val key = child.eval(input).asInstanceOf[Number]
    +    val scalaValue = CatalystTypeConverters.convertToScala(child.eval(input), child.dataType)
    --- End diff --
    
    I think it is better to open up the signature of the `OpenHashMap` and use `Ordered` or `AnyRef` as its key type.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Done. I wait for tests finished.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Just a sec, I'll apply the @hvanhovell suggestion...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17028


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102673905
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/PercentileSuite.scala ---
    @@ -39,44 +38,44 @@ class PercentileSuite extends SparkFunSuite {
         val agg = new Percentile(BoundReference(0, IntegerType, true), Literal(0.5))
     
         // Check empty serialize and deserialize
    -    val buffer = new OpenHashMap[Number, Long]()
    +    val buffer = new OpenHashMap[AnyRef, Long]()
         assert(compareEquals(agg.deserialize(agg.serialize(buffer)), buffer))
     
         // Check non-empty buffer serializa and deserialize.
         data.foreach { key =>
    -      buffer.changeValue(key, 1L, _ + 1L)
    +      buffer.changeValue(new Integer(key), 1L, _ + 1L)
    --- End diff --
    
    To we need to explicitly type this? I thoughtscala boxed automatically.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    @maropu can you open a backport if you feel we should also put this in 2.1?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    **[Test build #73342 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73342/testReport)** for PR 17028 at commit [`88f4f47`](https://github.com/apache/spark/commit/88f4f47bae435f8cebe6d3e0ad31b3a77516014b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102673907
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala ---
    @@ -274,7 +283,8 @@ case class Percentile(
             val row = new UnsafeRow(2)
             row.pointTo(bs, sizeOfNextRow)
             // Insert the pairs into counts map.
    -        val key = row.get(0, child.dataType).asInstanceOf[Number]
    +        val catalystValue = row.get(0, child.dataType)
    --- End diff --
    
    oh..., I'll fix


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102673129
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala ---
    @@ -130,20 +130,30 @@ case class Percentile(
         }
       }
     
    -  override def createAggregationBuffer(): OpenHashMap[Number, Long] = {
    +  private def toLongValue(d: Any): Long = d match {
    +    case d: Decimal => d.toLong
    +    case n: Number => n.longValue
    +  }
    +
    +  private def toDoubleValue(d: Any): Double = d match {
    +    case d: Decimal => d.toDouble
    +    case n: Number => n.doubleValue
    +  }
    +
    +  override def createAggregationBuffer(): OpenHashMap[AnyRef, Long] = {
         // Initialize new counts map instance here.
    -    new OpenHashMap[Number, Long]()
    +    new OpenHashMap[AnyRef, Long]()
       }
     
       override def update(
    -      buffer: OpenHashMap[Number, Long],
    -      input: InternalRow): OpenHashMap[Number, Long] = {
    -    val key = child.eval(input).asInstanceOf[Number]
    +      buffer: OpenHashMap[AnyRef, Long],
    +      input: InternalRow): OpenHashMap[AnyRef, Long] = {
    +    val key = child.eval(input).asInstanceOf[AnyRef]
         val frqValue = frequencyExpression.eval(input)
     
         // Null values are ignored in counts map.
         if (key != null && frqValue != null) {
    -      val frqLong = frqValue.asInstanceOf[Number].longValue()
    +      val frqLong = toLongValue(frqValue)
    --- End diff --
    
    `frqValue` is guaranteed to return a integral value. So this is not needed. We could also force it to be a Long, that would make this even simpler.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102517336
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala ---
    @@ -138,7 +138,8 @@ case class Percentile(
       override def update(
           buffer: OpenHashMap[Number, Long],
           input: InternalRow): OpenHashMap[Number, Long] = {
    -    val key = child.eval(input).asInstanceOf[Number]
    +    val scalaValue = CatalystTypeConverters.convertToScala(child.eval(input), child.dataType)
    --- End diff --
    
    I'm not 100% sure though, it seems the cost of converting `Decimal` to `BigDecimal` every-time is some higher than that of using catalyst values as it is.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    @hvanhovell oka, I'll open soon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    **[Test build #73320 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73320/testReport)** for PR 17028 at commit [`ef26f26`](https://github.com/apache/spark/commit/ef26f262cc747505cb0d2a55d6ee0c531263ac0a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    LGTM - merging to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102676907
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/PercentileSuite.scala ---
    @@ -39,44 +38,44 @@ class PercentileSuite extends SparkFunSuite {
         val agg = new Percentile(BoundReference(0, IntegerType, true), Literal(0.5))
     
         // Check empty serialize and deserialize
    -    val buffer = new OpenHashMap[Number, Long]()
    +    val buffer = new OpenHashMap[AnyRef, Long]()
         assert(compareEquals(agg.deserialize(agg.serialize(buffer)), buffer))
     
         // Check non-empty buffer serializa and deserialize.
         data.foreach { key =>
    -      buffer.changeValue(key, 1L, _ + 1L)
    +      buffer.changeValue(new Integer(key), 1L, _ + 1L)
    --- End diff --
    
    If this boxing does not exist, it throws an exception below;
    ```
    [error] /Users/maropu/IdeaProjects/spark/spark-master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Perc
    entileSuite.scala:46: the result type of an implicit conversion must be more specific than AnyRef
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73290/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73280/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    **[Test build #73280 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73280/testReport)** for PR 17028 at commit [`b216fa1`](https://github.com/apache/spark/commit/b216fa104a17b076f246ef236a6deab94ed16246).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73342/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102673567
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala ---
    @@ -274,7 +283,8 @@ case class Percentile(
             val row = new UnsafeRow(2)
             row.pointTo(bs, sizeOfNextRow)
             // Insert the pairs into counts map.
    -        val key = row.get(0, child.dataType).asInstanceOf[Number]
    +        val catalystValue = row.get(0, child.dataType)
    --- End diff --
    
    NIT: Just change the cast in the old code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73320/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102674655
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala ---
    @@ -130,20 +130,30 @@ case class Percentile(
         }
       }
     
    -  override def createAggregationBuffer(): OpenHashMap[Number, Long] = {
    +  private def toLongValue(d: Any): Long = d match {
    +    case d: Decimal => d.toLong
    +    case n: Number => n.longValue
    +  }
    +
    +  private def toDoubleValue(d: Any): Double = d match {
    +    case d: Decimal => d.toDouble
    +    case n: Number => n.doubleValue
    +  }
    +
    +  override def createAggregationBuffer(): OpenHashMap[AnyRef, Long] = {
         // Initialize new counts map instance here.
    -    new OpenHashMap[Number, Long]()
    +    new OpenHashMap[AnyRef, Long]()
       }
     
       override def update(
    -      buffer: OpenHashMap[Number, Long],
    -      input: InternalRow): OpenHashMap[Number, Long] = {
    -    val key = child.eval(input).asInstanceOf[Number]
    +      buffer: OpenHashMap[AnyRef, Long],
    +      input: InternalRow): OpenHashMap[AnyRef, Long] = {
    +    val key = child.eval(input).asInstanceOf[AnyRef]
         val frqValue = frequencyExpression.eval(input)
     
         // Null values are ignored in counts map.
         if (key != null && frqValue != null) {
    -      val frqLong = frqValue.asInstanceOf[Number].longValue()
    +      val frqLong = toLongValue(frqValue)
    --- End diff --
    
    I'll revert this part.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102508147
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala ---
    @@ -138,7 +138,8 @@ case class Percentile(
       override def update(
           buffer: OpenHashMap[Number, Long],
           input: InternalRow): OpenHashMap[Number, Long] = {
    -    val key = child.eval(input).asInstanceOf[Number]
    +    val scalaValue = CatalystTypeConverters.convertToScala(child.eval(input), child.dataType)
    --- End diff --
    
    Should we create a converter and re-use it with `createToScalaConverter(...)` rather than type-dispatching every time?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17028: [SPARK-19691][SQL] Fix ClassCastException when ca...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17028#discussion_r102673952
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/PercentileSuite.scala ---
    @@ -39,44 +38,44 @@ class PercentileSuite extends SparkFunSuite {
         val agg = new Percentile(BoundReference(0, IntegerType, true), Literal(0.5))
     
         // Check empty serialize and deserialize
    -    val buffer = new OpenHashMap[Number, Long]()
    +    val buffer = new OpenHashMap[AnyRef, Long]()
         assert(compareEquals(agg.deserialize(agg.serialize(buffer)), buffer))
     
         // Check non-empty buffer serializa and deserialize.
         data.foreach { key =>
    -      buffer.changeValue(key, 1L, _ + 1L)
    +      buffer.changeValue(new Integer(key), 1L, _ + 1L)
         }
         assert(compareEquals(agg.deserialize(agg.serialize(buffer)), buffer))
       }
     
       test("class Percentile, high level interface, update, merge, eval...") {
         val count = 10000
         val percentages = Seq(0, 0.25, 0.5, 0.75, 1)
    -    val expectedPercentiles = Seq(1, 2500.75, 5000.5, 7500.25, 10000)
    +    val expectedPercentiles = Seq[Double](1, 2500.75, 5000.5, 7500.25, 10000)
    --- End diff --
    
    Do we need to type this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    **[Test build #73280 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73280/testReport)** for PR 17028 at commit [`b216fa1`](https://github.com/apache/spark/commit/b216fa104a17b076f246ef236a6deab94ed16246).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    @maropu This looks pretty good. I left a few minor comments/questions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    **[Test build #73290 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73290/testReport)** for PR 17028 at commit [`325c95d`](https://github.com/apache/spark/commit/325c95d58b0d4c801a218f497b482f619f4c1114).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    **[Test build #73320 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73320/testReport)** for PR 17028 at commit [`ef26f26`](https://github.com/apache/spark/commit/ef26f262cc747505cb0d2a55d6ee0c531263ac0a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    @HyukjinKwon @hvanhovell How about the latest fix?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Thanks for your review! I'll fix now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    **[Test build #73290 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73290/testReport)** for PR 17028 at commit [`325c95d`](https://github.com/apache/spark/commit/325c95d58b0d4c801a218f497b482f619f4c1114).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17028: [SPARK-19691][SQL] Fix ClassCastException when calculati...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17028
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org