Posted to reviews@spark.apache.org by tejasapatil <gi...@git.apache.org> on 2017/02/24 07:38:48 UTC

[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

GitHub user tejasapatil opened a pull request:

    https://github.com/apache/spark/pull/17056

    [SPARK-17495] [SQL] Support Decimal type in Hive-hash

    ## What changes were proposed in this pull request?
    
    Adds support for the Decimal datatype to Hive hash. [Hive internally normalises decimals](https://github.com/apache/hive/blob/4ba713ccd85c3706d195aeef9476e6e6363f1c21/storage-api/src/java/org/apache/hadoop/hive/common/type/HiveDecimalV1.java#L307) and I have ported that logic as-is to HiveHash.
    
    Generated code (in case any reviewer wants to examine):
    
    ```
    protected void processNext() throws java.io.IOException {
      while (inputadapter_input.hasNext() && !stopEarly()) {
        InternalRow inputadapter_row = (InternalRow) inputadapter_input.next();
        project_value = 0;

        boolean inputadapter_isNull = inputadapter_row.isNullAt(0);
        Decimal inputadapter_value = inputadapter_isNull ? null : (inputadapter_row.getDecimal(0, 38, 0));
        if (!inputadapter_isNull) {
          project_childHash = org.apache.spark.sql.catalyst.expressions.HiveHashFunction.normalizeDecimal(
            inputadapter_value.toJavaBigDecimal(), true).hashCode();
        }
        project_value = (31 * project_value) + project_childHash;
        project_childHash = 0;
        project_rowWriter.write(0, project_value);
        append(project_result);
        if (shouldStop()) return;
      }
    }
    ```
    
    ## How was this patch tested?
    
    Added unit tests
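
For readers who want to follow the normalisation step without opening the Hive source, here is a minimal Java sketch of the idea: trim trailing zeros (special-casing zero), then enforce the precision and scale limits, rounding if allowed. It is only an illustration under stated assumptions, not the code in this PR. The class name `HiveDecimalHashSketch`, the helper names `trim`, `normalize` and `hash`, the `HALF_UP` rounding mode, and the choice to hash an unrepresentable value to 0 are all assumptions; the limits of 38 come from the constants quoted in the review diffs.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class HiveDecimalHashSketch {
    // Limits as quoted in the review diffs (HiveDecimalMaxPrecision / HiveDecimalMaxScale).
    static final int MAX_PRECISION = 38;
    static final int MAX_SCALE = 38;

    // Strips trailing zeros. Zero is special-cased, and negative scales are avoided.
    static BigDecimal trim(BigDecimal d) {
        if (d.compareTo(BigDecimal.ZERO) == 0) {
            return BigDecimal.ZERO;
        }
        BigDecimal result = d.stripTrailingZeros();
        // stripTrailingZeros can yield a negative scale (e.g. 1E+2); bring it back to 0.
        return result.scale() < 0 ? result.setScale(0) : result;
    }

    // Returns the normalised value, or null when it cannot be represented within the limits.
    static BigDecimal normalize(BigDecimal input, boolean allowRounding) {
        if (input == null) return null;
        BigDecimal result = trim(input);
        int intDigits = result.precision() - result.scale();
        if (intDigits > MAX_PRECISION) return null;
        int maxScale = Math.min(MAX_SCALE, Math.min(MAX_PRECISION - intDigits, result.scale()));
        if (result.scale() > maxScale) {
            if (!allowRounding) return null;
            // Rounding (assumed HALF_UP) may introduce new trailing zeros, so trim again.
            result = trim(result.setScale(maxScale, RoundingMode.HALF_UP));
        }
        return result;
    }

    // Assumption of this sketch: an unrepresentable (null) decimal hashes to 0.
    static int hash(BigDecimal input) {
        BigDecimal normalized = normalize(input, true);
        return normalized == null ? 0 : normalized.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(hash(new BigDecimal("18")));
        System.out.println(hash(new BigDecimal("-18.000000000000")));
    }
}
```

Under these assumptions the two calls in `main` print 558 and -558, which are the values the PR's test expects for `18` at scale 0 and `-18` at scale 12: trailing zeros are trimmed before hashing, so the declared scale does not change the result.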

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tejasapatil/spark SPARK-17495_decimal

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17056.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17056
    
----
commit a378b3ef08cead4c915096f11de5bd371a405fef
Author: Tejas Patil <te...@fb.com>
Date:   2017-02-24T07:35:16Z

    [SPARK-17495] [SQL] Support Decimal type

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73460/testReport)** for PR 17056 at commit [`8595305`](https://github.com/apache/spark/commit/8595305c2dc3b276d6390724ca1f1469794540f5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    LGTM. cc @cloud-fan for final sign-off


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103616585
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -732,6 +743,51 @@ object HiveHashFunction extends InterpretedHashFunction {
         HiveHasher.hashUnsafeBytes(base, offset, len)
       }
     
    +  private val HiveDecimalMaxPrecision = 38
    +  private val HiveDecimalMaxScale = 38
    +
    +  // Mimics normalization done for decimals in Hive at HiveDecimalV1.normalize()
    +  def normalizeDecimal(input: BigDecimal, allowRounding: Boolean): BigDecimal = {
    +    if (input == null) {
    +      return null
    +    }
    +
    +    def trimDecimal(input: BigDecimal) = {
    +      var result = input
    +      if (result.compareTo(BigDecimal.ZERO) == 0) {
    +        // Special case for 0, because java doesn't strip zeros correctly on that number.
    +        result = BigDecimal.ZERO
    +      }
    +      else {
    --- End diff --
    
    `} else {`


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73610 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73610/testReport)** for PR 17056 at commit [`428a9a4`](https://github.com/apache/spark/commit/428a9a476a37d3969cf8038cb3ce10b08066fcde).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73610 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73610/testReport)** for PR 17056 at commit [`428a9a4`](https://github.com/apache/spark/commit/428a9a476a37d3969cf8038cb3ce10b08066fcde).


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #74018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74018/testReport)** for PR 17056 at commit [`7c0b6c8`](https://github.com/apache/spark/commit/7c0b6c849bb2b3869a9c91560d130bb884e1532b).


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    retest this please


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73410/
    Test FAILed.


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r104221836
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -732,6 +743,48 @@ object HiveHashFunction extends InterpretedHashFunction {
         HiveHasher.hashUnsafeBytes(base, offset, len)
       }
     
    +  private val HiveDecimalMaxPrecision = 38
    +  private val HiveDecimalMaxScale = 38
    +
    +  // Mimics normalization done for decimals in Hive at HiveDecimalV1.normalize()
    +  def normalizeDecimal(input: BigDecimal, allowRounding: Boolean): BigDecimal = {
    --- End diff --
    
    Will `allowRounding` ever be false?


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r104327401
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -732,6 +743,48 @@ object HiveHashFunction extends InterpretedHashFunction {
         HiveHasher.hashUnsafeBytes(base, offset, len)
       }
     
    +  private val HiveDecimalMaxPrecision = 38
    --- End diff --
    
    renamed


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73945/
    Test FAILed.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73945 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73945/testReport)** for PR 17056 at commit [`65a09e9`](https://github.com/apache/spark/commit/65a09e940484b64212262fe17888ede6c5d8cc14).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103503189
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala ---
    @@ -371,6 +370,48 @@ class HashExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
             new StructType().add("array", arrayOfString).add("map", mapOfString))
           .add("structOfUDT", structOfUDT))
     
    +  test("hive-hash for decimal") {
    +    def checkHiveHashForDecimal(
    +        input: String,
    +        precision: Int,
    +        scale: Int,
    +        expected: Long): Unit = {
    +      val decimal = Decimal.apply(new java.math.BigDecimal(input))
    +      decimal.changePrecision(precision, scale)
    +      val decimalType = DataTypes.createDecimalType(precision, scale)
    +      checkHiveHash(decimal, decimalType, expected)
    +    }
    +
    +    checkHiveHashForDecimal("18", 38, 0, 558)
    +    checkHiveHashForDecimal("-18", 38, 0, -558)
    +    checkHiveHashForDecimal("-18", 38, 12, -558)
    +    checkHiveHashForDecimal("18446744073709001000", 38, 19, -17070057)
    --- End diff --
    
    I figured out that the problem was the test case not checking the result of `decimal.changePrecision`. Fixed.
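
To make the failure mode concrete: `18446744073709001000` has 20 integer digits, so at scale 19 it needs 39 digits in total, one more than `DECIMAL(38, 19)` allows. A precision change to that type therefore cannot succeed, and a test that ignores the returned status would go on to hash a value that does not match the declared type. The Java sketch below reproduces only the arithmetic; `DecimalFitSketch` and `fits` are hypothetical names, `HALF_UP` rounding is an assumption, and this is not Spark's `Decimal` API.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DecimalFitSketch {
    // Hypothetical helper: true when `value`, rescaled to `scale`,
    // needs at most `precision` digits in total.
    static boolean fits(BigDecimal value, int precision, int scale) {
        BigDecimal rescaled = value.setScale(scale, RoundingMode.HALF_UP);
        return rescaled.precision() <= precision;
    }

    public static void main(String[] args) {
        // 20 integer digits + 19 fractional digits = 39 digits, so this does not fit.
        System.out.println(fits(new BigDecimal("18446744073709001000"), 38, 19));
        // 2 integer digits + 12 fractional digits = 14 digits, so this fits.
        System.out.println(fits(new BigDecimal("-18"), 38, 12));
    }
}
```

A test helper built along these lines would check the boolean before computing the expected hash, rather than assuming the rescale succeeded.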


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103384029
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala ---
    @@ -371,6 +370,48 @@ class HashExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
             new StructType().add("array", arrayOfString).add("map", mapOfString))
           .add("structOfUDT", structOfUDT))
     
    +  test("hive-hash for decimal") {
    +    def checkHiveHashForDecimal(
    +        input: String,
    +        precision: Int,
    +        scale: Int,
    +        expected: Long): Unit = {
    +      val decimal = Decimal.apply(new java.math.BigDecimal(input))
    +      decimal.changePrecision(precision, scale)
    +      val decimalType = DataTypes.createDecimalType(precision, scale)
    +      checkHiveHash(decimal, decimalType, expected)
    +    }
    +
    +    checkHiveHashForDecimal("18", 38, 0, 558)
    --- End diff --
    
    These expected values were generated with Hive 1.2.1.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73460/
    Test PASSed.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    @gatorsmile : I really appreciate your help in reviewing this PR, to the extent of manually checking the hashes against Hive. If you haven't already started on that, here is the set of Hive queries corresponding to the test cases in the PR, which you can copy and paste:
    
    ```
    SELECT HASH(CAST(18BD AS DECIMAL(38, 0)));
    SELECT HASH(CAST(-18BD AS DECIMAL(38, 0)));
    SELECT HASH(CAST(-18BD AS DECIMAL(38, 12)));
    SELECT HASH(CAST(18446744073709001000BD AS DECIMAL(38, 19)));
    SELECT HASH(CAST(-18446744073709001000BD AS DECIMAL(38, 22)));
    SELECT HASH(CAST(-18446744073709001000BD AS DECIMAL(38, 3)));
    SELECT HASH(CAST(18446744073709001000BD AS DECIMAL(38, 4)));
    SELECT HASH(CAST(9223372036854775807BD AS DECIMAL(38, 4)));
    SELECT HASH(CAST(-9223372036854775807BD AS DECIMAL(38, 5)));
    SELECT HASH(CAST(00000.00000000000BD AS DECIMAL(38, 34)));
    SELECT HASH(CAST(-00000.00000000000BD AS DECIMAL(38, 11)));
    SELECT HASH(CAST(123456.1234567890BD AS DECIMAL(38, 2)));
    SELECT HASH(CAST(123456.1234567890BD AS DECIMAL(38, 20)));
    SELECT HASH(CAST(123456.1234567890BD AS DECIMAL(38, 10)));
    SELECT HASH(CAST(-123456.1234567890BD AS DECIMAL(38, 10)));
    SELECT HASH(CAST(123456.1234567890BD AS DECIMAL(38, 0)));
    SELECT HASH(CAST(-123456.1234567890BD AS DECIMAL(38, 0)));
    SELECT HASH(CAST(123456.1234567890BD AS DECIMAL(38, 20)));
    SELECT HASH(CAST(-123456.1234567890BD AS DECIMAL(38, 20)));
    SELECT HASH(CAST(123456.123456789012345678901234567890BD AS DECIMAL(38, 0)));
    SELECT HASH(CAST(-123456.123456789012345678901234567890BD AS DECIMAL(38, 0)));
    SELECT HASH(CAST(123456.123456789012345678901234567890BD AS DECIMAL(38, 10)));
    SELECT HASH(CAST(-123456.123456789012345678901234567890BD AS DECIMAL(38, 10)));
    SELECT HASH(CAST(123456.123456789012345678901234567890BD AS DECIMAL(38, 20)));
    SELECT HASH(CAST(-123456.123456789012345678901234567890BD AS DECIMAL(38, 20)));
    SELECT HASH(CAST(123456.123456789012345678901234567890BD AS DECIMAL(38, 30)));
    SELECT HASH(CAST(-123456.123456789012345678901234567890BD AS DECIMAL(38, 30)));
    SELECT HASH(CAST(123456.123456789012345678901234567890BD AS DECIMAL(38, 31)));
    ```


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103618856
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -732,6 +743,51 @@ object HiveHashFunction extends InterpretedHashFunction {
         HiveHasher.hashUnsafeBytes(base, offset, len)
       }
     
    +  private val HiveDecimalMaxPrecision = 38
    +  private val HiveDecimalMaxScale = 38
    +
    +  // Mimics normalization done for decimals in Hive at HiveDecimalV1.normalize()
    +  def normalizeDecimal(input: BigDecimal, allowRounding: Boolean): BigDecimal = {
    +    if (input == null) {
    +      return null
    +    }
    --- End diff --
    
    changed


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    @cloud-fan ping !!


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    cc @cloud-fan @gatorsmile  


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103097851
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala ---
    @@ -371,6 +370,48 @@ class HashExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
             new StructType().add("array", arrayOfString).add("map", mapOfString))
           .add("structOfUDT", structOfUDT))
     
    +  test("hive-hash for decimal") {
    +    def checkHiveHashForDecimal(
    +        input: String,
    +        precision: Int,
    +        scale: Int,
    +        expected: Long): Unit = {
    +      val decimal = Decimal.apply(new java.math.BigDecimal(input))
    +      decimal.changePrecision(precision, scale)
    +      val decimalType = DataTypes.createDecimalType(precision, scale)
    +      checkHiveHash(decimal, decimalType, expected)
    +    }
    +
    +    checkHiveHashForDecimal("18", 38, 0, 558)
    +    checkHiveHashForDecimal("-18", 38, 0, -558)
    +    checkHiveHashForDecimal("-18", 38, 12, -558)
    +    checkHiveHashForDecimal("18446744073709001000", 38, 19, -17070057)
    --- End diff --
    
    ```
    hive> select HASH(CAST("-18446744073709001000" AS DECIMAL(38,19)));
    OK
    0
    Time taken: 0.035 seconds, Fetched: 1 row(s)
    ```


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103555753
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -635,6 +636,16 @@ case class HiveHash(children: Seq[Expression]) extends HashExpression[Int] {
       override protected def genHashBytes(b: String, result: String): String =
         s"$result = $hasherClassName.hashUnsafeBytes($b, Platform.BYTE_ARRAY_OFFSET, $b.length);"
     
    +  override protected def genHashDecimal(
    +      ctx: CodegenContext,
    +      d: DecimalType,
    +      input: String,
    +      result: String): String = {
    +    s"""
    +      $result = org.apache.spark.sql.catalyst.expressions.HiveHashFunction.normalizeDecimal(
    --- End diff --
    
    `HiveHashFunction` is an object, so `classOf[]` cannot be used. I tried `HiveHashFunction.getClass.getName` instead, as is done in other places in the codebase.


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103616628
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -732,6 +743,51 @@ object HiveHashFunction extends InterpretedHashFunction {
         HiveHasher.hashUnsafeBytes(base, offset, len)
       }
     
    +  private val HiveDecimalMaxPrecision = 38
    +  private val HiveDecimalMaxScale = 38
    +
    +  // Mimics normalization done for decimals in Hive at HiveDecimalV1.normalize()
    +  def normalizeDecimal(input: BigDecimal, allowRounding: Boolean): BigDecimal = {
    +    if (input == null) {
    +      return null
    +    }
    --- End diff --
    
    Nit: `if (input == null) return null`


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73460/testReport)** for PR 17056 at commit [`8595305`](https://github.com/apache/spark/commit/8595305c2dc3b276d6390724ca1f1469794540f5).


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73410 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73410/testReport)** for PR 17056 at commit [`a378b3e`](https://github.com/apache/spark/commit/a378b3ef08cead4c915096f11de5bd371a405fef).


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73595 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73595/testReport)** for PR 17056 at commit [`2387515`](https://github.com/apache/spark/commit/2387515f8c3e2a7aeb5022e50891d1c534444a31).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    @cloud-fan @gatorsmile: can you please review this PR?


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73945 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73945/testReport)** for PR 17056 at commit [`65a09e9`](https://github.com/apache/spark/commit/65a09e940484b64212262fe17888ede6c5d8cc14).


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73595/
    Test PASSed.


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103538320
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala ---
    @@ -371,6 +370,51 @@ class HashExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
             new StructType().add("array", arrayOfString).add("map", mapOfString))
           .add("structOfUDT", structOfUDT))
     
    +  test("hive-hash for decimal") {
    +    def checkHiveHashForDecimal(
    +        input: String,
    +        precision: Int,
    +        scale: Int,
    +        expected: Long): Unit = {
    +      val decimalType = DataTypes.createDecimalType(precision, scale)
    +      val decimal = {
    +        val value = Decimal.apply(new java.math.BigDecimal(input))
    +        if (value.changePrecision(precision, scale)) value else null
    +      }
    +
    +      checkHiveHash(decimal, decimalType, expected)
    +    }
    +
    +    checkHiveHashForDecimal("18", 38, 0, 558)
    --- End diff --
    
    Hmm, it's hard to guarantee that we produce the same hash value as Hive. Can we run Hive in the test and compare its result with Spark's?


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73670 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73670/testReport)** for PR 17056 at commit [`c0c8390`](https://github.com/apache/spark/commit/c0c8390e0bcb706c474db66b6326ee403cc6c58c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103537711
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -635,6 +636,16 @@ case class HiveHash(children: Seq[Expression]) extends HashExpression[Int] {
       override protected def genHashBytes(b: String, result: String): String =
         s"$result = $hasherClassName.hashUnsafeBytes($b, Platform.BYTE_ARRAY_OFFSET, $b.length);"
     
    +  override protected def genHashDecimal(
    +      ctx: CodegenContext,
    +      d: DecimalType,
    +      input: String,
    +      result: String): String = {
    +    s"""
    +      $result = org.apache.spark.sql.catalyst.expressions.HiveHashFunction.normalizeDecimal(
    --- End diff --
    
    `${classOf[HiveHashFunction].getName}.normalizeDecimal`


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Jenkins test this please


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Let me manually check whether the results are consistent with Hive.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73596 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73596/testReport)** for PR 17056 at commit [`2387515`](https://github.com/apache/spark/commit/2387515f8c3e2a7aeb5022e50891d1c534444a31).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74018/
    Test PASSed.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Thank you! I checked Hive 2.1. It produces exactly the same hash values.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73610/
    Test PASSed.


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73670/
    Test PASSed.


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r104222249
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -732,6 +743,48 @@ object HiveHashFunction extends InterpretedHashFunction {
         HiveHasher.hashUnsafeBytes(base, offset, len)
       }
     
    +  private val HiveDecimalMaxPrecision = 38
    --- End diff --
    
    nit: `HIVE_DECIMAL_MAX_PRECISION`


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73670 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73670/testReport)** for PR 17056 at commit [`c0c8390`](https://github.com/apache/spark/commit/c0c8390e0bcb706c474db66b6326ee403cc6c58c).


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    ok to test


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17056


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103096516
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala ---
    @@ -371,6 +370,48 @@ class HashExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
             new StructType().add("array", arrayOfString).add("map", mapOfString))
           .add("structOfUDT", structOfUDT))
     
    +  test("hive-hash for decimal") {
    +    def checkHiveHashForDecimal(
    +        input: String,
    +        precision: Int,
    +        scale: Int,
    +        expected: Long): Unit = {
    +      val decimal = Decimal.apply(new java.math.BigDecimal(input))
    +      decimal.changePrecision(precision, scale)
    +      val decimalType = DataTypes.createDecimalType(precision, scale)
    +      checkHiveHash(decimal, decimalType, expected)
    +    }
    +
    +    checkHiveHashForDecimal("18", 38, 0, 558)
    --- End diff --
    
    A quick question: were these expected values obtained from Hive?


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73596/
    Test PASSed.


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103551662
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala ---
    @@ -371,6 +370,51 @@ class HashExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
             new StructType().add("array", arrayOfString).add("map", mapOfString))
           .add("structOfUDT", structOfUDT))
     
    +  test("hive-hash for decimal") {
    +    def checkHiveHashForDecimal(
    +        input: String,
    +        precision: Int,
    +        scale: Int,
    +        expected: Long): Unit = {
    +      val decimalType = DataTypes.createDecimalType(precision, scale)
    +      val decimal = {
    +        val value = Decimal.apply(new java.math.BigDecimal(input))
    +        if (value.changePrecision(precision, scale)) value else null
    +      }
    +
    +      checkHiveHash(decimal, decimalType, expected)
    +    }
    +
    +    checkHiveHashForDecimal("18", 38, 0, 558)
    --- End diff --
    
    The expected values were generated using Hive 1.2.1. My original approach was to depend on Hive for generating the expected values, but [as per the discussion in a related PR](https://github.com/apache/spark/pull/15047#issuecomment-247231919), it was suggested that I hardcode them.


[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103618852
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -732,6 +743,51 @@ object HiveHashFunction extends InterpretedHashFunction {
         HiveHasher.hashUnsafeBytes(base, offset, len)
       }
     
    +  private val HiveDecimalMaxPrecision = 38
    +  private val HiveDecimalMaxScale = 38
    +
    +  // Mimics normalization done for decimals in Hive at HiveDecimalV1.normalize()
    +  def normalizeDecimal(input: BigDecimal, allowRounding: Boolean): BigDecimal = {
    +    if (input == null) {
    +      return null
    +    }
    +
    +    def trimDecimal(input: BigDecimal) = {
    +      var result = input
    +      if (result.compareTo(BigDecimal.ZERO) == 0) {
    +        // Special case for 0, because java doesn't strip zeros correctly on that number.
    +        result = BigDecimal.ZERO
    +      }
    +      else {
    --- End diff --
    
    changed
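
For readers following the `trimDecimal` hunk above, here is a minimal Java sketch of the trimming step and of how a hash could then be taken from the result. The diff is truncated at `else {`, so the `stripTrailingZeros` call and the negative-scale reset are assumptions about what follows, not a copy of the patch. `TrimSketch`, `trim` and `hash` are hypothetical names, not Spark or Hive APIs. The hash is simply `BigDecimal.hashCode()`, which is what the generated code in the PR description calls on the normalized value.

```java
import java.math.BigDecimal;

public class TrimSketch {
    // Assumed shape of the trimming step; only the zero special case is visible in the diff.
    static BigDecimal trim(BigDecimal input) {
        if (input.compareTo(BigDecimal.ZERO) == 0) {
            // Special case from the diff: any representation of zero becomes BigDecimal.ZERO.
            return BigDecimal.ZERO;
        }
        BigDecimal result = input.stripTrailingZeros();
        // stripTrailingZeros turns 1800 into 18E+2 (scale -2); bring such values back to scale 0.
        return result.scale() < 0 ? result.setScale(0) : result;
    }

    // Parse the literal, widen it to the requested scale, trim it, then hash it.
    static int hash(String literal, int scale) {
        return trim(new BigDecimal(literal).setScale(scale)).hashCode();
    }

    public static void main(String[] args) {
        System.out.println(hash("18", 0));
        System.out.println(hash("-18", 0));
        System.out.println(hash("-18", 12));
    }
}
```

Because trailing zeros are stripped before hashing, `-18` at scale 0 and at scale 12 hash to the same value, which is consistent with the test expecting the same result for both.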


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73596 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73596/testReport)** for PR 17056 at commit [`2387515`](https://github.com/apache/spark/commit/2387515f8c3e2a7aeb5022e50891d1c534444a31).


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    thanks, merging to master!


[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103537394
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -635,6 +636,16 @@ case class HiveHash(children: Seq[Expression]) extends HashExpression[Int] {
       override protected def genHashBytes(b: String, result: String): String =
         s"$result = $hasherClassName.hashUnsafeBytes($b, Platform.BYTE_ARRAY_OFFSET, $b.length);"
     
    +  override protected def genHashDecimal(
    +      ctx: CodegenContext,
    --- End diff --
    
    where do we use `ctx` and `d`?




[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103097875
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala ---
    @@ -371,6 +370,48 @@ class HashExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
             new StructType().add("array", arrayOfString).add("map", mapOfString))
           .add("structOfUDT", structOfUDT))
     
    +  test("hive-hash for decimal") {
    +    def checkHiveHashForDecimal(
    +        input: String,
    +        precision: Int,
    +        scale: Int,
    +        expected: Long): Unit = {
    +      val decimal = Decimal.apply(new java.math.BigDecimal(input))
    +      decimal.changePrecision(precision, scale)
    +      val decimalType = DataTypes.createDecimalType(precision, scale)
    +      checkHiveHash(decimal, decimalType, expected)
    +    }
    +
    +    checkHiveHashForDecimal("18", 38, 0, 558)
    --- End diff --
    
    I did a quick check. Most are right, but some of them do not match




[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103384950
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala ---
    @@ -371,6 +370,48 @@ class HashExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
             new StructType().add("array", arrayOfString).add("map", mapOfString))
           .add("structOfUDT", structOfUDT))
     
    +  test("hive-hash for decimal") {
    +    def checkHiveHashForDecimal(
    +        input: String,
    +        precision: Int,
    +        scale: Int,
    +        expected: Long): Unit = {
    +      val decimal = Decimal.apply(new java.math.BigDecimal(input))
    +      decimal.changePrecision(precision, scale)
    +      val decimalType = DataTypes.createDecimalType(precision, scale)
    +      checkHiveHash(decimal, decimalType, expected)
    +    }
    +
    +    checkHiveHashForDecimal("18", 38, 0, 558)
    +    checkHiveHashForDecimal("-18", 38, 0, -558)
    +    checkHiveHashForDecimal("-18", 38, 12, -558)
    +    checkHiveHashForDecimal("18446744073709001000", 38, 19, -17070057)
    --- End diff --
    
    Not all of them match, mainly because Hive and Spark enforce scale and precision differently.
    
    Hive does it using its own custom logic: https://github.com/apache/hive/blob/branch-1.2/common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java#L274
    
    Spark has its own way: https://github.com/apache/spark/blob/0e2405490f2056728d1353abbac6f3ea177ae533/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala#L230
    
    When one runs `CAST(-18446744073709001000BD AS DECIMAL(38,19))` in Hive, the value does NOT fit in Hive's range, so Hive converts it to `null`, and `HASH()` over `null` returns 0.
    
    In Spark, `CAST(-18446744073709001000BD AS DECIMAL(38,19))` is valid, so running `HASH()` over it gives some non-zero result.
     
    TL;DR: this difference arises before the hashing function comes into the picture. Bringing the two in sync would mean matching the semantics of Decimal in Spark with those in Hive. I don't think it's a good idea to embark on that, as it would be a breaking change, and this PR is not a strong enough reason to push for it. Hive-hash is best effort.
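
The digit arithmetic behind the `DECIMAL(38,19)` example can be sketched as follows. This is only an illustration using `java.math.BigDecimal`; the class name `DecimalFitSketch` and the helpers `integerDigits` and `fits` are hypothetical and are not taken from Hive or Spark. A `DECIMAL(38,19)` leaves 38 - 19 = 19 digits for the integer part, while the value in question has 20, so a strict check of this kind rejects it. That is consistent with the `null` described for Hive; Spark's handling follows its own code path linked above.

```java
import java.math.BigDecimal;

// Hypothetical helper class, for illustration only (not Hive or Spark code).
final class DecimalFitSketch {

    // Digits to the left of the decimal point: total digits minus fractional digits.
    static int integerDigits(BigDecimal v) {
        return v.precision() - v.scale();
    }

    // Strict check: the integer part must fit in (precision - scale) digits.
    static boolean fits(BigDecimal v, int precision, int scale) {
        return integerDigits(v) <= precision - scale;
    }

    public static void main(String[] args) {
        BigDecimal big = new BigDecimal("-18446744073709001000");
        System.out.println(integerDigits(big));                   // 20
        System.out.println(fits(big, 38, 19));                    // false: only 19 integer digits available
        System.out.println(fits(new BigDecimal("-18"), 38, 12));  // true
    }
}
```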




[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r103538494
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -635,6 +636,16 @@ case class HiveHash(children: Seq[Expression]) extends HashExpression[Int] {
       override protected def genHashBytes(b: String, result: String): String =
         s"$result = $hasherClassName.hashUnsafeBytes($b, Platform.BYTE_ARRAY_OFFSET, $b.length);"
     
    +  override protected def genHashDecimal(
    +      ctx: CodegenContext,
    --- End diff --
    
    Neither is used, but both are part of the method signature because the default implementation in the abstract class needs them: https://github.com/apache/spark/blob/3e40f6c3d6fc0bcd828d09031fa3994925394889/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L321




[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    LGTM (if tests pass)




[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #73595 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73595/testReport)** for PR 17056 at commit [`2387515`](https://github.com/apache/spark/commit/2387515f8c3e2a7aeb5022e50891d1c534444a31).




[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    **[Test build #74018 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74018/testReport)** for PR 17056 at commit [`7c0b6c8`](https://github.com/apache/spark/commit/7c0b6c849bb2b3869a9c91560d130bb884e1532b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-...

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17056#discussion_r104327417
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ---
    @@ -732,6 +743,48 @@ object HiveHashFunction extends InterpretedHashFunction {
         HiveHasher.hashUnsafeBytes(base, offset, len)
       }
     
    +  private val HiveDecimalMaxPrecision = 38
    +  private val HiveDecimalMaxScale = 38
    +
    +  // Mimics normalization done for decimals in Hive at HiveDecimalV1.normalize()
    +  def normalizeDecimal(input: BigDecimal, allowRounding: Boolean): BigDecimal = {
    --- End diff --
    
    removed that param
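
For readers following along, a single-argument normalization of this kind might look roughly like the sketch below. It is written in plain Java against `java.math.BigDecimal` and is not the code from this PR. The steps are assumptions: strip trailing zeros, disallow negative scales, return `null` when the integer part needs more than 38 digits, and round half-up when the scale does not fit. The class name `DecimalNormalizeSketch` and the helper `trim` are hypothetical. The two limits mirror `HiveDecimalMaxPrecision` and `HiveDecimalMaxScale` from the diff above.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch, not the PR's implementation.
final class DecimalNormalizeSketch {
    static final int MAX_PRECISION = 38; // mirrors HiveDecimalMaxPrecision in the diff
    static final int MAX_SCALE = 38;     // mirrors HiveDecimalMaxScale in the diff

    // Assumed step: drop trailing zeros and never keep a negative scale.
    static BigDecimal trim(BigDecimal d) {
        if (d.compareTo(BigDecimal.ZERO) == 0) {
            return BigDecimal.ZERO;
        }
        BigDecimal t = d.stripTrailingZeros();
        return t.scale() < 0 ? t.setScale(0) : t;
    }

    static BigDecimal normalize(BigDecimal input) {
        if (input == null) {
            return null;
        }
        BigDecimal result = trim(input);
        int intDigits = result.precision() - result.scale();
        if (intDigits > MAX_PRECISION) {
            return null; // integer part alone does not fit
        }
        int maxScale = Math.min(MAX_SCALE,
            Math.min(MAX_PRECISION - intDigits, result.scale()));
        if (result.scale() > maxScale) {
            // Rounding can introduce new trailing zeros, so trim again.
            result = trim(result.setScale(maxScale, RoundingMode.HALF_UP));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(normalize(new BigDecimal("18")).hashCode());                // 558
        System.out.println(normalize(new BigDecimal("-18.000000000000")).hashCode());  // -558
    }
}
```

Under these assumptions, `18` and `-18.000000000000` hash to 558 and -558, which agrees with the expected values for `("18", 38, 0)` and `("-18", 38, 12)` in the test quoted earlier in this thread.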




[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by tejasapatil <gi...@git.apache.org>.
Github user tejasapatil commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    
    Jenkins retest this please
    
    The last failure was a weird termination of Jenkins:
    
    ```
    [info] NaiveBayesClusterSuite:
    Traceback (most recent call last):
      File "./dev/run-tests-jenkins.py", line 226, in <module>
        main()
      File "./dev/run-tests-jenkins.py", line 213, in main
        test_result_code, test_result_note = run_tests(tests_timeout)
      File "./dev/run-tests-jenkins.py", line 140, in run_tests
        test_result_note = ' * This patch **fails %s**.' % failure_note_by_errcode[test_result_code]
    KeyError: -9
    [error] running /home/jenkins/workspace/SparkPullRequestBuilder@2/build/sbt -Phadoop-2.6 -Phive-thriftserver -Phive -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest hive-thriftserver/test mllib/test hive/test examples/test sql/test sql-kafka-0-10/test catalyst/test ; process was terminated by signal 9
    ```




[GitHub] spark issue #17056: [SPARK-17495] [SQL] Support Decimal type in Hive-hash

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17056
  
    Merged build finished. Test PASSed.

